Archive for the ‘privacy’ Category

tor server compromise

Friday, January 22nd, 2010

According to this post by Roger Dingledine, two tor directory servers were compromised recently. In that post Dingledine said:

In early January we discovered that two of the seven directory authorities were compromised (moria1 and gabelmoo), along with metrics.torproject.org, a new server we’d recently set up to serve metrics data and graphs. The three servers have since been reinstalled with service migrated to other servers.

Whilst the direrctory servers apparently also hosted the tor project’s svn and git source code repositories, Dingledine is confident that the source code has not been tampered with – and nor has there been any possible compromise of user anonymity. Neverthless, the project recommends that tor users and operators upgrade to the latest version. Good advice I’d say – I’ve just upgraded mine.

are you /really/ sure you want that mobile phone?

Sunday, January 10th, 2010

The launch of the google nexus one “iPhone killer” reminds me just how prescient Dr Fun’s cartoon of 16 January 2006 (see second cartoon down from the top on the right) really was.

I just love the way the google employee in the video says at the end that Verizon and Vodafone have “agreed to join our program”.

Oh yes indeed.

using scroogle

Saturday, January 2nd, 2010

For completeness, my post below should have pointed to the scroogle search engine which purportedly allows you to search google without google being able to profile you. Neat idea if you must use google (why?) but it still fails the Hal Roberts test of what to do if the intermediate search engine is prepared to sell your data. I actually quite like the scroogle proxy though, particularly in its ssl version because anything that upsets google profiling has to be a good thing. Besides, the really paranoid can simply connect to scroogle via tor.

(Odd that google seem not to have tried to grab the scroogle domain name. If they do, let’s just hope that they get the groovle answer.)

scroogled

Saturday, January 2nd, 2010

One of the more annoying aspects of the web follows directly from one of its strengths. The web is actually designed to make it easy for authors to cross refer to the work of others – hyperlinking is intended to make linking between documents anywhere in web space seamless and transparent. Unfortunately, this cross linking ability leads to many posts (this one included) quoting directly from the source when referring to material elsewhere. In the academic world, quoting from source material is encouraged. When the work is properly attributed to the original author, then this is known as research. Without such attribution it is known as plagiarism.

So whenever I post or write here, I try hard to refer to original source material if I am quoting from elsewhere or I am referring to a particular tool or technique I have found useful. If I am writing about something commented on elsewhere (as for example, Hal Roberts’ discussion of GIFC selling user data in my posting about anonymous surfing), then I will try to link directly to the original material rather than to another article discussing that original. There are fairly good (and obvious) reasons for doing this, not least of which is that the original author deserves to be read directly and not through the (possibly) distorting lens of someone else’s words.

Writing for the web is a very different art to writing for print publication. Any web posting can easily become lazy as the author cross refers to other web posts. Many of those posts may be inaccurate or not primary source material. This can lead to the sort of problem commonly seen in web forums where umpteen people quote someone who said something about someone else’s commentary on topic X or Y. In such circumstances, finding the original, definitive, authoritative, source can be difficult.

Like most people, when faced with this sort of problem I resort to using one or more of the main search engines. But what to search for? Plugging in a simple quote from the original article can often bring up references to unrelated material which happens to include that same (or very similar) phrase. Worse, for reasons outlined above, the search can simply return multiple instances of postings in web fora about the article rather than the article itself. Most irritatingly these days I find that a search will lead to a wikipedia posting – and I just don’t trust the “wisdom of the crowds” enough to trust wikipedia. I’m old fashioned, I like my “facts” to be peer reviewed, authoritative, and preferably written in a form not subject to arbitrary post publication edits. Actually I still prefer dead trees as a trusted source of both factual material and fiction – which is one reason I have lost count of the number of books I have. I also like the reassuring way I can go to my bookshelf and know that my copy of 1984 will be where I left it and in a form in which I remember it.

So when I was researching older articles about Google recently and I wanted to find a copy of Cory Doctorow’s original short fiction piece about Google called “Scroogled” I expected to find umpteen thousand quotes as well as pointers to the original. I was wrong. I originally searched for the phrase “Want to tell me about June 1998?” on the grounds that that would be likely to give me a tighter set of results than simply looking for “scroogled”. This actually gave me fewer that sixty hits on clusty. I was initially reassured that most of the results were simple extracts of the full story with pointers to the original article on radaronline. Even Doctorow’s own blog points to radaronline without giving a local copy of the story. But then I discovered that radaronline no longer lists that article at that URL. Worse, a search of the site gives no results for “scroogled”. So Cory Doctorow’s creative commons licenced short has vanished from the original location and all I can find are copies. This worries me. Perhaps I’m wrong to rely on pointing to original material. What if the original is ephemeral? Or gets pulled for some reason? And if I point to copies, how can I be sure those copies are faithful to the original?

I actually fell foul of this same problem myself a couple of years ago when I was discussing my experiences with BT’s awful home hub router. I wrote in that post a reference to a contribution I made on another forum about my experiments with the FTP daemon on the hub whilst I was figuring out how to get a root shell. That article no longer exists, because the site no longer exists, and I have no copy.

So the web is both vast and surprisingly small and fragile in places.

Oh, just to be on the safe side, I have posted here a local (PDF) copy of scroogled obtained from feedbooks. You never know.

colossally boneheaded

Saturday, December 12th, 2009

David Adams over at OS News has posted an interesting commentary on Eric Schmidt’s recent outburst. Referring to Schmidt’s statement which I commented on below, Adams says “I think the portion of that statement that’s sparked the most outrage is the “If you have something that you don’t want anyone to know, maybe you shouldn’t be doing it in the first place” part. That’s a colossally boneheaded thing to say, and I’ll bet Schmidt lives to regret being so glib, if he didn’t regret it within minutes of it leaving his mouth. As many people have pointed out, there are a lot of things you could be doing or thinking about that you don’t want other people to be watching or to know about, and that are not the least bit inappropriate for you to be doing, such as using the toilet, trying to figure out how to cure your hemorrhoids, or singing Miley Cyrus songs in the shower.”

The post is worth reading in its entirety.

privacy is just for criminals

Monday, December 7th, 2009

I’ve mentioned before that I value my privacy. I use tor, coupled with a range of other necessary but tedious approaches (such as refusing cookies, blocking ad servers, scrubbing my browser) to provide me with the degree of anonymity I consider my right in an increasingly public world. It is nobody’s business but mine if I choose to research the symptoms of bowel cancer or investigate the available statistics on crime clear up rates in Alabama. But according to Google’s CEO Eric Schmidt, my choosing to do so anonymously makes me at best suspect, and at worst possibly criminal. In an interview with CNBC, Schmidt reportedly said “If you have something that you don’t want anyone to know, maybe you shouldn’t be doing it in the first place,”

I have been getting increasingly worried about Google’s activities for a while now, but the breathtaking chutzpah of Schmidt’s statement is beyond belief. Lots of perfectly ordinary, law abiding, private citizen’s from a wide range of backgrounds and interests will use Google’s search capabilities in the mistaken belief that in so doing they are relatively anonymous. This has not been so for some long time now, but the vast majority of people just don’t know that. For the CEO of the company providing those services to suggest that a desire for privacy implies criminality is frankly completely unacceptable.

Just don’t use Google. For anything. Ever.

handbags?

Wednesday, October 28th, 2009

It would appear that I may have been unnecessarily concerned about the accuracy of the profiling data held on me by the commercial sites I use. In my inbox today I found the following email from Amazon:

“As a valued Amazon.co.uk customer, we thought you might be interested in visiting our website dedicated to shoes and handbags, Javari.co.uk.

Javari.co.uk offers Free One-Day Delivery, Free Returns, and a 100% Price Match Guarantee.

Welcome to Javari.co.uk”

I don’t know whether to feel reassured at Amazon’s failure to understand me or disappointed that the considerable resource they have at their disposal can get me so wrong.

tor on a vps

Sunday, July 5th, 2009

I value my privacy – and I dislike the increasing tendency of every commercial website under the sun to attempt to track and/or profile me. Yes, I know all the arguments in favour of advertising, and well targeted advertising at that, but I get tired of the Amazon style approach which assumes that just because I once bought a book about subject X, I would also like another book about almost the same subject. I don’t much like commercial organisations profiling me (and, incidentally, I find it highly ironic that we in the UK seem to make a much bigger fuss about potential “big brother” Government than we do about commercial data aggregation, but hey).

Sure, I routinely bin cookies, block adware and irritating pop up scripts, and use all the, now almost essential, firefox privacy plugins, but even there we still have a problem. I don’t know who wrote those plugins, I just have to trust them. That worries me. Some of the best known search engines are even more scary if you think carefully about the aggregate information they have about you.

Sometimes I care about the footprint I leave, sometimes I don’t, but the point is that I should be in control of that footprint. Increasingly that is becoming difficult. Besides being tracked by sites I visit, last year’s controversy about BT’s use of phorm is also worrying. If my ISP can track everything I do, then I face another level of difficulty in protecting my fast vanishing privacy.

Besides using a locked down browser, DNS filtering which blocks adware, cutting cookies and all the other tedious precautions I now feel are necessary to make me feel comfortable, I often use anonymous proxies when I don’t want the end site to know where I came from. But even that now looks problematic. If you use a single anonymising proxy, all you are doing is shifting the knowledge about your browing from the end site to an interrmediary. That intermediary may (indeed should) have a very strict security policy. Ideally, it should log absolutely nothing about transit traffic. But if that intermediary does log traffic data and then sells that data to a third party, you may be in an even worse position than if you had not attempted to become anonymous. Back in january of this year, Hal Roberts of Harvard University, posted a blog item about GIFC selling user data. If sites such as Dynaweb are prepared to sell user data, then the future for true anonymity looks problematic. As Doc Searle said in this blog posting,

We live in a time when personalized advertising is legitimized on the supply side. (It has no demand side, other than the media who get paid to place it.) Worse, there’s a kind of gold rush going on. Even in a crapped economy, a torrent of money is flowing into online advertising of all kinds, including the “personalized” sort. No surprise that companies in the business of fighting great evils rationalize the committing of lesser ones. I’m sure they do it it the usual way: It’s just advertsing! And it’s personalized, so it’s good for you!

No, as Searle well knows, it is not good for you.

What to do? Enter tor and privoxy.

I first used tor some years ago in its earlier incarnation as “the onion router” (hence its name) and until recently had used it only sporadically since. The main drawback of the early tor network was its speed, or lack of it. Tor gets is strength (anonymity) from the way it routes traffic.

how tor works

Tor traffic passes through a series of nodes before exiting at a node which cannot be linked back to the original source. So tor performance depends on a large number of both fast intermediate relays and a large number of exit nodes. Since not all tor users are prepared to run relays, let alone an exit node (it can be bandwidth expensive and in the case of an exit node can lead to your system being mistaken for a hostile, or compromised, site) tor can be slow, at times painfully slow. But recently tor has been getting faster as more relay and exit nodes are added. It is now at a state which is probably usable most of the time, so long as you are prepared to wait a little longer than is customary for some web pages to load (and you don’t use youtube…..).

When using tor recently I have tended to follow the well trodden path of local installation alongside privoxy. Because I believe in giving something back to the community if I am gaining benefit, I also set my local configuration to run as a relay. But that caused some difficulty. If we assume that my tor usage was fairly representative of the majority of tor users out there, then the fact that my relay was only operational when my client system was up and running meant that the relay would be seen by the tor network as unstable and probably slow, Indeed, the fact that I had to throttle tor usage to the minimum to stop the network from impacting unduly on my ADSL bandwidth, meant that I was not entirely happy with the setup. So I stopped relaying. But that leaves me feeling that I am taking advantage of a free good when I could be contributing to that good.

Some while back I bought myself a VPS from Bytemark (an excellent, technically savvy, UK based hosting company) to run a couple of webs and an MTA. I use it now largely as a mail server (running postfix and dovecot) and the traffic is relatively low volume. That VPS is pretty small (though actually way better specced than some real servers I have run in the past) but I reasoned that I could easily run a tor relay on that machine and then connect to it remotely from my client system. I did, and it worked fine. But I soon found that the tor network seems to have a voracious appettite for bandwidth, Even with a fairly strict exit policy (no torrents allowed!) and some tight bandwidth shaping, I still found that I was using about 2 Gig of traffic per day (vnstat is useful here). Any more than that would start to encroach on my bandwidth allowance for my VPS and possibly impact on the main business use of that server. Monthly rates for VPSs are now less than I pay for my mobile phone contract (and arguably more useful than a phone contract too) so I decided to specialise and buy another VPS just for tor. I now run an exit node on a VPS with 384 MB of RAM and 150 Gig monthly traffic allowance. That server is currently throttled to about 2 Gig of traffic per day, but I will double that very shortly.

Now one of the nicest things about running a tor relay is the fact that your own tor usage is masked and you may get better anonymity. I therefore run privoxy on my tor relay and proxy through from my client to that proxy which in turn chains to tor internally on my relay. However, if you simply configure your local client to proxy through to your relay in clear you are allowing your ISP (and anyone else who cares to look) to see your tor requests – not smart. So I tunnel my requests to the tor relay through ssh. My local client has an ssh listener which tunnels tor requests through to the relay and connects to privoxy on port 8118 bound to localhost on the relay. I also have a separate browser on my desktop which has as its proxy the ssh listener on my client system. For a good description of how to do this see tyranix’s howto on the tor wiki site. Now whenever I want to use tor myself I simply switch browser (and that browser is particularly dumb and stripped, and has no plugins or toolbars which could leak information). Of course, should I get really paranoid, I could always run the local browser in a VM on my desktop and reload the VM after each session.

But I’m not that paranoid.

webanalytics? just say no.

Friday, September 12th, 2008

I have just built myself a new intel core 2 duo based machine to replace one of my older machines which was beginning to struggle under the load of video transcoding I was placing upon it. The new machine is based on an E8400 and is nice and shiny and fast. Because it is a new build, I decided to install the OS and all my preferred applications, tools and utilities from scratch. Yes, I could have just copied my old setup, or at the least, my home directory and system configuration from my older machine, but I chose to do a completely new clean build on top of a clean install of ubuntu 8.04. I did this largely because my older system has been upgraded and “tweaked” so often I am no longer sure exactly what is on there or why. I am sure that it contains a lot of unnecessary cruft and I felt it was time for a clear out. A new build should ensure that I only installed what I actually needed. Of course I copied over my mail, bookmarks and other personal data, but the applications themselves I simply installed from new and then configured to my preferred standard.

Like most modern linux distros, Ubuntu is pretty secure straight out of the box. Gone are the (good old, bad old) days when umpteen unnecessary services were fired up by init or run out of inetd by default. But old habits die hard and I still like to check things over and stop/remove stuff I don’t want, or don’t trust. I also like to check outbound connections because a lot of programs these days have a habit of “calling home” – a habit I dislike. I noticed and cleared up one or two oddities I’d forgotten about (Ubuntu uses ntpdate to call a canonical server if ntpd is not configured for example. Since I use my own internal ntp server, this was easy to sort). However, after clearing, or identifying all other connections I was left with one outbound http connection I didn’t recognise, and worse, it was to a network I know to be untrustworthy. The connection was to 66.235.133.2. This machine is on the omniture network. Omniture is notorious for running the deeply suspicious 2o7.net. Omniture market webanalytics services and are used by a whole range of (perfectly respectable) companies who pay them for web usage statistics. But omniture have never successfully explained why they choose to use a domain name which looks like, but isn’t, a local RFC 1918 address from the 16 bit block (e.g. 192.168.112.207). I don’t trust them, and I didn’t like the fact that my shiny new machine was connecting to them. So what was responsible? And what to do?

Well, the “what to do” bit is easy – just blackhole the whole 66.235.128.0 – 66.235.159.255 network at my firewall. But that feels a bit OTT, even for me. A bit of thought, and a bit of digging gave me a better solution, and one which incidentally solves a range of related problems. What I actually needed was a way of preventing oubound connections to any hosts I don’t like or don’t trust. So long as the IP addresses of the hosts are not hard coded in the application (as sometimes happens in trojans) the classic way to do this is to simply map the hostname to the local loopback address in your hosts file. But this can become tedious. Fortunately, it turns out that a guy called Dan Pollock maintains a pretty comprehensive hosts file on-line at someonewhocares.org. Result.

Because I run my own local DNS server (DNSmasq on one of the slugs) it was easy for me to add Dan’s host file to my central hosts file. So now all my machines will routinely bin any attempted outbound connection to adservers, porn sites, or whatever in the list. The downside, of course, is that this is a bit of blunt instrument and may cause some difficulty with some sites (ebay for example). But I’m prepared to put up with that whilst I fine tune the list. I can also pull the list regularly and automatically via cron so that I stay up to date (but of course I won’t just blindly update my DNS, I’ll pull the file in for inspection and manual substitution…..).

So what was making the connection? Well it looks to me as if adobe is the culprit. I had installed the acroreader plugin for firefox.

Silly me. Must remember to avoid proprietary software.

(Oh, and you just have to love omniture’s guidance on how to opt-out of their aggregation and analysis. You have to install an opt-out cookie. Oh yes, indeedy, I’ll do that.)