Archive for the ‘privacy’ Category

scroogle is having a problem

Sunday, July 4th, 2010

I posted a note about scroogle back in January. Scroogle offered an SSL interface to the google engine, and, moreover, didn’t lumber its users with google cookies and sundry other irritations. Since then, however, google themselves have started to offer an SSL interface and, coincidentally, scroogle seem to have started to have some problems.

If you visit the scroogle SSL interface, you get a redirect to a notice which explains why some changes made at google mean that scroogle can no longer work properly. Scroogle managed to get a workaround in place for a few days, but it seems that another google change has finally killed that too unless google can be convinced to help out – unlikely in my view. The scroogle redirect page (dated 1 July 2010) has the following line from Daniel Brandt:

“Thank you for your support during these past five years. Check back in a week or so; if we don’t hear from Google by next week, I think we can all assume that Google would rather have no Scroogle, and no privacy for searchers.”

That in itself is bad enough, but as a separate new posting explains, scroogle now seems to be the target of a botnet aimed at swamping its servers. As Brandt goes on to say:

“Google has a few hundred thousand servers, while Scroogle has six. They can put up with sites that spread malware, but our bandwidth is limited. Even if Google relents and the output=ie interface returns, this Scroogle malware problem could still be increasing at that point. Eventually it alone might shut down Scroogle.”

Sad. I hate to see the little guy lose out.

email address images

Monday, May 3rd, 2010

Adding valid email addresses to web sites is almost always a bad idea these days. Automated ‘bots routinely scan web servers and harvest email addresses for sale to spammers and scammers. And in some cases, email addresses harvested from commercial web sites can be used in targeted social engineering attacks. So, posting your email address to a website in a way which is useful to a human being, but not to a ‘bot, has to be a “good thing” (TM). One way of doing so is to use an image of an address rather than the text itself. Of course this has the disadvantage that the address will not be immediately usable by client email software (unless, of course, you defeat the object of the exercise by adding an html “mailto” link to the image) but it should be no big deal for someone who wants to contact you to write the address down.

There are a number of web sites which offer a (free) service which allows you to plug in an email address and then download an image generated from that address. However, I can’t get over the suspicion that this would be an ideal way to actually harvest valid email addresses, moreover addresses which you could be pretty certain the users did not want exposed to spammers. Call me paranoid, but I prefer to control my own privacy.

There are also a number of web sites (and blog entries) describing how to use netpbm tools to create an image from text – one of the better ones (despite its idiosyncratic look) is at robsworld. But in fact it is pretty easy to do this in gimp. Take a look at the address below:

This was created as follows:

open gimp and create a new file with a 640×480 template (actually any template will do);
select the text tool and choose a suitable font size, colour etc;
enter the text of the address in the new file;
select image -> autocrop image;
select layer -> Transparency -> Colour to Alpha;
select from white (the background colour) to alpha;
select save-as and use the file extension .png – you will be prompted to export as png.

Now add the image to your web site.
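
For anyone who would rather script this locally than click through gimp, the steps above can be sketched in Python. This is only a rough equivalent, not the method used for the image in this post: it assumes the third-party Pillow library is installed, it uses Pillow's built-in default font, and the function name `address_to_png` and the example address are made up for illustration. Instead of gimp's Colour to Alpha step, it draws the text straight onto a transparent canvas and then crops to the text, which mirrors the autocrop step.

```python
from io import BytesIO

from PIL import Image, ImageDraw, ImageFont  # Pillow, assumed to be installed


def address_to_png(address: str) -> bytes:
    """Render an address as black text on a transparent background (PNG bytes).

    address_to_png is a hypothetical helper name, not part of any library.
    """
    if not address.strip():
        raise ValueError("address must not be empty")

    # Start with a generous transparent canvas (the "new file" step).
    canvas = Image.new("RGBA", (20 * len(address) + 20, 64), (0, 0, 0, 0))

    # Draw the text with Pillow's bundled default font (the "text tool" step).
    font = ImageFont.load_default()
    ImageDraw.Draw(canvas).text((10, 10), address, font=font, fill=(0, 0, 0, 255))

    # Crop to the bounding box of the drawn pixels (the "autocrop" step).
    cropped = canvas.crop(canvas.getbbox())

    # Encode as PNG, which keeps the transparency (the "save as .png" step).
    buffer = BytesIO()
    cropped.save(buffer, format="PNG")
    return buffer.getvalue()


if __name__ == "__main__":
    # Placeholder address; substitute your own and run this on your own machine.
    with open("address.png", "wb") as handle:
        handle.write(address_to_png("someone@example.org"))
```

Because everything runs on your own machine, the address never has to be handed to a third party image generator.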

what a user agent says about you

Tuesday, March 30th, 2010

I get lots of odd connections to my servers – particularly to my tor relay. Mostly my firewalls bin the rubbish but my web server logs still show all sorts of junk. Occasionally I get interested (or possibly bored) enough to do more than just scan the logs and I follow up the connection traces which look really unusual. I may get around to posting an analysis of all my logs one day.

One of the interesting traces in web logs is the user agent string. Mostly this just shows the client’s browser details (or maybe proxy details) but often I find that the UA string is the signature of a known ‘bot (e.g. Yandex/1.01.001). A good site for keeping tabs on ‘bot signatures is www.botsvsbrowsers.com. But I also find user-agent-string.info useful as a quick reference. If you have never checked before, it can be instructive to learn just how much information you leave about yourself on websites you visit. Just click on “Analyze my UA” (apologies for the spelling, they are probably american) for a full breakdown of your client system.
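
As a rough illustration of pulling user agent strings out of a web log, here is a minimal Python sketch. It assumes the server writes the common "combined" log format, in which the user agent is the last quoted field on each line. The function names and the short list of ‘bot hints are invented for this example and are nowhere near a complete signature list.

```python
import re
from collections import Counter

# Combined log format ends with: "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(r'"[^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"\s*$')

# Illustrative substrings only (an assumption, not a real signature database).
BOT_HINTS = ("yandex", "bot", "crawler", "spider")


def extract_user_agent(line):
    """Return the user agent field of one log line, or None if it does not parse."""
    match = LOG_LINE.search(line)
    return match.group("ua") if match else None


def count_user_agents(lines):
    """Count how often each user agent string appears in the given log lines."""
    counts = Counter()
    for line in lines:
        ua = extract_user_agent(line)
        if ua is not None:
            counts[ua] += 1
    return counts


def looks_like_bot(ua):
    """Crude check: does the user agent contain one of the hint substrings?"""
    lowered = ua.lower()
    return any(hint in lowered for hint in BOT_HINTS)


if __name__ == "__main__":
    # Made-up sample lines using documentation-only IP addresses.
    sample = [
        '203.0.113.5 - - [30/Mar/2010:09:14:51 +0100] "GET / HTTP/1.1" 200 5120 "-" "Yandex/1.01.001 (compatible; Win16; I)"',
        '198.51.100.7 - - [30/Mar/2010:09:15:02 +0100] "GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (X11; Linux i686) Firefox/3.6"',
    ]
    for ua, hits in count_user_agents(sample).most_common():
        label = "bot?" if looks_like_bot(ua) else "browser?"
        print(hits, label, ua)
```

A user agent that itself contains a double quote will not parse with this simple pattern, and of course the string is entirely under the client's control, so treat the result as a hint rather than proof.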

tor server compromise

Friday, January 22nd, 2010

According to this post by Roger Dingledine, two tor directory servers were compromised recently. In that post Dingledine said:

In early January we discovered that two of the seven directory authorities were compromised (moria1 and gabelmoo), along with metrics.torproject.org, a new server we’d recently set up to serve metrics data and graphs. The three servers have since been reinstalled with service migrated to other servers.

Whilst the directory servers apparently also hosted the tor project’s svn and git source code repositories, Dingledine is confident that the source code has not been tampered with – and nor has there been any possible compromise of user anonymity. Nevertheless, the project recommends that tor users and operators upgrade to the latest version. Good advice I’d say – I’ve just upgraded mine.

are you /really/ sure you want that mobile phone?

Sunday, January 10th, 2010

The launch of the google nexus one “iPhone killer” reminds me just how prescient Dr Fun’s cartoon of 16 January 2006 (see third cartoon down from the top on the right) really was.

I just love the way the google employee in the video says at the end that Verizon and Vodafone have “agreed to join our program”.

Oh yes indeed.

using scroogle

Saturday, January 2nd, 2010

For completeness, my post below should have pointed to the scroogle search engine which purportedly allows you to search google without google being able to profile you. Neat idea if you must use google (why?) but it still fails the Hal Roberts test of what to do if the intermediate search engine is prepared to sell your data. I actually quite like the scroogle proxy though, particularly in its ssl version because anything that upsets google profiling has to be a good thing. Besides, the really paranoid can simply connect to scroogle via tor.

(Odd that google seem not to have tried to grab the scroogle domain name. If they do, let’s just hope that they get the groovle answer.)

scroogled

Saturday, January 2nd, 2010

One of the more annoying aspects of the web follows directly from one of its strengths. The web is actually designed to make it easy for authors to cross refer to the work of others – hyperlinking is intended to make linking between documents anywhere in web space seamless and transparent. Unfortunately, this cross linking ability leads to many posts (this one included) quoting directly from the source when referring to material elsewhere. In the academic world, quoting from source material is encouraged. When the work is properly attributed to the original author, then this is known as research. Without such attribution it is known as plagiarism.

So whenever I post or write here, I try hard to refer to original source material if I am quoting from elsewhere or I am referring to a particular tool or technique I have found useful. If I am writing about something commented on elsewhere (as for example, Hal Roberts’ discussion of GIFC selling user data in my posting about anonymous surfing), then I will try to link directly to the original material rather than to another article discussing that original. There are fairly good (and obvious) reasons for doing this, not least of which is that the original author deserves to be read directly and not through the (possibly) distorting lens of someone else’s words.

Writing for the web is a very different art to writing for print publication. Any web posting can easily become lazy as the author cross refers to other web posts. Many of those posts may be inaccurate or not primary source material. This can lead to the sort of problem commonly seen in web forums where umpteen people quote someone who said something about someone else’s commentary on topic X or Y. In such circumstances, finding the original, definitive, authoritative, source can be difficult.

Like most people, when faced with this sort of problem I resort to using one or more of the main search engines. But what to search for? Plugging in a simple quote from the original article can often bring up references to unrelated material which happens to include that same (or very similar) phrase. Worse, for reasons outlined above, the search can simply return multiple instances of postings in web fora about the article rather than the article itself. Most irritatingly these days I find that a search will lead to a wikipedia posting – and I just don’t trust the “wisdom of the crowds” enough to trust wikipedia. I’m old fashioned, I like my “facts” to be peer reviewed, authoritative, and preferably written in a form not subject to arbitrary post publication edits. Actually I still prefer dead trees as a trusted source of both factual material and fiction – which is one reason I have lost count of the number of books I have. I also like the reassuring way I can go to my bookshelf and know that my copy of 1984 will be where I left it and in a form in which I remember it.

So when I was researching older articles about Google recently and I wanted to find a copy of Cory Doctorow’s original short fiction piece about Google called “Scroogled” I expected to find umpteen thousand quotes as well as pointers to the original. I was wrong. I originally searched for the phrase “Want to tell me about June 1998?” on the grounds that that would be likely to give me a tighter set of results than simply looking for “scroogled”. This actually gave me fewer than sixty hits on clusty. I was initially reassured that most of the results were simple extracts of the full story with pointers to the original article on radaronline. Even Doctorow’s own blog points to radaronline without giving a local copy of the story. But then I discovered that radaronline no longer lists that article at that URL. Worse, a search of the site gives no results for “scroogled”. So Cory Doctorow’s creative commons licensed short has vanished from the original location and all I can find are copies. This worries me. Perhaps I’m wrong to rely on pointing to original material. What if the original is ephemeral? Or gets pulled for some reason? And if I point to copies, how can I be sure those copies are faithful to the original?

I actually fell foul of this same problem myself a couple of years ago when I was discussing my experiences with BT’s awful home hub router. I wrote in that post a reference to a contribution I made on another forum about my experiments with the FTP daemon on the hub whilst I was figuring out how to get a root shell. That article no longer exists, because the site no longer exists, and I have no copy.

So the web is both vast and surprisingly small and fragile in places.

Oh, just to be on the safe side, I have posted here a local (PDF) copy of scroogled obtained from feedbooks. You never know.

colossally boneheaded

Saturday, December 12th, 2009

David Adams over at OS News has posted an interesting commentary on Eric Schmidt’s recent outburst. Referring to Schmidt’s statement which I commented on below, Adams says “I think the portion of that statement that’s sparked the most outrage is the “If you have something that you don’t want anyone to know, maybe you shouldn’t be doing it in the first place” part. That’s a colossally boneheaded thing to say, and I’ll bet Schmidt lives to regret being so glib, if he didn’t regret it within minutes of it leaving his mouth. As many people have pointed out, there are a lot of things you could be doing or thinking about that you don’t want other people to be watching or to know about, and that are not the least bit inappropriate for you to be doing, such as using the toilet, trying to figure out how to cure your hemorrhoids, or singing Miley Cyrus songs in the shower.”

The post is worth reading in its entirety.

privacy is just for criminals

Monday, December 7th, 2009

I’ve mentioned before that I value my privacy. I use tor, coupled with a range of other necessary but tedious approaches (such as refusing cookies, blocking ad servers, scrubbing my browser) to provide me with the degree of anonymity I consider my right in an increasingly public world. It is nobody’s business but mine if I choose to research the symptoms of bowel cancer or investigate the available statistics on crime clear up rates in Alabama. But according to Google’s CEO Eric Schmidt, my choosing to do so anonymously makes me at best suspect, and at worst possibly criminal. In an interview with CNBC, Schmidt reportedly said “If you have something that you don’t want anyone to know, maybe you shouldn’t be doing it in the first place.”

I have been getting increasingly worried about Google’s activities for a while now, but the breathtaking chutzpah of Schmidt’s statement is beyond belief. Lots of perfectly ordinary, law abiding, private citizens from a wide range of backgrounds and interests will use Google’s search capabilities in the mistaken belief that in so doing they are relatively anonymous. This has not been so for some long time now, but the vast majority of people just don’t know that. For the CEO of the company providing those services to suggest that a desire for privacy implies criminality is frankly completely unacceptable.

Just don’t use Google. For anything. Ever.

handbags?

Wednesday, October 28th, 2009

It would appear that I may have been unnecessarily concerned about the accuracy of the profiling data held on me by the commercial sites I use. In my inbox today I found the following email from Amazon:

“As a valued Amazon.co.uk customer, we thought you might be interested in visiting our website dedicated to shoes and handbags, Javari.co.uk.

Javari.co.uk offers Free One-Day Delivery, Free Returns, and a 100% Price Match Guarantee.

Welcome to Javari.co.uk”

I don’t know whether to feel reassured at Amazon’s failure to understand me or disappointed that the considerable resource they have at their disposal can get me so wrong.