webanalytics – just say no

I have just built myself a new intel core 2 duo based machine to replace one of my older machines which was beginning to struggle under the load of video transcoding I was placing upon it. The new machine is based on an E8400 and is nice and shiny and fast. Because it is a new build, I decided to install the OS and all my preferred applications, tools and utilities from scratch. Yes, I could have just copied my old setup, or at the least, my home directory and system configuration from my older machine, but I chose to do a completely new clean build on top of a clean install of ubuntu 8.04. I did this largely because my older system has been upgraded and “tweaked” so often I am no longer sure exactly what is on there or why. I am sure that it contains a lot of unnecessary cruft and I felt it was time for a clear out. A new build should ensure that I only installed what I actually needed. Of course I copied over my mail, bookmarks and other personal data, but the applications themselves I simply installed from new and then configured to my preferred standard.

Like most modern linux distros, Ubuntu is pretty secure straight out of the box. Gone are the (good old, bad old) days when umpteen unnecessary services were fired up by init or run out of inetd by default. But old habits die hard and I still like to check things over and stop/remove stuff I don’t want, or don’t trust. I also like to check outbound connections because a lot of programs these days have a habit of “calling home” – a habit I dislike. I noticed and cleared up one or two oddities I’d forgotten about (Ubuntu uses ntpdate to call a canonical server if ntpd is not configured for example. Since I use my own internal ntp server, this was easy to sort). However, after clearing, or identifying all other connections I was left with one outbound http connection I didn’t recognise, and worse, it was to a network I know to be untrustworthy. The connection was to This machine is on the omniture network. Omniture is notorious for running the deeply suspicious 2o7.net. Omniture market webanalytics services and are used by a whole range of (perfectly respectable) companies who pay them for web usage statistics. But omniture have never successfully explained why they choose to use a domain name which looks like, but isn’t, a local RFC 1918 address from the 16 bit block (e.g. I don’t trust them, and I didn’t like the fact that my shiny new machine was connecting to them. So what was responsible? And what to do?

Well, the “what to do” bit is easy – just blackhole the whole – network at my firewall. But that feels a bit OTT, even for me. A bit of thought, and a bit of digging gave me a better solution, and one which incidentally solves a range of related problems. What I actually needed was a way of preventing oubound connections to any hosts I don’t like or don’t trust. So long as the IP addresses of the hosts are not hard coded in the application (as sometimes happens in trojans) the classic way to do this is to simply map the hostname to the local loopback address in your hosts file. But this can become tedious. Fortunately, it turns out that a guy called Dan Pollock maintains a pretty comprehensive hosts file on-line at someonewhocares.org. Result.

Because I run my own local DNS server (DNSmasq on one of the slugs) it was easy for me to add Dan’s host file to my central hosts file. So now all my machines will routinely bin any attempted outbound connection to adservers, porn sites, or whatever in the list. The downside, of course, is that this is a bit of blunt instrument and may cause some difficulty with some sites (ebay for example). But I’m prepared to put up with that whilst I fine tune the list. I can also pull the list regularly and automatically via cron so that I stay up to date (but of course I won’t just blindly update my DNS, I’ll pull the file in for inspection and manual substitution…..).

So what was making the connection? Well it looks to me as if adobe is the culprit. I had installed the acroreader plugin for firefox.

Silly me. Must remember to avoid proprietary software.

(Oh, and you just have to love omniture’s guidance on how to opt-out of their aggregation and analysis. You have to install an opt-out cookie. Oh yes, indeedy, I’ll do that.)

Permanent link to this article: https://baldric.net/2008/09/12/webanalytics-just-say-no/