Archive for the ‘coding and admin’ Category

tails in a spin

Thursday, January 12th, 2012

When I first tested running a tails mirror on one of my VMs, the traffic level reported by vnstat ran at around 20-30 GiB per day. I figured I could live with that because it meant that my total monthly traffic would be unlikely to exceed my monthly 1 TB allowance. However, when I checked the stats on that server last week (around the 9th of Jan) I found that I was shipping out around 150 GiB per day and vnstat was predicting a monthly total of close to 3 TB. As the tails admins said when I told them that I would have to shut off the mirror on that VM while I sorted something, “Ooops”. Ooops indeed. I couldn’t chance a massive bill for exceeding my bandwidth allowance by quite that much. The actual stats for 4, 5, 6, 7, 8 and 9 January before I pulled the plug were: 34.23 GiB, 69.14 GiB, 178.31 GiB, 131.68 GiB, 99.05 GiB and 133.27 GiB. It turns out that tails 0.10 was released on 4 January and I hadn’t been prepared. A lesson learned.

Having shut down and had the DNS round robin amended, I attended to finding some way of throttling my traffic so that I could live within my allowance whilst still providing a useful mirror. I scratched my head for a while before stumbling on the obvious: I should be throttling at application level. (Sometimes I find that I miss simple answers because I am looking for complicated ones.)

I started out by assuming that I should be using tc and iptables mangling, or something like the userspace tool trickle, all of which looked horribly more complicated than the approach taken by tor (which allows you to simply set the acceptable bandwidth rate to some limit, plus set an accounting period maximum of some total transfer limit per day/week whatever). And of course it turns out that my webserver (lighttpd) allows something similar. Just set the server limit to some chosen max transfer rate and, if necessary, also impose a per IP max rate. The magic configuration file options are:

# limit server throughput to 3000 kbytes/sec (~24000 kbits/sec)
server.kbytes-per-second = 3000
#
# and limit individual connections to 50 kbytes/sec (~400 kbits/sec) – NB. I don’t actually use this
# connection.kbytes-per-second = 50

I tested this by pulling a copy of the tails iso from one of my other VMs which has a high bandwidth connection and got acceptable (and expected) results. So now I can go back on-line later this month safe in the knowledge that I’m not going to blow all my bandwidth in one week.
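The arithmetic behind choosing a cap can be sketched in a few lines of Python. This is only an illustration: the function names are made up for the example, it assumes that lighttpd's "kbytes" are 1024-byte units and that a month is 30 days, and it treats the 1 TB allowance as 10^12 bytes. It shows the ceiling that a given server.kbytes-per-second setting puts on a day's traffic, and the sustained rate that would use up an allowance exactly.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Daily totals reported by vnstat for 4-9 January, in GiB (from the post).
DAILY_GIB = [34.23, 69.14, 178.31, 131.68, 99.05, 133.27]


def ceiling_gib_per_day(kbytes_per_second):
    """Most that could be served in a day if the cap were saturated.

    Assumes one "kbyte" is 1024 bytes (an assumption about lighttpd's units).
    """
    return kbytes_per_second * 1024 * SECONDS_PER_DAY / 2**30


def sustained_kbytes_per_second(allowance_bytes, days=30):
    """Constant rate that would consume the allowance in exactly `days` days."""
    return allowance_bytes / (days * SECONDS_PER_DAY) / 1024


print("served 4-9 January: %.2f GiB" % sum(DAILY_GIB))
print("ceiling at 3000 kbytes/sec: %.1f GiB/day" % ceiling_gib_per_day(3000))
print("rate matching 1 TB over 30 days: %.0f kbytes/sec"
      % sustained_kbytes_per_second(10**12))
```

Real traffic comes in bursts, so the daily ceiling is only ever an upper bound, but comparing it against the vnstat figures above gives a feel for how much headroom a given setting leaves.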

tunnelling X over ssh

Monday, December 19th, 2011

OK, yes, I know there are probably already a gazillion web pages on the ‘net explaining exactly how to do this, but I got caught out by a silly gotcha when I tried to do this a couple of days ago, so I thought I’d post a note.

Firstly, X is not exactly a secure protocol, nor is it easy to filter at NAT firewalls, so the ability to tunnel it over ssh is hugely welcome. In fact, ssh can be used to tunnel practically any other protocol you care to name, so it should be your first port of call should you wish to connect to a remote system using an insecure protocol. (I use it to wrap rsync for example).

I don’t run X on my VMs (there is no need, they don’t run desktop software) and I had not previously seen the need to run X based graphical programs on those servers. However, a couple of days ago I thought it would be really useful to run etherape on one particular remote server so that I could watch the traffic patterns. Normally I use iptraf (which is ncurses based) when I want to monitor network traffic in real time, but etherape is pretty cool and gives a nice graphical view of your network connections. But it runs on an X based gui.

So. I changed the remote server’s sshd_config to enable X forwarding (“X11Forwarding no” becomes “X11Forwarding yes”) and restarted sshd. On my desktop I similarly changed my local ssh_config file to allow X forwarding (“ForwardX11 no” becomes “ForwardX11 yes”) to obviate the need to use the -X switch on the command line. I then installed etherape on the remote server and fired it up only to get the message “Error: no display specified”. Sure enough “echo $DISPLAY” showed nothing. But I had thought (and everything I had read confirmed) that ssh should take care of setting the appropriate display when X11 forwarding was set.

So I then tried setting a display manually (export DISPLAY=localhost:10.0 on the remote server) and then got the response “Error: cannot open display: localhost:10.0”. So, still no deal. I spent some time scratching my head (and reading man pages) and sent off a query to my local Linux User Group in parallel asking for advice. They were gentle with me.

The first, and rapid, response, said:

On the server:

sudo apt-get install xauth

Then disconnect and reconnect the client.

Jobs a good un.

Thank you Brett.

So the moral is, make sure that you have X authorisation working properly on the remote system (check for the existence of $HOME/.Xauthority) if you experience the same symptoms I did.
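The checks described above can be gathered into a small diagnostic to run on the remote server from inside the ssh session. This is a rough sketch rather than a definitive test: the function name and messages are invented for the example, and it only looks for the three symptoms mentioned in this post (no xauth binary, no DISPLAY, no .Xauthority file).

```python
import os
import shutil


def x_forwarding_problems(environ, home, xauth_path):
    """Return a list of likely reasons why X11 forwarding is not working.

    environ    -- mapping of environment variables (e.g. os.environ)
    home       -- the user's home directory
    xauth_path -- path to the xauth binary, or None if it is not installed
    """
    problems = []
    if xauth_path is None:
        problems.append("xauth is not installed")
    if not environ.get("DISPLAY"):
        problems.append("DISPLAY is not set")
    if not os.path.exists(os.path.join(home, ".Xauthority")):
        problems.append("no .Xauthority file")
    return problems


if __name__ == "__main__":
    found = x_forwarding_problems(
        os.environ, os.path.expanduser("~"), shutil.which("xauth"))
    for problem in found:
        print(problem)
    if not found:
        print("nothing obviously wrong")
```

An empty list does not prove that forwarding works (sshd_config and ssh_config still need X11 forwarding enabled), but a missing xauth shows up straight away.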

clarity is not a virtue

Friday, October 14th, 2011

Picking up my copy of the second edition I was reminded of the old obfuscated C contests. One of the earliest (anonymous) entries was this tribute to K&R’s famous printf("hello, world\n");

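------------------------------------------------------------------------------

int i;main(){for(;i["]<i;++i){--i;}"];read('-'-'-',i+++"hell\
o, world!\n",'/'/'/'));}read(j,i,p){write(j/p+p,i---j,i/i);}

------------------------------------------------------------------------------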

c is not known as flexible for nothing.

that looks like a scam to me

Monday, October 10th, 2011

The volume of spam backscatter I am receiving at the baldric.net domain currently runs at around 18-20,000 emails per month, nearly all of which is aimed at the info@ address I have mentioned before.

My mail server is currently configured to reject mail to non-existent users at the SMTP level with a permanent failure message like so: “550 5.1.1 : Recipient address rejected: User unknown in virtual mailbox table:”. Rejecting mail at this stage, rather than accepting it only to bounce it later, is the “correct thing to do”. This way the sending MTA gets a failure message in its logs and the mail it was attempting to send to me never leaves its queue. If the mail admin at the other end is in any way savvy, then he or she is given enough information to allow them to investigate and, perhaps, cure the problem. But of course that assumes two things: one, she /is/ savvy; and two, she actually cares enough to do anything.
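To see what the sending MTA actually has to work with, here is a minimal sketch that splits a reply line like the one above into its parts. The function name and the regular expression are my own illustration, not anything taken from postfix; the only facts relied on are that the first digit of an SMTP reply code gives its class (4 for a transient failure, 5 for a permanent one) and that an enhanced status code such as 5.1.1 may follow it.

```python
import re

# Three-digit reply code, an optional enhanced status code, then free text.
REPLY = re.compile(r"^(\d{3})[ -](?:(\d\.\d{1,3}\.\d{1,3}) )?(.*)$")

CLASSES = {
    "2": "success",
    "3": "intermediate",
    "4": "transient failure",
    "5": "permanent failure",
}


def parse_reply(line):
    """Split an SMTP reply line into (code, enhanced code, class, text)."""
    match = REPLY.match(line)
    if match is None:
        raise ValueError("not an SMTP reply line: %r" % line)
    code, enhanced, text = match.groups()
    return int(code), enhanced, CLASSES.get(code[0], "unknown"), text


example = ("550 5.1.1 : Recipient address rejected: "
           "User unknown in virtual mailbox table:")
print(parse_reply(example))
```

A 5xx reply tells the sender not to retry, which is why the message stays the sending server's problem rather than turning into a bounce generated at this end.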

Now there is nothing I can do about the second problem, but if there is any way I can provide additional information which might help the hard pressed admin understand why they might have a problem, then that would aid them, me, and any of the likely hundreds or thousands of other people out there who will be receiving crud in response to mails they didn’t send.

One possible way forward might be to add some additional information to the SMTP rejection message – something along the lines of “hey, you might have a configuration problem here, please consider investigating”. Now I dislike re-inventing wheels (and I’m lazy) so I spent a short while searching for possible modifications to my own postfix configuration which would do the trick. Sure enough, I quickly discovered backscatterer.org and its suggested modification to main.cf (though note that it assumes that postfix is using the dbm database library – not all of them do, particularly on the default debian install). Hey, that looks cool, so if I modify my configuration slightly I will be able to run a lookup against backscatterer’s DNSRBL and in cases of a hit I will send an SMTP reject message that looks like this: “554 5.7.1 Service unavailable; Client host [217.77.96.18] blocked using ips.backscatterer.org; Sorry 217.77.96.18 is blacklisted at http://www.backscatterer.org/?ip=217.77.96.18;” instead of the much less informative message above. Now the sysadmin at mx2.infopac.ru (217.77.96.18) will get a much more useful log message. Won’t they?

But hold on a moment, where does backscatterer.org get its RBL? Can I trust it? And am I being fair on the sending domain if I block all mail coming from there based on the simple fact that they are listed in some third-party RBL? That feels a little like SORBS to me. Turn the question around. Would I, as admin for the baldric.net domain (and a dozen others), be happy if mail from my domain to some servers were blocked because I had chosen to implement something like “sender callouts” (unlikely as that might be)? Worse, backscatterer.org “offers” to de-list any server from its database if you pay them 85 euros (OK, so that will only be about threepence halfpenny in a few weeks time when the eurozone finally tanks, but it is still extortion, whatever the actual sum).

So I think I’ll stay away from backscatterer – it looks like a scam to me. I’ll just have to find another way of telling my Russian sysadmin friends that their servers are “misconfigured”.

squeezing the slugs

Monday, September 26th, 2011

Debian 6 (squeeze) has been the current stable version since February 2011. The latest version (6.0.2) was released in late June. I have put off updating my slugs from lenny (old stable) for a while because I wanted to see how others fared before committing myself. Indeed, initial reports on the debian arm list indicated that the upgrade could be problematic. Worse, a completely clean install of squeeze turns out to be impossible because the debian installer uses more memory than is physically available on the slugs. So the only way to go, even for a clean new installation, is to install lenny first, then upgrade.

Given that both my slugs are operational, and are now an integral part of my network, I decided to invest in a new one as a development machine to test the upgrade rather than risk fritzing a perfectly good setup. (Back in the day I would have been happy to “fix it ’till it broke”, but these days I don’t really need to experiment that much and I’d rather keep a working system, well, working).

Second hand slugs go for around £25 on ebay, and there are still plenty about, so I bid for one that had only about a day to go and was successful. Unfortunately, when it turned up I found that the power supply was fsckd and so I had to switch off one of my operational slugs in order to test the new one. Happily it appeared to boot up OK so I fired off a disgruntled email to the seller and then ordered a new PSU. The seller claimed that it “worked OK when I boxed it” and didn’t offer to pay for the replacement PSU so I wasn’t too happy with him. I became even less happy when the new PSU arrived and I booted up the slug in preparation for reconfiguration to match my network before installing debian.

The debian installation process is handled via an SSH shell. You need to know the address of the slug in order to connect and install. The installer also needs the addresses of a local DNS server and the default route to the outside world (so it can find the servers containing the installation packages). Now the default factory settings for slugs include a fixed IP address of 192.168.1.77. If this does not match your requirements, it must be changed before reflashing with debian. Guess what? The default address didn’t work, so the previous owner must have reconfigured the slug to match his network and he had not bothered to reset to factory default before selling. Nor had he been considerate enough to let me know the new configuration. Needless to say I won’t be buying anything else from him. Nor did he get decent feedback.

I couldn’t reconfigure the new slug until I could connect so I needed to find out what address it was using. A quick nmap scan of the 192.168.1.0/24 netblock showed that it wasn’t even on the default network range so I fired up wireshark and etherape on one of my machines in the hope of catching the slug arping and getting the address from the request. In the event, etherape proved to be quicker (and easier) in providing the answer since the slug quickly popped up and disclosed its IP address as 192.168.2.10. Adding a route to the 192.168.2.0/24 net then allowed me to finally connect and reconfigure the new beast to suit my network. I then rebooted and started a fresh installation of lenny (as previously described in one of my earlier posts). About four hours later I had a nice new clean slug running lenny.
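Why the nmap scan drew a blank, and why an extra route was needed, is easy to check with Python's standard ipaddress module. This is just a worked illustration using the addresses quoted above; the helper name is made up.

```python
import ipaddress


def on_network(address, network):
    """True if the address falls inside the given network block."""
    return ipaddress.ip_address(address) in ipaddress.ip_network(network)


slug = "192.168.2.10"

# The factory-default netblock that was scanned with nmap.
print(on_network(slug, "192.168.1.0/24"))

# The netblock a route had to be added for.
print(on_network(slug, "192.168.2.0/24"))
```

The slug's address falls outside the first block and inside the second, so a machine configured only for 192.168.1.0/24 has no way to reach it until it is given a route to the 192.168.2.0/24 net.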

However, since the main purpose of getting the new slug was to allow me to test the upgrade to squeeze in safety I really needed to make it look like my main operational slug. That slug is my DNS and DHCP server, and primary rsync backup for my desktops. It also runs a webserver. Like most (lazy) sysadmins my system documentation tends to lag somewhat behind reality so I can’t rely on the various readme files I routinely create on my boxes to be completely up to date (or even accurate). Fortunately for me though, debian provides a neat way of snapshotting installed packages on a system. You can then use this snapshot to create a mirror of that system which will include all the same packages. Here’s how:

On the source system:

dpkg --get-selections | grep -v deinstall > packages.txt

This lists all active packages, except those deinstalled, and sticks the list in a text file.

Now copy that file to the target system, ensure that the target system’s “sources.list” file matches that on the source, and then run:

dpkg --clear-selections
dpkg --set-selections < packages.txt
apt-get dselect-upgrade

This will download and install all the packages necessary to get the target system matching the source.

All that is now left to do is copy across any relevant configuration files so that the two systems fully match and then reboot the target to check that everything looks OK.
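One way to confirm that the target really does match the source is to take a selections list on both machines and compare them. The sketch below assumes the usual two-column output of dpkg --get-selections (package name, then state); the function names and the sample package names are invented for the example.

```python
def active_packages(selections_text):
    """Parse `dpkg --get-selections` output, skipping deinstalled packages."""
    packages = set()
    for line in selections_text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[1] != "deinstall":
            packages.add(fields[0])
    return packages


def missing_on_target(source_text, target_text):
    """Packages active on the source system but not on the target."""
    return sorted(active_packages(source_text) - active_packages(target_text))


# Hypothetical sample data standing in for the two systems' lists.
source = "bind9\t\tinstall\nlighttpd\tinstall\nexim4\t\tdeinstall\n"
target = "bind9\t\tinstall\n"
print(missing_on_target(source, target))
```

In practice the two strings would be read from the text files saved on each system; an empty result means that nothing active on the source is missing from the target.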

New slug now finally matching old slug it was time to upgrade to squeeze. Martin Michlmayr’s excellent website documents the upgrade process in meticulous detail. The key point to note here is his recommendation that you read the release notes for debian 6.0. In particular, note and follow the chapter on upgrades from debian 5.0 before attempting an actual upgrade. One of the main differences between 5.0 and 6.0 is the use of UUIDs to reference disks. In my case this meant changing my /etc/fstab from this:

# /etc/fstab: static file system information.
#
#
proc /proc proc defaults 0 0
/dev/sda2 / ext3 errors=remount-ro 0 1
/dev/sda1 /boot ext2 defaults 0 2
/dev/sda5 none swap sw 0 0

to this:

# /etc/fstab: static file system information.
#
#
proc /proc proc defaults 0 0
UUID=db57451a-e3e5-4d8a-95b9-494c48bb5e8d / ext3 errors=remount-ro 0 1
UUID=022bc211-1c52-4848-9ee1-e211e72b28e4 /boot ext2 defaults 0 2
/dev/sda5 none swap sw 0 0
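The edit is small enough to make by hand, but as a sketch of what it amounts to, the function below swaps device names for UUIDs in fstab text. The function name is hypothetical, and the mapping is typed in by hand from the values shown above (on a real system they would come from something like blkid). Devices without an entry in the mapping, such as the swap partition here, are left alone.

```python
def use_uuids(fstab_text, uuid_by_device):
    """Replace device names in the first fstab column with UUID= references."""
    out = []
    for line in fstab_text.splitlines():
        fields = line.split()
        is_comment = line.lstrip().startswith("#")
        if fields and not is_comment and fields[0] in uuid_by_device:
            fields[0] = "UUID=" + uuid_by_device[fields[0]]
            line = " ".join(fields)
        out.append(line)
    return "\n".join(out) + "\n"


EXAMPLE_FSTAB = """# /etc/fstab: static file system information.
proc /proc proc defaults 0 0
/dev/sda2 / ext3 errors=remount-ro 0 1
/dev/sda1 /boot ext2 defaults 0 2
/dev/sda5 none swap sw 0 0
"""

EXAMPLE_UUIDS = {
    "/dev/sda2": "db57451a-e3e5-4d8a-95b9-494c48bb5e8d",
    "/dev/sda1": "022bc211-1c52-4848-9ee1-e211e72b28e4",
}

print(use_uuids(EXAMPLE_FSTAB, EXAMPLE_UUIDS), end="")
```

It works on a string rather than on /etc/fstab itself, so the result can be inspected before anything on disk is touched.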

Before finally starting the upgrade I opened two separate SSH sessions to the slug. In one I ran the upgrade process as below:

first a partial upgrade as recommended at Section 4.4.4, “Minimal system upgrade”

apt-get update
apt-get upgrade

then install the required linux kernel image and udev as outlined in Section 4.4.5, “Upgrading the kernel and udev”

apt-get install linux-image-2.6-ixp4xx
apt-get install udev

followed by

reboot

and

apt-get upgrade
apt-get dist-upgrade

to complete the system upgrade.

Now here is where the second SSH session is most useful. The final upgrade and distribution upgrade installs the file indexing package “apt-xapian-index”. Correspondents on the debian arm list have noted that this package consumes more memory than is available on the poor old slug and it starts swapping itself to death. The process must be killed immediately and the package removed. If you leave it too long after the upgrade has completed you will find it impossible to log in until the initial indexation has completed (in excess of 24 hours has been reported) because the system is too busy. I ran “top” in the second shell during the dist-upgrade process and kept an eye on the load averages. As soon as they started climbing above 3 I knew that it was time to watch out for the apt-xapian-indexer and kill it. Once the system load returned to normal I was then able to finalise the upgrade with:

apt-get purge apt-xapian-index
apt-get autoremove

to remove the offending indexer and clean up any residual unneeded packages. A final reboot to check all was well was sufficient to convince me that it was safe to upgrade my two operational slugs using the same process. Testing the upgrade on the new slug in the way I did also meant that I now had a backup slug configured exactly like my main DNS server but running squeeze. Any failure on the remaining upgrade would not then be critical.

I must be getting old. I never used to be this cautious.

no police here

Tuesday, February 1st, 2011

The UK Home Office launched a new crime statistics website today at www.police.uk. The site is supposed to show “Local crime and policing information for England and Wales”.

I’m not entirely convinced of the merit of the site in the first place (and can see all sorts of potential objections arising in some of the more rabid tabloid newspapers), but I thought I would try it out before making any form of judgement of my own. Unfortunately I’m not impressed.

The opening page of the new service invites the user to “Enter your postcode, town, village or street into the search box below, and get instant access to street-level crime maps and data, as well as details of your local policing team and beat meetings.”

I have tried various combinations of the suggestions, scaling outwards and upwards from my precise postcode to the whole of that part of the County in which I live. I was not reassured to get the following message:

[screenshot of www.police.uk website]

Discussion elsewhere on the ‘net suggests that this result is not unusual. It appears to be a badly worded (or badly coded) response to an error condition resulting from system overload following the launch. At least I sincerely hope that is the case and we are not really completely devoid of policing services in the whole of South Norfolk.

Examination of the HTML source for the webpage generated suggests that the service is running on Amazon’s Web Services. Certainly some of the pages are retrieved from S3 servers, and the IP address of the site appears to be on Amazon’s AWS (see dig and whois results below *). If the site is, as it appears to be, cloud based, then either the supplier (Rock Kitchen Harris, Leicester) or the Home Office has seriously undersized the requirement. Regardless of who is at fault here, there is an evident need to pull in some more resource pretty quickly. This should be a good test of the much vaunted flexibility of cloud based services such as Amazon’s EC2. I expect the service to be running quickly and cleanly by this time tomorrow.

* dig www.police.uk returns:

; <<>> DiG 9.6-ESV-R3 <<>> www.police.uk
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 10557
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;www.police.uk. IN A

;; ANSWER SECTION:
www.police.uk. 1251 IN CNAME policeuk-167782603.eu-west-1.elb.amazonaws.com.
policeuk-167782603.eu-west-1.elb.amazonaws.com. 60 IN A 46.137.113.146

;; Query time: 268 msec
;; SERVER: 80.68.80.24#53(80.68.80.24)
;; WHEN: Tue Feb 1 14:16:23 2011
;; MSG SIZE rcvd: 107

and a whois lookup of 46.137.113.146 returns:

% Information related to '46.137.0.0 - 46.137.127.255'

inetnum: 46.137.0.0 - 46.137.127.255
netname: AMAZON-EU-AWS
descr: Amazon Web Services, Elastic Compute Cloud, EC2, EU