touching update

I have recently upgraded the internal disk on my main desktop from 1TB to 2TB. I find it vaguely astonishing that I should have needed to do that, but I do have a rather large store of MP4 videos, JPEG photos and audio files held locally. And disk prices are again coming down so the upgrade didn’t cost too much. One noticeable improvement following the upgrade is the reduction in noise. The disk I chose is one of Western Digital’s “Green Desktop” range which is remarkably quiet. Thoroughly recommended. But the point of this post is the consequence of the upgrade.

In order to minimise disruption after installation of the new disk (and of course a fresh install of Mint) I simply slotted the new disk into a spare slot in the PC chassis and hooked it up to the SATA port used by the old disk. I then hooked the old disk up to a separate spare SATA port. (I could, of course, have changed the boot order in BIOS to achieve the same effect.) Having installed Mint, I then rebooted the machine from the old disk for one final check of my old configuration before copying my data from old-disk/home/mick to new-disk/home/mick. Despite the fact that my data occupied over 900GB, the copy went reasonably quickly and painlessly – one of the advantages of a disk to disk copy over SATA, even if it is only SATA 2.0 (3Gb/s) – (Note to self. Next build should include SATA 3.0).

However, what happened next certainly wasn’t quick. In my haste to copy my old data and get back to using my PC, I stupidly forgot to preserve the file attributes (-p or -a switch) in my recursive cp. This meant of course that all my files on the new disk now had the time of the copy as their modification time. Worse, I didn’t notice until I came to back up my desktop to my NAS. I do this routinely on a daily basis using rsync in a script like so:

/usr/bin/rsync -rLptgoDvz --stats --exclude-from=rsync-excludes /home/mick nas:/home/mick-backup

Guess what? Since all my desktop files now had a current modification time, rsync seemed to want to recopy them all to the NAS. This was going to take a /lot/ longer than a local cp. So I killed it so that I could figure out what had gone wrong (that didn’t take long when I spotted the file timestamps) and could find a simple fix (that took longer).
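For anyone who wants to see the effect of the missing switch without risking real data, here is a minimal sketch in a throwaway directory. It assumes GNU coreutils (for touch -d and stat -c); the directory and file names are made up for illustration.

```shell
#!/bin/sh
# Throwaway demo: compare a plain "cp -r" with "cp -a" on a backdated file.
# All names here are hypothetical; nothing outside the temp directory is touched.
demo=$(mktemp -d)
mkdir "$demo/src"
echo "hello" > "$demo/src/file.txt"

# Backdate the source file so the difference is easy to spot.
touch -d '2010-01-01 12:00:00' "$demo/src/file.txt"

cp -r "$demo/src" "$demo/plain"     # copy gets the time of the copy
cp -a "$demo/src" "$demo/archive"   # copy keeps the original mtime

# %Y prints the modification time in seconds since the epoch (GNU stat).
src_mtime=$(stat -c %Y "$demo/src/file.txt")
plain_mtime=$(stat -c %Y "$demo/plain/file.txt")
archive_mtime=$(stat -c %Y "$demo/archive/file.txt")

echo "source: $src_mtime"
echo "cp -r : $plain_mtime"
echo "cp -a : $archive_mtime"

rm -rf "$demo"
```

The cp -a line should report the same value as the source, while the cp -r line reports the time the copy was made.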

Now I had thought that rsync was smart enough to realise that the source and destination files were actually the same, regardless of the file timestamp change. Realising that I didn’t actually know what I /thought/ I knew about rsync I explained to colleagues on the ALUG mailing list what I had done and sought advice. They didn’t laugh (well not publicly anyway) and a couple of them offered very helpful suggestions to sort my problem. Wayne Stallwood first pointed out that “the default behaviour for rsync is to only compute file differences via the checksums if the modification dates between source and destination indicate the file has changed (source is newer than the destination) So it’s not actually recopying everything (though if things like permissions have changed on the source files they will now overwrite those on the target and naturally the timestamps will be updated). Previously when you ran your backup it would have just skipped anything that had the same or older timestamp than the target.”

So, what I saw as a possibly very long re-copy exercise was actually rsync comparing files and computing checksums. Needless to say, that computation on a fairly low powered NAS was going to take a long time anyway. And besides, I didn’t /want/ the timestamps of my backups all changed to match the (now incorrect) timestamps on my desktop. I wanted the original timestamps restored.

Then Mike Dorrington suggested that I could simply reset the timestamps with a “find -exec touch” approach. As Mike pointed out, touch could be made to use the timestamp of one file (in this case my old original file on the 1TB disk) to update the timestamp of the target file without any need for copying whatsoever. But I confess I couldn’t at first see how I could get touch to recurse down one filetree whilst find recursed down another. Mark Rogers put me out of my misery by suggesting the following:

cd /old-disk/home/mick

find . -exec touch /new-disk/home/mick/\{\} --reference=\{\} --no-dereference --no-create \;

I think this is rather elegant and it certainly would not have occurred to me without prompting.
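The same idea can be tried safely on two small trees before letting it loose on a real home directory. What follows is only a sketch, assuming GNU find, touch and stat; the "old-disk" and "new-disk" directories are created under a temp directory and the file names are invented. The final loop, which is my own addition rather than part of Mark's suggestion, lists any file whose modification time still differs between the two trees.

```shell
#!/bin/sh
# Throwaway demo of the find/touch fix on two small, made-up trees.
demo=$(mktemp -d)
old="$demo/old-disk"
new="$demo/new-disk"

mkdir -p "$old/photos"
echo "a" > "$old/photos/a.jpg"
echo "b" > "$old/notes.txt"
touch -d '2009-06-15 08:30:00' "$old/photos/a.jpg"
touch -d '2011-11-02 17:45:00' "$old/notes.txt"

# Reproduce the mistake: a recursive copy without -p or -a.
cp -r "$old" "$new"

# The fix: walk the old tree and stamp the matching path in the new tree.
# --no-create skips paths missing from the new tree instead of creating them;
# --no-dereference stamps a symlink itself rather than its target.
cd "$old" || exit 1
find . -exec touch --no-create --no-dereference --reference={} "$new"/{} \;

# Verification (assumes no newlines in file names): print files whose
# mtime, in whole seconds, still differs between the two trees.
mismatches=$(find . -type f | while read -r f; do
    [ "$(stat -c %Y "$f")" = "$(stat -c %Y "$new/$f")" ] || echo "$f"
done)
echo "mismatched files: ${mismatches:-none}"

cd / && rm -rf "$demo"
```

Because find runs from inside the old tree, every path it prints is relative, and so the same string names both the reference file and, once prefixed with the new tree's root, the file to be stamped.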


1 comment

    • Peter on 2013/03/05 at 7:32 pm

    Interesting – I too had a disk change at hand. I tend to get nervous when I hear bearings, and I just had the impression the Seagate in my Macbook Pro was getting too loud for comfort.

    At least, that’s my excuse and I’ll stick to it – it happened to coincide with a significant drop in price of a replacement I had my eye on for quite some time: the Seagate Momentus XT hybrid drive. It’s a 750GB harddisk with an additional 8GB SSD cache which the drive automatically maintains by analysing what you use most often (I noticed a report in The Register today which explains why the price dropped: Seagate is bringing out a new range).

    Basically, you get a drive that for normal OS use responds faster once it has learned what you do, at a price only marginally above a non “enhanced” drive.

    I’ve now used it for a couple of weeks, and I can only recommend it. It goes like greased wombats.

    The only snag was that I did not have a Mac which could support two disks, but I did have a USB 2 cradle. I resolved to do that restore overnight, because it’s otherwise equivalent to watching paint dry…
