Mellan mardröm & verklighet!: Notes on backing up entire hard disks or partitions

Tuesday 28 August 2012

It's often useful to make an image of either an entire hard disk or an entire partition. One reason is to duplicate an installed system onto another PC (probably over a network connection); another is to make a backup of your complete hard disk including every aspect of the installed operating systems, which you can restore if you have to replace your hard disk or if you screw things up. Typically it's useful to be able to transfer these images over the network to another machine, although you may want to save images onto a different partition or hard disk.

Commercial tools to do this job include Norton Ghost, Acronis TrueImage (which now seems to have overtaken Ghost in usefulness), and DriveImage (which I believe can't save over the network). Nowadays these tools are quite sophisticated and can even work from within Windows on mounted filesystems and do incremental block-level backups. There is an open source Linux program called partimage which is similar to Ghost, but I prefer to make backups using basic tools which I know will always be to hand, and in a pure format which I understand.
My preferred solution in some situations is to use raw linux commands. The backup technique uses linux, but you don't have to have linux installed on your computer to do this, and you can use this technique to back up partitions containing any filesystem.

IMPORTANT NOTE: I can offer no guarantees whatsoever that the methods detailed on this page are correct or reliable. I've used them on a few occasions without problems, but I have not exhaustively tested this method, and it could be that in some situations it does not perform as expected. Also, these instructions are designed to give pointers and suggestions. They do not comprise step-by-step instructions which you can blindly follow without understanding them. Many of the commands described here are very likely to trash the contents of your hard disk if you don't understand them properly. Use at your own risk!

Copying partitions under linux

When making an image of a disk/partition, you don't want the drive contents changing under you, so the partition(s) must be either unmounted or mounted read-only. The latter possibility means that you can probably drop down to single-user mode and remount essential partitions read-only in order to back up a linux system. If you do this, make sure you have a way of restoring the backup which doesn't depend on already having the OS installed!
A more general method is to boot from CDROM into linux without using the hard disk at all, for example using Knoppix. It's brilliant at autodetecting hardware.
Once you have a linux running, the basic technique is to use the dd command to read the hard disk device (or one of its partitions). The output of dd can be written to a file (perhaps to an external hard disk you have mounted) or piped over the network to another instance of dd on a remote machine. Note that there is a neater way of backing up NTFS partitions: see below.
Example:
dd if=/dev/hda1 bs=1k conv=sync,noerror | gzip -c | ssh -c blowfish user@hostname "dd of=filename.gz bs=1k"
This instructs dd to read the contents of /dev/hda1 (the first partition). conv=sync,noerror tells dd that if it can't read a block due to a read error, then it should at least write something to its output of the correct length. Even if your hard disk exhibits no errors, remember that dd will read every single block, including any blocks which the OS avoids using because it has marked them as bad. So don't be too surprised if dd seems to struggle to read some blocks. (But see the next section for a better way of handling this situation).
bs=1k sets the block size to be 1k. I'm not quite sure what the optimal value is, but it needs to be no larger than the block size for the disk, otherwise a bad block may mask the contents of a good one. 1k is a safe bet.
In the above example the output of dd is piped through gzip to compress it. We then pipe the compressed data stream over an ssh connection to another linux machine (which may also be running Knoppix - see Knoppix notes below). If you wanted to write straight to a local file, you could either just add of=filename to the first dd command (to write an uncompressed image), or if you want to compress it, just redirect the output of the gzip to a filename.
Continuing with our explanation, the -c blowfish option to ssh selects blowfish encryption which is much faster (useful since we're sending tons of data) than the default. (Recent versions of OpenSSH no longer support blowfish; if ssh rejects the option, leave out -c blowfish.) Finally another dd command is invoked on the remote machine to read the data stream and write it to a file there. Alternatively you could pipe it through gunzip -c and write it straight to a partition on the remote machine instead of to a file.
Note that, as long as it's not compressed, you should be able to mount a file containing a single partition's image using a loopback device in linux. (With a little more jiggery-pokery to find the correct offset, you can also mount partitions within a whole-disk image; see here).
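As a rough illustration of the local-file variant described above, here is a minimal sketch that can be run safely. It assumes an ordinary scratch file (fake-partition.img, a made-up name) standing in for /dev/hda1, so no real disk is touched. It images the "partition" through gzip, then checks that the decompressed image is identical to the source.

```shell
#!/bin/sh
# Scratch demo: an ordinary file stands in for /dev/hda1 (assumption),
# so nothing on a real disk is read or written.
work=$(mktemp -d)
src="$work/fake-partition.img"   # hypothetical stand-in for the partition

# Create a 1 MiB "partition" filled with arbitrary data.
dd if=/dev/urandom of="$src" bs=1k count=1024 2>/dev/null

# Image it as in the example above, but send the compressed stream to a local file.
dd if="$src" bs=1k conv=sync,noerror 2>/dev/null | gzip -c > "$work/image.gz"

# Verify: decompress the image and compare it byte for byte with the source.
if gunzip -c "$work/image.gz" | cmp -s - "$src"; then
    backup_ok=yes
else
    backup_ok=no
fi
echo "image matches source: $backup_ok"

rm -rf "$work"
```

On a real system you would replace the scratch file with the unmounted (or read-only) partition device, and keep the image somewhere other than the disk being backed up.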

What to do if the disk is damaged and dd takes forever?

Tobias Wolf pointed out "dd_rescue, which deals a lot better with bad blocks than plain dd. A useful helper script to dd_rescue is dd_rhelp, which postpones re-reading bad blocks to the very end because it can be a struggle to retrieve something from them".
Steve Holmes reports that dd with conv=sync,noerror doesn't correctly image disks with LVM2 Logical Volumes. I haven't investigated this. He also points out GNU ddrescue (not the same as dd_rescue mentioned above) which looks useful. According to Steve, ddrescue works fine with LVM2, and some people seem to suggest it's generally superior to dd_rescue.
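For what it's worth, a common way of driving GNU ddrescue is in two passes that share a map file, so the second pass only revisits the areas the first one could not read. The sketch below wraps that in a helper function. The name rescue_partition and the file names in the usage line are my own inventions, and I have not tested this against a genuinely failing disk. Defining the function does nothing until you call it.

```shell
#!/bin/sh
# Sketch only: rescue_partition is a hypothetical helper name.
# Nothing is read or written until the function is called.
# Usage: rescue_partition /dev/hda1 hda1.img hda1.map
rescue_partition() {
    src=$1; img=$2; map=$3
    # Pass 1: copy everything that reads easily and skip the slow
    # scraping of bad areas (-n).
    ddrescue -n "$src" "$img" "$map" || return 1
    # Pass 2: return to the bad areas using direct disc access (-d),
    # retrying them up to 3 times (-r3).
    ddrescue -d -r3 "$src" "$img" "$map"
}
```

Because progress is recorded in the map file, an interrupted run can be restarted with the same arguments and should carry on where it left off.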

Restoring partitions

The restore procedure is fairly similar. For example, on the machine with the image on it, you might do something like:
dd if=filename.gz | ssh -c blowfish root@deadhost "gunzip -c | dd of=/dev/hda1 bs=1k"
This assumes you have linux (e.g. Knoppix) running on the target machine with an ssh server running. See 'Knoppix tips', below. Note that you should not include conv=sync,noerror in the restore dd. Doing so can, in certain situations, corrupt the data being written: data arriving from the network or a pipe often comes in pieces smaller than a whole block, and conv=sync makes dd pad each such short read with zeros instead of waiting for the rest of the block.
The partition needs to already exist before you do this, and needs to be large enough to take all the data. If it's too big, that doesn't matter, you'll just be wasting space at the end. You should then be able to grow the filesystem to fill that extra space. For ext2 filesystems, try using the ext2resize tool. You may also be able to persuade the partition editing tool parted to do this, since it can handle resizing most filesystems.
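The following sketch tries out the restore step on scratch files rather than real partitions, and also shows the padding behaviour that makes conv=sync dangerous when dd reads from a pipe. The file names are made up, and conv=notrunc is there only because the stand-in target is a regular file, which dd would otherwise truncate.

```shell
#!/bin/sh
# Scratch demo: plain files stand in for the image and the target partition (assumption).
work=$(mktemp -d)

# A 1 MiB "original partition" and its compressed image.
dd if=/dev/urandom of="$work/orig.img" bs=1k count=1024 2>/dev/null
gzip -c "$work/orig.img" > "$work/image.gz"

# A 2 MiB "target partition", larger than the image.
dd if=/dev/zero of="$work/target.img" bs=1k count=2048 2>/dev/null

# Restore without conv=sync,noerror. conv=notrunc keeps the regular file
# at its full size; a real block device cannot be truncated anyway.
gunzip -c "$work/image.gz" | dd of="$work/target.img" bs=1k conv=notrunc 2>/dev/null

# The first 1 MiB of the target should now equal the original.
if head -c 1048576 "$work/target.img" | cmp -s - "$work/orig.img"; then
    restore_ok=yes
else
    restore_ok=no
fi
target_size=$(wc -c < "$work/target.img" | tr -d ' ')

# Why conv=sync is harmful on a pipe: a short read of 3 bytes
# is padded with zeros to a full 8-byte block.
padded_len=$(printf 'abc' | dd bs=8 conv=sync 2>/dev/null | wc -c | tr -d ' ')

echo "restore ok: $restore_ok, target size: $target_size, padded length: $padded_len"
rm -rf "$work"
```

The space beyond the first 1 MiB of the target is left alone, which corresponds to the wasted space at the end of an oversized partition.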

Copying an entire hard disk

You can simply use "/dev/hda" as the source (target) to back up (restore) an entire hard disk image, including partition table, MBR and all partitions. This will certainly work if the hard disk being written to is identical to the one the image was made from. I think it will generally also work in other situations as long as the destination disk is larger than the source. I'm not 100% sure that this is always true - partition table entries do contain information in C/H/S (i.e. disk geometry dependent) units, which may be used by the boot loader even if the OS uses LBA. (Links to some gory details about partition table formats are at the end of this page).
If you do make an image of a whole disk, I strongly recommend that you also store extra information about the drive geometry which is necessary in order to interpret the partition table stored within the image, should you need to do that. The most important thing is the cylinder size. Best thing is just to grab a copy of the information fdisk can tell us: fdisk -l /dev/hda > hda_fdisk_information. Keep that file with the image. For good measure, why not get the same information as sfdisk displays it: sfdisk -d /dev/hda > hda_sfdisk_information
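To show why the saved partition information matters, here is a small sketch of the arithmetic for locating a partition inside a whole-disk image. It builds a tiny fake disk image from scratch files. The start sector of 63 and the sizes are assumed values for the demonstration; with a real image you would read the start sector from the fdisk or sfdisk output you saved.

```shell
#!/bin/sh
# Scratch demo: build a tiny fake "whole disk" so the arithmetic can be checked safely.
# start_sector=63 is an assumed value; take the real one from your saved fdisk/sfdisk output.
work=$(mktemp -d)
sector_size=512
start_sector=63
part_sectors=100

# Fake partition contents, then a fake disk: 63 sectors of zeros followed by the partition.
dd if=/dev/urandom of="$work/part.img" bs=$sector_size count=$part_sectors 2>/dev/null
dd if=/dev/zero of="$work/disk.img" bs=$sector_size count=$start_sector 2>/dev/null
cat "$work/part.img" >> "$work/disk.img"

# Byte offset of the partition inside the disk image.
offset=$((start_sector * sector_size))
echo "partition starts at byte offset $offset"

# Pull the partition back out of the whole-disk image.
dd if="$work/disk.img" of="$work/extracted.img" bs=$sector_size \
   skip=$start_sector count=$part_sectors 2>/dev/null

if cmp -s "$work/part.img" "$work/extracted.img"; then
    extract_ok=yes
else
    extract_ok=no
fi
echo "extracted partition matches: $extract_ok"
rm -rf "$work"
```

The same byte offset is what you would pass to a loopback mount of the whole-disk image, for example as the offset= option alongside loop in mount -o.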

Knoppix tips

To become root in knoppix, just use sudo su -
If you are using Knoppix as a destination machine in one of these examples, you'll need to start up its ssh server. A command to do so is on the KDE menus; otherwise /etc/init.d/sshd start (as root) should do the trick. You'll then need to set a password for root so you can login remotely (sudo passwd root).
Knoppix tries to acquire an IP address by DHCP, so if you have a DHCP server on your network you can just find out the IP addresses of the machines (e.g. with ifconfig) and use those in place of hostnames in the ssh commands. If you don't have a DHCP server running on your network (or you've connected two machines directly together by crossover cable), you should be able to manually assign IP addresses using something like ifconfig eth0 192.168.x.y.

Reducing the storage space required

One of the disadvantages of the dd method over software specifically designed for the job such as Ghost or partimage is that dd will store the entire partition, including blocks not currently used to store files, whereas the likes of Ghost understand the filesystem and don't store these unallocated blocks. The overhead isn't too bad as long as you compress the image and the unallocated blocks have low entropy. In general this will not be the case because the empty blocks will contain random junk from bygone files. To rectify this, it's best to blank all unused blocks before making the image. After doing that, the unallocated blocks will contain mostly zeros and will therefore compress down to almost nothing.
A slightly clunky way to do this is to mount the partition, then create a file of zeros which fills the entire disk, then delete it again. e.g:
dd if=/dev/zero of=delme bs=8M; rm delme
It's worth doing this regardless of the type of filesystem in use. But don't do it if you suspect the filesystem may be corrupt, as you'll lose the ability to 'recover' lost files.
At the time of writing there is no reliable write support in the Linux NTFS drivers, so you won't be able to use this zeroing technique on an NTFS partition. However, there are other ways (see below) of efficiently backing up NTFS partitions.
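To get a feel for how much difference blanking makes, this sketch compresses 1 MiB of random data and 1 MiB of zeros and prints the resulting sizes. The random data is only my stand-in for leftover junk in unallocated blocks, which on a real disk will usually be somewhat more compressible than pure noise.

```shell
#!/bin/sh
# Scratch demo: compare how well "junk" and zeroed blocks compress.
work=$(mktemp -d)

# 1 MiB of random data stands in for leftover junk in unallocated blocks (assumption).
dd if=/dev/urandom of="$work/junk" bs=1k count=1024 2>/dev/null
# 1 MiB of zeros stands in for the same blocks after blanking.
dd if=/dev/zero of="$work/zeros" bs=1k count=1024 2>/dev/null

junk_gz=$(gzip -c "$work/junk" | wc -c | tr -d ' ')
zeros_gz=$(gzip -c "$work/zeros" | wc -c | tr -d ' ')
echo "compressed junk:  $junk_gz bytes"
echo "compressed zeros: $zeros_gz bytes"
rm -rf "$work"
```

The random data should come out at roughly its original size, while the zeros shrink to a tiny fraction of it.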

Backing up the MBR (Master Boot Record)

If you save copies of some or all of your partitions individually, and want to be able to use them to restore a working system, you'll also need to back up and restore the MBR and partition table. (The same caveats apply as discussed above about whether partition tables can be restored onto a disk of a different size).

Backing up the MBR:

dd if=/dev/hda of=backup-of-hda-mbr count=1 bs=512
This stores the first 512 bytes of the disk (containing the MBR and the primary partition info - i.e. the first four primary entries) into the file "backup-of-hda-mbr" which you can then copy to somewhere safe.
To restore (be careful - this could destroy your existing partition table and with it access to all data on the disk):
dd if=backup-of-hda-mbr of=/dev/hda
If you only want to restore the actual MBR code and not the primary partition table entries, just restore the first 446 bytes: dd of=/dev/hda if=backup-of-hda-mbr bs=446 count=1. (Those first 512 bytes are 446 bytes of MBR code, then 64 bytes of primary partition table, then a 2-byte boot signature).
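The partial restore can be rehearsed on scratch files before going anywhere near a real disk. In this sketch a zero-filled file stands in for the disk and a file of random bytes stands in for the MBR backup (both made-up names). It restores only the first 446 bytes, then checks that the 66 bytes after them were left alone. conv=notrunc is needed here because the stand-in disk is a regular file.

```shell
#!/bin/sh
# Scratch demo: files stand in for the disk and the MBR backup (assumption).
work=$(mktemp -d)
disk="$work/fake-disk.img"
backup="$work/backup-of-mbr"

# Fake 4 KiB disk full of zeros, and a fake 512-byte MBR backup full of random bytes.
dd if=/dev/zero of="$disk" bs=512 count=8 2>/dev/null
dd if=/dev/urandom of="$backup" bs=512 count=1 2>/dev/null

# Restore only the 446 bytes of boot code. conv=notrunc stops dd truncating
# the regular file; it makes no difference on a real device.
dd if="$backup" of="$disk" bs=446 count=1 conv=notrunc 2>/dev/null

# Bytes 0-445 of the disk should now match the backup...
head -c 446 "$backup" > "$work/code"
if head -c 446 "$disk" | cmp -s - "$work/code"; then
    code_ok=yes
else
    code_ok=no
fi

# ...and bytes 446-511 (partition table plus signature) should still be zeros.
# Count the non-zero bytes in that range.
table_bytes=$(dd if="$disk" bs=1 skip=446 count=66 2>/dev/null | tr -d '\000' | wc -c | tr -d ' ')
disk_size=$(wc -c < "$disk" | tr -d ' ')

echo "code restored: $code_ok, non-zero table bytes: $table_bytes, disk size: $disk_size"
rm -rf "$work"
```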

Backing up the extended partition table

sfdisk -d /dev/hda > backup-hda.sfdisk
(sfdisk is in the util-linux package. I think it's in Knoppix.)
Restore (ditto the above warning):
sfdisk /dev/hda < backup-hda.sfdisk
(then reboot)
I also recommend making a record of the partition table details as displayed in whatever disk partitioning program you like to use, and also as displayed by fdisk -l. Could be handy if you find yourself needing to repartition the disk by hand in preparation for restoring some images of individual partitions.

Linux swap partitions

Somebody asked me how a linux system knows which partitions to use as swap partitions - this is of course an issue if you're restoring a linux partition to a new disk and need to re-create the swap partition for it. The answer is that the swap partition should be listed in /etc/fstab, e.g.:

# <file system> <mount point>   <type>  <options>   <dump>  <pass>          
/dev/hda6       none            swap    sw          0       0

During bootup, one of the init scripts runs swapon -a which activates all swap partitions listed in /etc/fstab.
I imagine that if a swap partition listed in /etc/fstab isn't actually found to be of the correct partition type (or isn't initialised), then the system will ignore it. So it should be safe to boot the restored system even if the swap partition information in /etc/fstab is incorrect, and then amend it to point to a swap partition you've created and run swapon -a.
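If you want to check which swap partitions a restored system expects, something like the following sketch will list them. It runs against a made-up sample file so it can be tried anywhere; on the restored system you would point awk at the real /etc/fstab instead.

```shell
#!/bin/sh
# Sketch: list the swap devices named in an fstab-style file.
# The sample below is made up for illustration; on a real system use /etc/fstab.
sample_fstab=$(mktemp)
cat > "$sample_fstab" <<'EOF'
# <file system> <mount point>   <type>  <options>   <dump>  <pass>
/dev/hda1       /               ext3    defaults    0       1
/dev/hda6       none            swap    sw          0       0
EOF

# Skip comment lines, then print the device field of every entry whose type is "swap".
swap_devices=$(awk '$1 !~ /^#/ && $3 == "swap" { print $1 }' "$sample_fstab")
echo "swap entries: $swap_devices"
rm -f "$sample_fstab"
```

A freshly created swap partition also needs initialising (mkswap on the new partition) before swapon -a will be able to use it.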

Backing up and restoring NTFS partitions

The ntfsclone program from the Linux NTFS tools can efficiently clone and restore NTFS partitions without storing the unused space. You can either create a sparse file (if saving an image to a filesystem which supports sparse files), which will appear to have the size of the whole filesystem but will only take up about as much disk space as the used parts of the NTFS filesystem, or you can create a special image file which only contains the used blocks and which ntfsclone can restore later if needed.
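Here is a sketch of how the two approaches might be wrapped up as helper functions. The function names and the file names in the usage comments are hypothetical, and the options are the ones shown in the ntfsclone manual page as far as I know it, so check your installed version. Defining the functions runs nothing; the restore helper overwrites the target partition when called.

```shell
#!/bin/sh
# Sketch only: these helper names are hypothetical.
# Nothing is read or written until a function is called.

# Save only the used blocks, in ntfsclone's special image format, compressed with gzip.
# Usage: ntfs_save /dev/hda1 hda1.ntfsclone.gz
ntfs_save() {
    ntfsclone --save-image --output - "$1" | gzip -c > "$2"
}

# Restore such an image onto an existing partition (this overwrites the partition).
# Usage: ntfs_restore hda1.ntfsclone.gz /dev/hda1
ntfs_restore() {
    gunzip -c "$1" | ntfsclone --restore-image --overwrite "$2" -
}

# Alternative: clone to a sparse file which has the apparent size of the filesystem.
# Usage: ntfs_sparse_clone /dev/hda1 hda1.img
ntfs_sparse_clone() {
    ntfsclone --output "$2" "$1"
}
```

The special image format is the better choice when the image has to pass through a pipe or live on a filesystem without sparse file support.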

Alternative to ssh

If you're on a trusted network, you might think it silly to go to the trouble of encrypting and decrypting all that data with ssh. Yet these days your machines are probably not configured to allow rsh connections. nc (a.k.a. netcat) is one option here: it just sends raw streams of data across a network. On the destination machine you could run nc -l -p 10001 > imagefile to start a process which will listen on TCP port 10001 and dump everything it receives from the first thing to connect to it to imagefile. Then on the source machine, pipe the output of dd (or gzip, or whatever) to nc remote 10001 where remote is the name or IP address of the destination machine.

File-level backup techniques

Backing up entire partitions or disks is most useful when replicating systems across hard disks or when backing up partitions containing operating systems which are otherwise hard to fully back up. For more everyday backups of your data, a file-level method is more appropriate. Here are some handy tools for doing backups over a network to a remote machine in various clever ways:

Random technical details

Anders Lennartsson sent me this information, which doesn't directly affect any of the above notes, but could be useful to know if trying to reconstruct partition tables by hand or whatnot:

Sectors on harddrives normally have 512 bytes each. To find out how many sectors there are in one partition one can use for instance cfdisk. After changing the units by typing a "u" cfdisk reports the number of such sectors. However, for obscure historic reasons, the partition data doesn't start until sector 64. Thus the numbers of sectors obtained by dd when specifying for instance if=/dev/hda1 is the number of sectors seen in cfdisk for this partition, minus 63. I don't remember exactly how (linux) fdisk does report the numbers.

Page about recovering disks containing ext2 filesystems when you don't know the partition table.
Details of how partition tables are stored: here and here.
Program which tries to find partitions even if partition table is damaged: TestDisk.
Mounting partitions within an image of a whole disk using loopback devices in linux

Seb Wills, February 2004. Last amended October 2006. Feedback welcome.
