It's often useful to make an image of either an entire hard disk or an entire partition. One reason is to duplicate an installed system onto another PC (probably over a network connection); another is to make a backup of your complete hard disk including every aspect of the installed operating systems, which you can restore if you have to replace your hard disk or if you screw things up. Typically it's useful to be able to transfer these images over the network to another machine, although you may want to save images onto a different partition or hard disk.
Commercial tools to do this job include Norton Ghost, Acronis TrueImage (which now seems to have overtaken Ghost in usefulness), and DriveImage (which I believe can't save over the network). Nowadays these tools are quite sophisticated and can even work from within Windows on mounted filesystems and do incremental block-level backups. There is an open source Linux program called partimage which is similar to Ghost, but I prefer to make backups using basic tools which I know will always be to hand, and in a pure format which I understand.
My preferred solution in some situations is to use raw linux commands. The backup technique uses linux, but you don't need to have linux installed on your computer to do this, and you can use this technique to back up partitions containing any filesystem.
IMPORTANT NOTE: I can offer no guarantees whatsoever that the methods detailed on this page are correct or reliable. I've used them on a few occasions without problems, but I have not exhaustively tested this method, and it could be that in some situations it does not perform as expected. Also, these instructions are designed to give pointers and suggestions. They do not comprise step-by-step instructions which you can blindly follow without understanding them. Many of the commands described here are very likely to trash the contents of your hard disk if you don't understand them properly. Use at your own risk!
Copying partitions under linux
When making an image of a disk/partition, you don't want the drive contents changing under you, so the partition(s) must be either unmounted or mounted read-only. The latter possibility means that you can probably drop down to single-user mode and remount essential partitions read-only in order to back up a linux system. If you do this, make sure you have a way of restoring the backup which doesn't depend on already having the OS installed!
A more general method is to boot from CDROM into linux without using the hard disk at all, for example using Knoppix. It's brilliant at autodetecting hardware.
Once you have a linux running, the basic technique is to use the dd command.
This instructs dd to read the contents of /dev/hda1 (the first partition).
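As a rough sketch of the kind of pipeline meant here (not necessarily the exact command this page originally showed), the helper below reads a device with dd, compresses the stream with gzip and writes it to a file on another machine over ssh. The function name image_to_remote, the user@host argument and the file names are placeholders for illustration, and the 64k block size is only an assumption; conv=sync,noerror is the dd option discussed further down this page.

```shell
# image_to_remote SRC USER@HOST REMOTE_FILE
# Hypothetical helper: stream a gzip-compressed image of SRC to REMOTE_FILE
# on another machine. SRC should be unmounted or mounted read-only.
image_to_remote() {
    dd if="$1" bs=64k conv=sync,noerror |
        gzip -c |
        ssh "$2" "dd of='$3' bs=64k"
}

# Example (placeholder names):
#   image_to_remote /dev/hda1 user@backuphost hda1.img.gz
# To write to a local file instead, drop the ssh stage:
#   dd if=/dev/hda1 bs=64k conv=sync,noerror | gzip -c > hda1.img.gz
```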
In the above example the output of dd is piped through gzip to compress it. We then pipe the compressed data stream over an ssh connection to another linux machine (which may also be running Knoppix - see Knoppix notes below). If you wanted to write straight to a local file, you could either just add
Continuing with our explanation, the
Note that, as long as it's not compressed, you should be able to mount a file containing a single partition's image using a loopback device in linux. (With a little more jiggery-pokery to find the correct offset, you can also mount partitions within a whole-disk image; see the link at the end of this page.)
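The offset in question is just the partition's starting sector multiplied by the sector size. A minimal sketch, assuming 512-byte sectors and a start sector read off a partition listing in sector units; the function name partition_offset, the image names and the mount point are made up for the example:

```shell
# partition_offset START_SECTOR [SECTOR_SIZE]
# Hypothetical helper: print the byte offset of a partition inside a
# whole-disk image. SECTOR_SIZE defaults to 512 bytes.
partition_offset() {
    echo $(( $1 * ${2:-512} ))
}

# Example (run as root; names are placeholders):
#   mount -o loop,ro hda1.img /mnt/image                  # single-partition image
#   mount -o loop,ro,offset="$(partition_offset 63)" hda.img /mnt/image
```

For a partition starting at sector 63, partition_offset 63 prints 32256.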
What to do if the disk is damaged and dd takes forever?
Tobias Wolf pointed out "dd_rescue, which deals a lot better with bad blocks than plain dd. A useful helper script to dd_rescue is dd_rhelp, which postpones re-reading bad blocks to the very end because it can be a struggle to retrieve something from them".
Steve Holmes reports that dd with conv=sync,noerror doesn't correctly image disks with LVM2 Logical Volumes. I haven't investigated this. He also points out GNU ddrescue (not the same as dd_rescue mentioned above) which looks useful. According to Steve, ddrescue works fine with LVM2, and some people seem to suggest it's generally superior to dd_rescue.
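For what it's worth, a common way of driving GNU ddrescue is in two passes that share a mapfile, roughly as sketched below. This is an untested illustration rather than a recipe from the people quoted above; the function name rescue_image and the file names are invented, and the retry count of 3 is arbitrary.

```shell
# rescue_image SRC IMAGE MAPFILE
# Hypothetical helper around GNU ddrescue (not dd_rescue).
rescue_image() {
    # Pass 1: copy everything that reads cleanly and skip the slow
    # scraping of bad areas (-n).
    ddrescue -n "$1" "$2" "$3" || return 1
    # Pass 2: return to the bad areas, retrying them up to 3 times (-r3).
    # The mapfile records what has already been recovered, so this pass
    # (or a re-run after an interruption) does not start from scratch.
    ddrescue -r3 "$1" "$2" "$3"
}

# Example (placeholder names):
#   rescue_image /dev/hda1 hda1.img hda1.map
```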
Restoring partitions
The restore procedure is fairly similar. For example, on the machine with the image on it, you might do something like:
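A restore along these lines could be sketched as follows, assuming the image was compressed with gzip and the target machine is reachable over ssh as root. The function name restore_to_remote and the host name are placeholders, not something taken from the original page.

```shell
# restore_to_remote IMAGE_GZ USER@HOST TARGET
# Hypothetical helper: decompress IMAGE_GZ locally and write it onto TARGET
# (e.g. /dev/hda1) on another machine. This overwrites TARGET completely.
restore_to_remote() {
    gunzip -c "$1" | ssh "$2" "dd of='$3' bs=64k"
}

# Example (placeholder names):
#   restore_to_remote hda1.img.gz root@targethost /dev/hda1
```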
The partition needs to already exist before you do this, and needs to be large enough to take all the data. If it's too big, that doesn't matter, you'll just be wasting space at the end. You should then be able to grow the filesystem to fill that extra space. For ext2 filesystems, try using the ext2resize tool. You may also be able to persuade the partition editing tool parted to do this, since it can handle resizing most filesystems.
Copying an entire hard disk
You can simply use "/dev/hda" as the source (target) to back up (restore) an entire hard disk image, including partition table, MBR and all partitions. This will certainly work if the hard disk being written to is identical to the one the image was made from. I think it will generally also work in other situations as long as the destination disk is larger than the source. I'm not 100% sure that this is always true - partition table entries do contain information in C/H/S (i.e. disk geometry dependent) units, which may be used by the boot loader even if the OS uses LBA. (Links to some gory details about partition table formats are at the end of this page).
If you do make an image of a whole disk, I strongly recommend that you also store extra information about the drive geometry which is necessary in order to interpret the partition table stored within the image, should you need to do that. The most important thing is the cylinder size. Best thing is just to grab a copy of the information fdisk can tell us:
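One simple way to capture that information is to redirect the output of fdisk -l into a text file kept alongside the image. A minimal sketch; the function name save_geometry and the output file name are made up:

```shell
# save_geometry DISK OUTFILE
# Hypothetical helper: store fdisk's listing of DISK (geometry and
# partition table) in OUTFILE. Needs root to read a real device.
save_geometry() {
    fdisk -l "$1" > "$2"
}

# Example (placeholder names):
#   save_geometry /dev/hda hda-fdisk.txt
```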
Knoppix tips
To become root in knoppix, just use
If you are using Knoppix as a destination machine in one of these examples, you'll need to start up its ssh server. A command to do so is on the KDE menus; otherwise
Knoppix tries to acquire an IP address by DHCP, so if you have a DHCP server on your network you can just find out the IP addresses of the machines (e.g. with
Reducing the storage space required
One of the disadvantages of the dd method over software specifically designed for the job such as Ghost or partimage is that dd will store the entire partition, including blocks not currently used to store files, whereas the likes of Ghost understand the filesystem and don't store these unallocated blocks. The overhead isn't too bad as long as you compress the image and the unallocated blocks have low entropy. In general this will not be the case because the empty blocks will contain random junk from bygone files. To rectify this, it's best to blank all unused blocks before making the image. After doing that, the unallocated blocks will contain mostly zeros and will therefore compress down to almost nothing.
A slightly clunky way to do this is to mount the partition, then create a file of zeros which fills the entire disk, then delete it again. e.g:
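Something like the sketch below would do it, assuming the partition is mounted read-write at the given mount point. The function name zero_free_space and the file name zero.fill are invented for the example. Note that it deliberately fills the filesystem, so don't run it on a partition that something else is writing to.

```shell
# zero_free_space MOUNTPOINT
# Hypothetical helper: overwrite the free space of the filesystem mounted
# at MOUNTPOINT with zeros, then release it again.
zero_free_space() {
    # dd is expected to stop with "No space left on device"; that is the
    # whole point, so its failure is ignored.
    dd if=/dev/zero of="$1/zero.fill" bs=1M 2>/dev/null || true
    sync                    # make sure the zeros actually reach the disk
    rm -f "$1/zero.fill"
}

# Example (placeholder names):
#   mount /dev/hda1 /mnt/hda1 && zero_free_space /mnt/hda1
```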
It's worth doing this regardless of the type of filesystem in use. But don't do it if you suspect the filesystem may be corrupt, as you'll lose the ability to 'recover' lost files.
At the time of writing there is no reliable write support in the Linux NTFS drivers, so you won't be able to use this zeroing technique on an NTFS partition. However, there are other ways (see below) of efficiently backing up NTFS partitions.
Backing up the MBR (Master Boot Record)
If you save copies of some or all of your partitions individually, and want to be able to use them to restore a working system, you'll also need to back up and restore the MBR and partition table. (The same caveats apply as discussed above about whether partition tables can be restored onto a disk of a different size).
Backing up the MBR:
This stores the first 512 bytes of the disk (containing the MBR and the primary partition info - i.e. the first four primary entries) into the file "backup-of-hda-mbr" which you can then copy to somewhere safe.
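In dd terms, copying the first 512 bytes means one block of 512 bytes. A minimal sketch, with the function name backup_mbr being a placeholder:

```shell
# backup_mbr DISK OUTFILE
# Hypothetical helper: copy the first 512-byte sector of DISK (boot code
# plus the four primary partition entries) into OUTFILE.
backup_mbr() {
    dd if="$1" of="$2" bs=512 count=1
}

# Example:
#   backup_mbr /dev/hda backup-of-hda-mbr
```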
To restore (be careful - this could destroy your existing partition table and with it access to all data on the disk):
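The reverse operation writes the saved sector back to the start of the disk. A sketch under the same assumptions (the function name restore_mbr is made up). conv=notrunc only matters when the target is a regular file such as a disk image, where it stops dd truncating the file; on a real block device it makes no difference.

```shell
# restore_mbr MBRFILE DISK
# Hypothetical helper: write the 512-byte MBRFILE over the first sector of
# DISK. This replaces the boot code AND the primary partition table.
restore_mbr() {
    dd if="$1" of="$2" bs=512 count=1 conv=notrunc
}

# Example:
#   restore_mbr backup-of-hda-mbr /dev/hda
```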
If you only want to restore the actual MBR code and not the primary partition table entries, just restore the first 446 bytes:
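With dd this can be done by using a single 446-byte block, which leaves the partition entries and signature in the remaining bytes of the sector untouched. A sketch, with restore_boot_code as a hypothetical name:

```shell
# restore_boot_code MBRFILE DISK
# Hypothetical helper: write only the first 446 bytes (the boot code) of
# MBRFILE to DISK, leaving the partition table entries as they are.
restore_boot_code() {
    dd if="$1" of="$2" bs=446 count=1 conv=notrunc
}

# Example:
#   restore_boot_code backup-of-hda-mbr /dev/hda
```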
Backing up the extended partition table
(sfdisk is in the util-linux package. I think it's in Knoppix.)
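A sketch of the backup step, assuming sfdisk's dump mode (-d), which prints every partition, logical ones included, in a text format that sfdisk can read back in later. The function and file names are placeholders:

```shell
# backup_partition_table DISK OUTFILE
# Hypothetical helper: dump the full partition layout of DISK, including
# logical partitions inside the extended partition, as text.
backup_partition_table() {
    sfdisk -d "$1" > "$2"
}

# Example:
#   backup_partition_table /dev/hda hda-partition-table.sf
```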
Restore (ditto the above warning):
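Restoring from such a dump amounts to feeding the saved text back to sfdisk on its standard input. Again only a sketch, with made-up names:

```shell
# restore_partition_table DUMPFILE DISK
# Hypothetical helper: recreate the partition table of DISK from a dump
# made earlier with "sfdisk -d". This overwrites the existing table.
restore_partition_table() {
    sfdisk "$2" < "$1"
}

# Example:
#   restore_partition_table hda-partition-table.sf /dev/hda
```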
I also recommend making a record of the partition table details as displayed in whatever disk partitioning program you like to use, and also as displayed by
Linux swap partitions
Somebody asked me how a linux system knows which partitions to use as swap partitions - this is of course an issue if you're restoring a linux partition to a new disk and need to re-create the swap partition for it. The answer is that the swap partition should be listed in /etc/fstab, e.g.:
# <file system> <mount point> <type> <options> <dump> <pass>
During bootup, one of the init scripts runs
I imagine that if a swap partition listed in /etc/fstab isn't actually found to be of the correct partition type (or isn't initialised), then the system will ignore it. So it should be safe to boot the restored system even if the swap partition information in /etc/fstab is incorrect, and then amend it to point to a swap partition you've created and run
The ntfsclone program from the Linux NTFS tools can efficiently clone and restore NTFS partitions without storing the unused space. You can either create a sparse file (if saving an image to a filesystem which supports sparse files), which will appear to have the size of the whole filesystem but will only take up about as much disk space as the used parts of the NTFS filesystem, or you can create a special image file which only contains the used blocks and which ntfsclone can restore later if needed.
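The two approaches map onto ntfsclone's options roughly as sketched below. The function names and file names are placeholders; compressing the special image through gzip is an extra step added for this example, not something ntfsclone requires.

```shell
# ntfs_sparse_clone PARTITION IMAGE
# Hypothetical helper: clone PARTITION into a sparse file IMAGE
# (IMAGE must not exist yet).
ntfs_sparse_clone() {
    ntfsclone --output "$2" "$1"
}

# ntfs_save_image PARTITION IMAGE_GZ
# Hypothetical helper: save only the used blocks in ntfsclone's special
# image format, compressed with gzip.
ntfs_save_image() {
    ntfsclone --save-image --output - "$1" | gzip -c > "$2"
}

# ntfs_restore_image IMAGE_GZ PARTITION
# Hypothetical helper: restore a special-format image onto PARTITION,
# overwriting whatever is there.
ntfs_restore_image() {
    gunzip -c "$1" | ntfsclone --restore-image --overwrite "$2" -
}

# Example (placeholder names):
#   ntfs_save_image /dev/hda1 hda1.ntfsclone.gz
#   ntfs_restore_image hda1.ntfsclone.gz /dev/hda1
```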
Alternative to ssh
If you're on a trusted network, you might think it silly to go to the trouble of encrypting and decrypting all that data with ssh. Yet these days your machines are probably not configured to allow rsh connections.
File-level backup techniques
Backing up entire partitions or disks is most useful when replicating systems across hard disks or when backing up partitions containing operating systems which are otherwise hard to back up fully. For more every-day backups of your data, a file-level method is more appropriate. Here are some handy tools for doing backups over a network to a remote machine in various clever ways:
Random technical details
Anders Lennartsson sent me this information, which doesn't directly affect any of the above notes, but could be useful to know if trying to reconstruct partition tables by hand or whatnot:
Sectors on hard drives normally have 512 bytes each. To find out how many sectors there are in one partition one can use, for instance, cfdisk. After changing the units by typing a "u", cfdisk reports the number of such sectors. However, for obscure historic reasons, the partition data doesn't start until sector 64. Thus the number of sectors obtained by dd when specifying, for instance, if=/dev/hda1 is the number of sectors seen in cfdisk for this partition, minus 63. I don't remember exactly how (linux) fdisk reports the numbers.
Page about recovering disks containing ext2 filesystems when you don't know the partition table.
Details of how partition tables are stored: here and here.
Program which tries to find partitions even if partition table is damaged: TestDisk
Mounting partitions within an image of a whole disk using loopback devices in linux
Seb Wills, February 2004. Last amended October 2006. Feedback welcome.