Data recovery with ddrescue, testdisk and sleuthkit

From time to time I need to recover data from disks. The cause can be a broken flash or hard disk as well as accidentally deleted files. Fortunately, this doesn't happen too often, which on the downside means that I usually don't remember the details of best practice.

Now that a good friend has asked me to recover very important data from a broken flash disk, I'm taking the opportunity to write down what I did, so that hopefully I don't need to read the same docs again next time :)

Disclaimer: I didn't take the time to read through the full documentation. This is a brief summary of best practice to the best of my knowledge, not a sophisticated and detailed explanation of data recovery techniques.

Create image with ddrescue

The first and most important rule for recovery tasks: don't work on the original, use a copied image instead. This way you can do whatever you want without risking further data loss.

The perfect tool for this is GNU ddrescue. Unlike dd, it doesn't keep retrying a broken sector that returns I/O errors while copying. Instead, it remembers the broken sector for later and moves on to the next sector. That way, all sectors that can be read without errors are copied first. This is particularly important because every extra attempt to read a broken sector can further damage the source device, causing even more data loss.

In Debian, ddrescue is available in the gddrescue package:

apt-get install gddrescue

Copying the raw disk content to an image with ddrescue is as easy as:

ddrescue /dev/disk disk-backup.img disk.log

Giving a logfile as the third argument has the great advantage that you can interrupt ddrescue at any time and continue the copy process later, possibly with different options.
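
The logfile is plain text, so it can also be used to check how much has been rescued so far. The following is a minimal sketch, assuming the layout described in the ddrescue manual as I understand it: comment lines start with #, the first data line holds the current position and status, and every further line is "pos size status" in hexadecimal, with + for rescued areas and - for bad sectors. The function name mapfile_summary is made up for this example.

```shell
# Sum up the block sizes in a ddrescue logfile by status.
# Assumption: "+" marks rescued areas, "-" marks bad sectors, and every
# other status (?, *, /) marks areas ddrescue has not finished with yet.
mapfile_summary() {
    rescued=0 bad=0 pending=0 seen_status_line=0
    while read -r pos size status; do
        case $pos in
            ''|'#'*) continue ;;   # skip blank lines and comments
        esac
        # The first data line is the current position/status, not a block.
        if [ "$seen_status_line" -eq 0 ]; then
            seen_status_line=1
            continue
        fi
        case $status in
            +) rescued=$(( rescued + $size )) ;;
            -) bad=$(( bad + $size )) ;;
            *) pending=$(( pending + $size )) ;;
        esac
    done < "$1"
    echo "rescued=$rescued bad=$bad pending=$pending"
}
```

Calling mapfile_summary disk.log between runs prints the three byte counts. A bad or pending count above zero means that another ddrescue pass may still recover more data.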

In case of very large disks where only the first part was in use, it might be useful to start by copying the beginning only:

ddrescue -i0 -s20MiB /dev/disk disk-backup.img disk.log

If errors remain after the first run, start ddrescue again with direct disc access (-d) and tell it to retry bad sectors three times (-r3):

ddrescue -d -r3 /dev/disk disk-backup.img disk.log

If some sectors are still missing afterwards, it might help to run ddrescue with infinite retries for some time (e.g. one night):

ddrescue -d -r-1 /dev/disk disk-backup.img disk.log

Inspect the image

Now that you have an image of the raw disk, you can take a first look at what it contains. If ddrescue was able to recover all sectors, chances are high that no further magic is required and all data is there.

If the raw disk contains (or used to contain) a partition table, take a first look with mmls from sleuthkit:

mmls disk-backup.img

If the partition table is intact, you can try to create device maps with kpartx after setting up a loop device for the image file:

losetup /dev/loop0 disk-backup.img
kpartx -a /dev/loop0

If kpartx finds partitions, they will be made available at /dev/mapper/loop0p1, /dev/mapper/loop0p2 and so on.

Search for filesystems on the partitions with fsstat from sleuthkit on the partition device map:

fsstat /dev/mapper/loop0p1

Or directly on the image file with the offset discovered by mmls earlier. This also might work in case of

fsstat -o 8064 disk-backup.img

The offset obviously is not needed if the image contains a partition dump (without partition table):

fsstat disk-backup.img

If a filesystem is found, simply try to mount it:

mount -t <fstype> -o ro /dev/mapper/loop0p1 /mnt

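or, with the offset converted to bytes for losetup (8064 sectors * 512 bytes = 4128768 bytes, assuming 512-byte sectors):

losetup -o 4128768 /dev/loop1 disk-backup.img
mount -t <fstype> -o ro /dev/loop1 /mnt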
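
The conversion from the sector offsets printed by mmls to the byte offsets that losetup -o expects can also be scripted. The following is a minimal sketch that reads mmls output on standard input. It assumes the usual mmls table layout (slot in the second column, start sector in the third, and a "Units are in 512-byte sectors" header line). The function name mmls_byte_offsets is made up for this example.

```shell
# Print "<slot> <byte offset>" for every allocated partition in mmls output.
# Assumption: allocated partitions have a numeric slot such as "000:000"
# (DOS) or "000" (GPT); "Meta" and "-------" rows are skipped.
mmls_byte_offsets() {
    awk '
        # Pick up the sector size from the header, e.g. "512-byte".
        /^Units are in/ { split($4, parts, "-"); sector_size = parts[1] + 0 }
        $2 ~ /^[0-9]+(:[0-9]+)?$/ {
            if (sector_size == 0) sector_size = 512   # fallback assumption
            printf "%s %.0f\n", $2, $3 * sector_size
        }
    '
}
```

It would be used as mmls disk-backup.img | mmls_byte_offsets, and the second column of each output line is the value to pass to losetup -o.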

Recover partition table

If the partition table is broken, try to recover it with testdisk. But first, create a second copy of the image, as you will alter it now:

ddrescue disk-backup.img disk-backup2.img
testdisk disk-backup2.img

In testdisk, select a medium (e.g. Disk disk-backup2.img) and proceed, then select the partition table type (usually Intel or EFI GPT) and choose analyze -> quick search. If partitions are found, select one or more and write the partition structure to disk.

Recover files

Finally, let's try to recover the actual files from the image.

testdisk

If the partition table recovery was successful, try to undelete files from within testdisk. Go back to the main menu and select advanced -> undelete.

photorec

Another option is to use the photorec tool that comes with testdisk. It searches the image for known file structures directly, ignoring possible filesystems:

photorec disk-backup.img

You have to select either a particular partition or the whole disk, a filesystem type (ext2/ext3 vs. other) and a destination for the recovered files.

Last time, photorec was my last resort, as the FAT32 filesystem was so damaged that testdisk detected only an empty filesystem.
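
photorec writes its results into numbered recup_dir.N directories below the chosen destination, with generated file names, so finding anything afterwards can be tedious. The following is a minimal sketch that copies the recovered files into one directory per file extension. The function name sort_recovered and the "noext" directory for files without an extension are made up for this example.

```shell
# Copy files from photorec's recup_dir.* directories into one
# subdirectory per file extension.
# $1: directory that contains recup_dir.1, recup_dir.2, ...
# $2: output directory (created if missing)
sort_recovered() {
    src=$1 dest=$2
    find "$src" -type f -path '*/recup_dir.*/*' -exec sh -c '
        dest=$1; shift
        for file in "$@"; do
            name=${file##*/}
            case $name in
                *.*) ext=${name##*.} ;;   # text after the last dot
                *)   ext=noext ;;         # hypothetical bucket for the rest
            esac
            mkdir -p "$dest/$ext" && cp -- "$file" "$dest/$ext/"
        done
    ' sh "$dest" {} +
}
```

It would be called as sort_recovered <destination> <sorted-dir>. Files are copied rather than moved, so the original photorec output stays untouched. A file with the same name in two recup_dir directories would be overwritten by the later copy.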

sleuthkit

sleuthkit also ships with tools to undelete files. I tried fls and icat. fls searches the image for remains of the former filesystem and lists the files and directories it finds. icat copies files by their inode number. Last time I tried, fls and icat didn't recover any files that photorec hadn't already found.

Still, for the sake of completeness, I document what I did. First, I invoked fls in order to search for files:

fls -f fat32 -o 8064 -pr disk-backup.img

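Then, I tried to extract one particular file from the list. icat writes the file content to standard output, so it needs to be redirected into a file:

icat -f fat32 -o 8064 disk-backup.img <INODE> > recovered-file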

Finally, I used the recoup.pl script by Dave Henk to batch-recover all discovered files:

wget http://davehenk.googlepages.com/recoup.pl
chmod +x recoup.pl
vim recoup.pl
[...]
my $fullpath="~/recovery/sleuthkit/";
my $FLS="/usr/bin/fls";
my @FLS_OPT=("-f","fat32","-o","8064","-pr","-m $fullpath","-s 0");
my $FLS_IMG="~/recovery/disk-image.img";
my $ICAT_LOG="~/recovery/icat.log";
my $ICAT="/usr/bin/icat";
my @ICAT_OPT=("-f","fat32","-o","8064");
[...]

Further down, the double quotes around $fullfile needed to be replaced by single quotes (at least in my case, as $fullfile contained a subdir called '$OrphanFiles'):

system("$ICAT @ICAT_OPT $ICAT_IMG $inode > \'$fullfile\' 2>> $ICAT_LOG") if ($inode != 0);
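
The same batch recovery can also be sketched in plain shell, without the Perl script. This is not a port of recoup.pl, only a minimal sketch of the fls-then-icat loop. It assumes the fls -pr line format "type [*] inode: <tab> path" and only handles entries whose type ends in /r (regular files). The function names fls_regular_files and recover_all are made up for this example.

```shell
# Turn fls -pr output into "inode<TAB>path" lines for regular files.
# Assumption: lines look like "r/r 3:<TAB>NAME" or "r/r * 6:<TAB>DIR/NAME",
# optionally with "(realloc)" after the inode number.
fls_regular_files() {
    tab=$(printf '\t')
    sed -n -E "s/^[a-z-]\/r (\* )?([0-9-]+)(\(realloc\))?:[[:space:]]+(.*)\$/\2${tab}\4/p"
}

# Extract every regular file that fls lists into an output directory.
# $1: image file, $2: partition offset in sectors, $3: output directory
recover_all() {
    image=$1 offset=$2 outdir=$3
    fls -f fat32 -o "$offset" -pr "$image" | fls_regular_files |
    while IFS="$(printf '\t')" read -r inode path; do
        # Recreate the directory part of the path, then dump the content.
        mkdir -p "$outdir/$(dirname "$path")"
        icat -f fat32 -o "$offset" "$image" "$inode" > "$outdir/$path"
    done
}
```

It would be called as recover_all disk-backup.img 8064 ~/recovery/sleuthkit. Because every path is quoted, directory names such as $OrphanFiles need no special treatment here.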

That's it for now. Feel free to comment with suggestions on how to further improve the process of recovering data from broken disks.