How to recover from a corrupt Deja Dup backup, and set up an alternative

I recently started using the automated backup system Deja Dup, which makes it very easy to set up automated, encrypted, incremental backups using the Duplicity back-end.
This package has been installed by default on Ubuntu since version 11.10, so I expected it to be thoroughly tested and reliable by now. Unfortunately my backups became corrupted, which meant that some files were lost and others were quite difficult to restore. This article explains how I recovered my files, then set up a faster, more reliable alternative that's still encrypted.

I used Deja Dup for a few days. It appeared to be working normally and claimed to have made several backups for me successfully.
I then formatted my hard drive to do a fresh install of my OS and went to restore my data with Deja Dup, but it died near the start of the restoration process, giving this error:


invalid data - SHA1 hash mismatch for file:
duplicity-full.20140508T105537Z.vol2.difftar.gz
Calculated hash: 8ae69af39a566823309fae86142ae3a2af16358d
Manifest hash: 6a332f406b0842f229e2122921c0e4c97c4f76bd

I tried running the process again several times, and tried the three different backup snapshots that were available, but the same problem kept occurring.
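To check a volume by hand, the hash comparison in the error message can be reproduced with sha1sum. This is only a rough sketch: verify_volume is a helper name I made up, and it assumes the manifest hash is the SHA1 of the volume file exactly as it is stored in the backup folder.

```shell
#!/bin/sh
# Sketch: compare a backup volume's SHA1 with the hash recorded in the manifest.
# verify_volume is a hypothetical helper, not part of Duplicity or Deja Dup.
verify_volume() {
    file=$1
    expected=$2
    # sha1sum prints "<hash>  <filename>"; keep only the hash.
    actual=$(sha1sum "$file" | cut -d' ' -f1)
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file (calculated $actual, manifest $expected)"
        return 1
    fi
}

# Usage, with the volume name and manifest hash from the error above:
# verify_volume duplicity-full.20140508T105537Z.vol2.difftar.gz 6a332f406b0842f229e2122921c0e4c97c4f76bd
```

A mismatch here confirms that the file on disk really differs from what was recorded at backup time, rather than Deja Dup misreporting it.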
I was starting to panic by this stage, so I looked at the documentation to see if I could find out more about what was going wrong, and whether the command line would give me more control over what Duplicity was doing. I recommend reading the documentation listed at the bottom of the page if you have any doubts about what I'm doing here.

I used this command to get a list of backups, pointing it at my backup folder: duplicity collection-status file:///home/ben/Desktop/backup/

Next, I attempted to restore a whole backup, hoping that the output would show where the corrupted files were.
To do this I chose which backup I wanted to work with by specifying the backup time.
duplicity collection-status (above) lists a time-stamp for each backup like this: Mon May 26 02:57:20 2014, but the documentation states that the commands need the timestamp converted into this format: 2014-5-26T02:57:20 (yyyy-m-ddThh:mm:ss). In the commands below I also appended my timezone offset (+01:00).
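If you don't want to convert the timestamp by hand, something like the following should do it. This is a sketch that assumes GNU date (for the -d option), and to_duplicity_time is just a name I picked. It prints a zero-padded month and day, which matches the examples in the Duplicity documentation.

```shell
#!/bin/sh
# Sketch: convert a timestamp as printed by collection-status into the
# yyyy-mm-ddThh:mm:ss form used with --time.
# Assumes GNU date; to_duplicity_time is a hypothetical helper name.
to_duplicity_time() {
    date -d "$1" +%Y-%m-%dT%H:%M:%S
}

# Usage:
# to_duplicity_time "Thu May 8 10:55:37 2014"
```

The timezone offset is not added here, so append it yourself if you need one.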

To attempt to restore the whole backup to test_restore/ on my desktop I ran:
duplicity restore --time "2014-5-26T02:57:20+01:00" file:///home/ben/Desktop/backup/ /home/ben/Desktop/test_restore

I then needed a list of the files in this backup so that I could avoid the corrupted ones while restoring.
This command outputs a list of every file in the backup, which can be very long, so I had it write to a text file instead:
duplicity list-current-files --time "2014-5-26T02:57:20+01:00" file:///home/ben/Desktop/backup/ > backup_files.txt
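The file list can be boiled down to the top-level folders, which is a handy starting point for deciding what to restore. This sketch assumes each file line in backup_files.txt starts with a 24-character timestamp followed by a space and the path (for example "Thu May  8 10:55:37 2014 home/ben/Music"), so check your own file before relying on it. top_level_paths is a made-up helper name.

```shell
#!/bin/sh
# Sketch: list the unique folders three levels deep (e.g. home/ben/Music)
# from the output of list-current-files.
# Assumption: file lines look like "<24-char timestamp> <path>".
top_level_paths() {
    # Keep only lines that start with a weekday, drop the timestamp,
    # trim each path to its first three components, then de-duplicate.
    grep -E '^(Mon|Tue|Wed|Thu|Fri|Sat|Sun) ' "$1" \
        | cut -c26- \
        | cut -d/ -f1-3 \
        | sort -u
}

# Usage:
# top_level_paths backup_files.txt
```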

I made a basic shell script to restore each folder in turn, but avoid touching the corrupt files. The corrupt files in my backup happened to be in /home/ben/Downloads/torrents/ so I had to specify each non-corrupt file/folder within torrents/ in turn. All of the other folders had no corrupt files so I could just specify the top-level folders and Duplicity restored all of their contents recursively without problems.


#!/bin/sh

export PASSPHRASE='my-super-secret-backup-passphrase'

duplicity -v5 --file-to-restore home/ben/Downloads/torrents/games file:///home/ben/Desktop/backup/ /home/ben/Desktop/restore/Downloads/torrents/games
duplicity -v5 --file-to-restore home/ben/Downloads/torrents/xkcd-volume0-high.pdf file:///home/ben/Desktop/backup/ /home/ben/Desktop/restore/Downloads/torrents/xkcd-volume0-high.pdf
duplicity -v5 --file-to-restore home/ben/Fonts file:///home/ben/Desktop/backup/ /home/ben/Desktop/restore/Fonts
duplicity -v5 --file-to-restore home/ben/games file:///home/ben/Desktop/backup/ /home/ben/Desktop/restore/games
duplicity -v5 --file-to-restore home/ben/Kitson file:///home/ben/Desktop/backup/ /home/ben/Desktop/restore/Kitson
duplicity -v5 --file-to-restore home/ben/Music file:///home/ben/Desktop/backup/ /home/ben/Desktop/restore/Music
duplicity -v5 --file-to-restore home/ben/Pictures file:///home/ben/Desktop/backup/ /home/ben/Desktop/restore/Pictures
duplicity -v5 --file-to-restore home/ben/scripts file:///home/ben/Desktop/backup/ /home/ben/Desktop/restore/scripts
duplicity -v5 --file-to-restore home/ben/Videos file:///home/ben/Desktop/backup/ /home/ben/Desktop/restore/Videos
duplicity -v5 --file-to-restore home/ben/VirtualBox file:///home/ben/Desktop/backup/ /home/ben/Desktop/restore/VirtualBox

unset PASSPHRASE

With any luck all of the specified files will be recovered now.
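The same restore can be written as a loop, which is easier to edit when the list of folders is long. This is a sketch rather than the script I actually ran: restore_paths and the DRY_RUN switch are my own additions, while the backup and restore locations are the ones used above. A failed restore is reported and the loop carries on with the next path.

```shell
#!/bin/sh
# Sketch: restore a list of paths in a loop instead of one line per path.
# restore_paths and DRY_RUN are hypothetical; the paths mirror the script above.
BACKUP_URL='file:///home/ben/Desktop/backup/'
RESTORE_DIR='/home/ben/Desktop/restore'

restore_paths() {
    # Each argument is a path inside the backup, relative to home/ben.
    for path in "$@"; do
        # With DRY_RUN set to a non-empty value the command is printed, not run.
        ${DRY_RUN:+echo} duplicity -v5 --file-to-restore "home/ben/$path" \
            "$BACKUP_URL" "$RESTORE_DIR/$path" \
            || echo "FAILED: $path" >&2
    done
}

# Usage (export PASSPHRASE first, as in the script above):
# restore_paths Downloads/torrents/games Fonts games Music Pictures
```

Running it once with DRY_RUN=1 shows exactly which commands would be executed before anything is restored.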

Setting up an alternative backup solution

I now use an external hard drive with an encrypted Ext4 partition, then use a simple shell script that runs rsync to update my backups. I haven't got this automated as I don't have the external drive plugged in much, but now it's set up all I have to do is plug in the drive and run ./scripts/rsync-backup.sh in the terminal.

If you have Ubuntu, search for and run the Disks app. If, like me, you have Xubuntu or a different distro that doesn't come with it, install the gnome-disk-utility package.

Be very careful with this tool as you could easily wipe all of your data if you select the wrong disk/partition.

Select the drive/partition that you want to re-format and use as an encrypted backup drive. Click the More actions button, then Format...

Select the option shown - 'Encrypted, compatible with Linux systems (LUKS + Ext4)'

Choose a label for the partition and set an encryption key. Make a note of this password or you won't be able to mount the disk!

When formatting is complete, mount the drive and enter the password you just set. Without this password no-one can access your encrypted drive. Ubuntu will probably offer to remember the password for you in future.

Next, make a simple shell script to run the rsync command for you. Create a plain-text file and make it executable in its file properties.
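If you prefer the terminal to the file properties dialog, the file can be created and made executable like this. It's a small sketch; make_executable_script is a made-up helper and the path in the usage line is only the one I use.

```shell
#!/bin/sh
# Sketch: create an empty script file and mark it as executable.
# make_executable_script is a hypothetical helper name.
make_executable_script() {
    script=$1
    # Create the containing folder if it doesn't exist yet.
    mkdir -p "$(dirname "$script")"
    touch "$script"
    chmod +x "$script"
}

# Usage:
# make_executable_script "$HOME/scripts/rsync-backup.sh"
```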

Then enter the contents of the script, which for me looks like this:


#! /bin/sh
rsync -axh --progress --delete /home/ben/ /media/ben/ben_backup_drive/

Now you can run the script in the terminal by doing something like this: ./scripts/rsync-backup.sh. It will take a long time to copy all of your files the first time, but in future it will be faster as it will only be syncing changes.
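One thing worth guarding against is running the script when the drive isn't plugged in, because rsync would then copy everything into a plain folder on the internal disk. A possible safeguard, sketched here with a made-up safe_backup function, is to check the destination with mountpoint (part of util-linux) before calling rsync.

```shell
#!/bin/sh
# Sketch: refuse to run rsync unless the backup drive is actually mounted.
# safe_backup is a hypothetical wrapper around the rsync command above.
safe_backup() {
    src=$1
    dest=$2
    # mountpoint -q exits non-zero if $dest is not a mounted filesystem.
    if ! mountpoint -q "$dest"; then
        echo "Backup drive is not mounted at $dest, aborting." >&2
        return 1
    fi
    rsync -axh --progress --delete "$src" "$dest"
}

# Usage:
# safe_backup /home/ben/ /media/ben/ben_backup_drive/
```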

Documentation
https://help.ubuntu.com/community/DuplicityBackupHowto
http://duplicity.nongnu.org/duplicity.1.html

Comments

Clément - Sat, 01/07/2017 - 23:45


Hello
I felt better reading your solution to resolve this Deja-dup's failure. Well, as I am not good for programing, I would like to know if you could help restoring my HD files...

I did last backup, well conclude, I reinstalled Ubuntu 16.04, but finally I just could restore a little part of my files...I have the same description as you...my restoring stop with the [...]vol59[...] file :

invalid data - SHA1 hash mismatch for file:
duplicity-full.[...].difftar.gz
Calculated hash: 8ae69af39a566823309fae86142ae3a2af16358d
Manifest hash: 6a332f406b0842f229e2122921c0e4c97c4f76bd

I have no ideia what to do to save my data...precious old pictures and others important files...

Regards,

Clément (clement.vialle@gmail.com)

PS: my english is not good (I am french living in Brazil), but I think I can understand enough...

Hi Clément! I haven't personally used Déjà Dup, but I think that you need to list the files in the backup using the command duplicity list-current-files --time "2014-5-26T02:57:20+01:00" > backup_files.txt, where 2014-5-26T02:57:20+01:00 is the timestamp of the backup that you want to restore. After that, you can try to restore the most important files. If you are lucky, the corrupt files will not be among the ones that you want to recover.

I'm sorry, but we don't offer free email support. If you need further help, please contact us for a quote. Good luck!
