Introduction

A very good backup tool that supports deduplication and encryption is restic. It is open source, written in Go, and available for all major platforms. The source code is on GitHub: github.com/restic/restic.

Before restic, I used rsync for my local backups. It supports a kind of deduplication through hard links: you can create a backup structure in which every backup looks like a full copy of the data, but each backup run only adds the modified files. Unchanged files are hard-linked to the previous backup. The option for this is --link-dest.
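As a rough sketch of such an rsync run (the helper name and the directory layout are assumptions for illustration, not my actual script), a single backup with --link-dest could look like this:

```shell
# rsync_snapshot SOURCE BACKUP_ROOT PREVIOUS NEW  (hypothetical helper)
# Copies SOURCE into BACKUP_ROOT/NEW. Files that are unchanged compared
# to BACKUP_ROOT/PREVIOUS are created as hard links to the previous
# backup, so they take up no additional space.
rsync_snapshot() {
    src=$1 root=$2 prev=$3 new=$4
    rsync -a \
        --link-dest="$root/$prev" \
        "$src/" "$root/$new/"
}
```

Calling rsync_snapshot /home/jondoe /media/jondoe/backups 2012-01-01 2012-02-01 would then create the February backup on top of the January one.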

I also managed to run this under Windows, see RSync Backup Script for Windows. On my root server, I used dirvish, which internally uses rsync to perform the backups.

This means I have many old backups, and I want to “import” them into a restic repository to take advantage of deduplication. That’s the goal of this blog post.

Short restic intro

Restic manages the backups in a so-called “restic repository”. This is a directory with a bunch of files that store the actual data and the metadata. The repository can be stored on a local (or external) disk, but also on services like Amazon’s S3 or similar. Thanks to the built-in encryption, the backups can be stored anywhere without worrying that someone will take a peek into them. There are many supported backends, see the restic documentation.

A restic repository can store backups from multiple sources, such as different computers. Deduplication works across all data. That means that if you back up, for example, the root directory of several Linux servers into the same repository, they will benefit strongly from deduplication.

A single backup is called a snapshot. The basic metadata of a snapshot are the timestamp of when the snapshot was created and a hostname. Both can be set manually (with --time and --host), which comes in handy when importing old backups. They default to the current time and the hostname of the machine restic is running on.

There are other commands available, e.g. for removing old snapshots (“forget”).

Another handy feature is the “mount” command, which FUSE-mounts the complete repository so that you can easily browse the backups and restore single files.
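To give an impression of these two commands, here is a minimal sketch. The function names, the retention policy of twelve monthly snapshots and the default repository path are assumptions chosen for illustration:

```shell
# Repository location; override by setting REPO before calling the helpers.
REPO=${REPO:-/media/jondoe/newbackup/repo}

# For the last 12 months that have snapshots, keep the most recent
# snapshot of each month, forget the rest and remove unreferenced data.
prune_old_snapshots() {
    restic --repo "$REPO" forget --keep-monthly 12 --prune
}

# mount_snapshots MOUNTPOINT
# Mounts the repository via FUSE. restic stays in the foreground
# until it is interrupted (Ctrl-C), which also unmounts it.
mount_snapshots() {
    mkdir -p "$1" && restic --repo "$REPO" mount "$1"
}
```

After mount_snapshots /tmp/restic-mount, single files can be copied out of the mounted snapshots with ordinary tools like cp.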

Another tool: proot

When you create a new snapshot with restic, you need to provide at least one pathname to back up, such as /home/jondoe. My existing rsync backups, however, live on my external drive, e.g. under /media/jondoe/backups/2012-01-01. If I created a new restic snapshot from this path, all pathnames in it would start with “/media/jondoe/backups/2012-01-01”, which I don’t want. I want my old backups to look as if I had used restic back then, so the pathnames in the restic snapshot should start with “/home/jondoe”.

In other words, I somehow need to make /media/jondoe/backups/2012-01-01 available at /home/jondoe for restic during the backup process. I could use chroot for that, but that would require root privileges. While searching for alternatives, I stumbled across proot.

proot is chroot, but implemented with just user privileges. You can specify a new root directory and then let other programs (such as restic) run in this environment and restic will only see this new root directory. Instead of changing the root directory, single directories can be “bind-mounted”. Usually a mount --bind also needs root privileges, but with proot, it works for normal users.

With proot -b /media/jondoe/backups/2012-01-01:/home/jondoe bash we can start a shell in which the directory /home/jondoe actually points to the backup on the external drive. Instead of the shell, we can start restic.

It’s interesting how this is done technically: proot intercepts all system calls that deal with pathnames (such as “stat”), rewrites the pathname, and then performs the original system call with the modified pathname. It uses ptrace(2) to intercept the system calls and change their arguments.
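Before running restic, it can be reassuring to check what a program would actually see under such a binding. A minimal sketch, with a hypothetical helper name:

```shell
# preview_bind BACKUP_DIR TARGET  (hypothetical helper)
# Lists TARGET the way a program started under proot would see it,
# i.e. with the contents of BACKUP_DIR.
preview_bind() {
    proot -b "$1:$2" ls -la "$2"
}
```

For example, preview_bind /media/jondoe/backups/2012-01-01 /home/jondoe should list the files of the old backup and not those of the current home directory.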

Combination

Putting this all together, we can “import” an old backup into a restic repository with the following command line:

jondoe@hostname:~$ proot \
  -b /media/jondoe/backups/2012-01-01:/home/jondoe \
  restic --no-cache --repo /media/jondoe/newbackup/repo backup \
    --time "2012-01-01 12:00:00" --host jondoe \
    /home/jondoe

Note 1: I used --no-cache here because restic would otherwise create or use a cache directory in your home directory. Since we back up our home directory (/home/jondoe) and have mapped it to the actual (old) backup, the cache would modify the backup drive…

Note 2: I assume here that restic is on the PATH and not inside /home/jondoe. proot tries to execute restic inside the new environment, so if restic was installed manually, e.g. as /home/jondoe/bin/restic, it would no longer be available: we mapped /home/jondoe to the backup, which hides the original directory. You can, however, additionally bind the “current” version of your home directory to another location with -b /home/jondoe:/home/jondoe-current and then call restic by its full path, /home/jondoe-current/bin/restic. Alternatively, you can put the restic binary alongside your new restic repository on your new external drive.
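Spelled out, the variant with the additional bind from Note 2 might look like the following sketch. The function name is hypothetical, and the restic location /home/jondoe/bin/restic is just the example from above:

```shell
# import_home_backup BACKUP_DIR TIMESTAMP  (hypothetical helper)
# BACKUP_DIR replaces /home/jondoe, while the real home directory stays
# reachable as /home/jondoe-current, so that the restic binary installed
# there can still be executed.
import_home_backup() {
    backup_dir=$1 timestamp=$2
    proot \
        -b "$backup_dir:/home/jondoe" \
        -b /home/jondoe:/home/jondoe-current \
        /home/jondoe-current/bin/restic --no-cache \
        --repo /media/jondoe/newbackup/repo backup \
        --time "$timestamp" --host jondoe \
        /home/jondoe
}
```

It would be called as import_home_backup /media/jondoe/backups/2012-01-01 "2012-01-01 12:00:00".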

If you have multiple backup directories, you can convert them in a simple for loop:

for TIMESTAMP in 2012-01-01 2012-02-01 2012-03-01; do
    proot \
        -b "/media/jondoe/backups/${TIMESTAMP}:/home/jondoe" \
        /media/jondoe/newbackup/restic --no-cache --repo /media/jondoe/newbackup/repo backup \
        --time "${TIMESTAMP} 12:00:00" --host jondoe \
        /home/jondoe
done
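With many backups, typing all the timestamps by hand gets tedious. The following sketch collects them from the directory names instead. It assumes that every backup directory is named YYYY-MM-DD and that everything else in the backup folder should be skipped; both function names are made up for this example:

```shell
# list_backup_dates BACKUP_ROOT
# Prints the names of all subdirectories that look like YYYY-MM-DD,
# one per line, in the order the shell expands the glob.
list_backup_dates() {
    for dir in "$1"/*/; do
        [ -d "$dir" ] || continue
        name=$(basename "$dir")
        case $name in
            [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9])
                printf '%s\n' "$name" ;;
        esac
    done
}

# import_all BACKUP_ROOT
# Imports every dated backup as its own snapshot; stops at the first error.
# A for loop is used (not "while read") so that restic keeps the
# original stdin, e.g. for the password prompt.
import_all() {
    root=$1
    for ts in $(list_backup_dates "$root"); do
        proot \
            -b "$root/$ts:/home/jondoe" \
            /media/jondoe/newbackup/restic --no-cache \
            --repo /media/jondoe/newbackup/repo backup \
            --time "$ts 12:00:00" --host jondoe \
            /home/jondoe || return 1
    done
}
```

import_all /media/jondoe/backups would then import all dated directories, one snapshot each.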

If you import a backup of a root directory (e.g. a complete backup of a server), you probably want to bind the root directory. This makes all other directories inaccessible, so you should add a few standard binds, including the location of the restic binary and repository, so that everything still works:

proot \
    -b /media/jondoe/backups/complete-server/2012-01-01:/ \
    -b /proc -b /tmp -b /dev \
    -b /media/jondoe/newbackup \
    /media/jondoe/newbackup/restic --no-cache --repo /media/jondoe/newbackup/repo backup \
    --time "2012-01-01 12:00:00" --host myserver \
    --exclude={/dev,/media,/mnt,/proc,/run,/sys,/tmp,/var/tmp} \
    /

Note: This excludes many directories under the root from the backup, most importantly the whole “/media” directory, so that the backup itself is not backed up.

Conclusion

The combination of the two tools restic and proot makes it possible to convert or import older backups into restic repositories.

You could use this very same technique to copy snapshots from one restic repository to another by using the “mount” command. For example, restic -r /media/jondoe/backups/old-restic-repo mount /mp makes the snapshots available as e.g. /mp/hosts/jondoe/2012-01-01T00:00:00Z. Since proot uses the colon (“:”) as the separator in bindings, you first need to create a symlink (ln -s /mp/hosts/jondoe/2012-01-01T00:00:00Z /oldbackup) and use that (proot -b /oldbackup:/home/jondoe restic ...). However, this reads all data again and is therefore a slow process. You can speed it up a bit with --ignore-inode, which makes the file change detection ignore inode numbers. The inode numbers of a FUSE filesystem like the one “restic mount” provides are not stable and shouldn’t be used here.

The main advantage of this slow way is that the deduplication definitely works, since all data is read again.
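The steps of this mount-based approach can be sketched as follows. The sketch assumes that restic mount for the old repository is already running in another terminal, that NEW_REPO points to the target repository, and it creates the symlink in a temporary directory and not as /oldbackup (so no write access to / is needed). The function name is made up:

```shell
# import_mounted_snapshot SNAPSHOT_DIR TIMESTAMP  (hypothetical helper)
# SNAPSHOT_DIR: e.g. /mp/hosts/jondoe/2012-01-01T00:00:00Z
# TIMESTAMP:    e.g. "2012-01-01 00:00:00"
# Assumes NEW_REPO is set to the target repository.
import_mounted_snapshot() {
    snapshot_dir=$1 timestamp=$2
    # proot splits bindings at ":", so go through a symlink whose
    # path contains no colon.
    tmpdir=$(mktemp -d) || return 1
    link=$tmpdir/oldbackup
    ln -s "$snapshot_dir" "$link" || return 1
    proot -b "$link:/home/jondoe" \
        restic --no-cache --repo "$NEW_REPO" backup \
        --ignore-inode --time "$timestamp" --host jondoe \
        /home/jondoe
    status=$?
    rm -rf "$tmpdir"    # removes only the symlink and its directory
    return "$status"
}
```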

But restic itself also supports copying snapshots from one repo to another with its “copy” command, see Copying snapshots between repositories. For data deduplication to work, the target repository needs to have been created with the same chunking parameters, see Ensuring deduplication for copied snapshots.
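A minimal sketch of this native way, assuming restic 0.14 or later (older versions used different option names for the second repository) and a target repository that does not exist yet. The function name is hypothetical:

```shell
# copy_all_snapshots OLD_REPO NEW_REPO  (hypothetical helper)
# Creates NEW_REPO with the chunker parameters of OLD_REPO, so that
# deduplication works for the copied data, and then copies all snapshots.
copy_all_snapshots() {
    old_repo=$1 new_repo=$2
    restic --repo "$new_repo" init \
        --from-repo "$old_repo" --copy-chunker-params &&
    restic --repo "$new_repo" copy --from-repo "$old_repo"
}
```

For example: copy_all_snapshots /media/jondoe/backups/old-restic-repo /media/jondoe/newbackup/repo.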

The proot version that is included in Debian is very old, see tracker.debian.org/pkg/proot and bug report #1037073. The older version doesn’t support all newer system calls, such as “statx”. So you might need to compile your own version of proot, which is easy to do.