Off-site backups using Duplicacy and Wasabi

A hand holding a cut-out of a cloud up to a cloud in the sky, symbolizing off-site backups.

I recently had my first ever hard drive failure and have since been looking into off-site backups. There are many choices to weigh: what software should I use? What storage provider? This post describes how I set up Duplicacy and Wasabi as my off-site backups solution. Duplicacy is a backup tool written in Go. It supports many back ends including Wasabi, a fast, S3-compatible cloud storage provider. Wasabi offers great pricing that allows you to easily calculate how much it will cost to back up your data. However, how to combine the two isn’t obvious because the documentation is lacking. You’ll leave this post with an understanding of how to set up an off-site backups repository using these two tools.

Duplicacy Background

The first thing that you should understand is that Duplicacy is two separate products. First, there is an open source core and CLI application that you can find on GitHub, licensed as free for personal use with a fee for commercial use. The only documentation you’ll find for that is on the community forum, as they moved all documentation off of GitHub in favor of the forum. Second, there is a commercial product that has a GUI for configuration.

Understandably, the better documentation is for the commercial version instead of the open source version. As such, when you’re looking for documentation on setting up Wasabi and Duplicacy, you’ll often find confusing information that doesn’t map to what you’re doing.

Wasabi Setup

As an aside, Wasabi offers a month-long demo where you can try out their service and upload up to 1TB of data without charge. If you’d like to follow along, feel free to sign up for their demo!

To get started, create a bucket in Wasabi’s console. A bucket has the same semantics as an S3 bucket: it’s basically a folder in which you can place objects (read: files) and to which you can grant access via Wasabi’s Identity and Access Management (IAM) system. For this blog post, I created a blog-demo bucket in the us-west-1 region. I suspended versioning and logging for the bucket.

Having created the bucket, you want to note the following pieces of information:

The path to your bucket (e.g. https://s3.us-west-1.wasabisys.com/blog-demo)
An IAM access key, which you can reach from the IAM panel
The secret key for your IAM access, which is only accessible on the confirmation screen from creating an access key

Installing Duplicacy

Now that we have the bucket set up, we can configure Duplicacy to back up to the bucket. First, you’ll need to install Duplicacy. You can download the binary from GitHub or look to see if your package manager has a package for it. If you’re like me and use Arch Linux, you can install Duplicacy from an Arch User Repository package.

Duplicacy, like many Go programs, uses a command + subcommand pattern. It does not come with man pages, though, so you’re left to work out the various flags and commands from the short descriptions that --help prints.

Initializing the Repository

The first step to set up Duplicacy is to initialize the repository. From within the directory that you want to back up, issue the initialization command:

duplicacy init <repository_name> <remote_path>

There are a few things that you want to consider here. You can pass the -e flag to encrypt the data on the sending side so you aren’t uploading your bare files as chunks in Wasabi. I opted to do so. When picking a repository_name, I find it easiest to name it the same thing you named your bucket. In this example, it’s blog-demo. The remote_path is the URI for your bucket (with an optional subdirectory if you want to roll that way). Note that you’ll want to specify the wasabi:// protocol if you’re using Wasabi! The resulting command looks something like this:

duplicacy init -e \
  blog-demo \
  wasabi://us-west-1@s3.us-west-1.wasabisys.com/blog-demo

When prompted for your “Wasabi key,” enter your access key. For your “Wasabi secret,” enter your access secret. When asked for the “storage password for Wasabi,” generate a password using your password manager. You’ll need to enter it twice. At this point, you will have a Duplicacy repository.
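If you set up more than one repository, it can help to build the remote path from its parts instead of typing it by hand. The following is a minimal sketch, assuming Duplicacy's wasabi://region@endpoint/bucket form and assuming your endpoint follows the s3.<region>.wasabisys.com pattern from the bucket path noted earlier. The wasabi_url helper is hypothetical and not part of Duplicacy.

```shell
#!/bin/sh
# Build a Duplicacy storage URL for a Wasabi bucket.
# Assumption: the endpoint follows the s3.<region>.wasabisys.com pattern;
# check the endpoint shown for your bucket in the Wasabi console.
wasabi_url() {
  region="$1"
  bucket="$2"
  subdir="${3:-}"   # optional subdirectory inside the bucket
  url="wasabi://${region}@s3.${region}.wasabisys.com/${bucket}"
  if [ -n "$subdir" ]; then
    url="${url}/${subdir}"
  fi
  printf '%s\n' "$url"
}

# Prints: wasabi://us-west-1@s3.us-west-1.wasabisys.com/blog-demo
wasabi_url us-west-1 blog-demo
```

With the helper defined, the initialization command could be written as duplicacy init -e blog-demo "$(wasabi_url us-west-1 blog-demo)".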

If you look in your bucket in Wasabi, you will see a config file. This is a binary file that has some configuration for Duplicacy. On your local file system, you will see a .duplicacy folder that looks like the following:

.duplicacy/
  cache/
  preferences

The preferences file is a JSON document of configuration preferences for your repository. Within this file, there is a keys key, which we will come back to later.

Our First Backup

Let’s do our first backup. To start, run the following (note that you will have to enter your three credentials again):

duplicacy backup -stats

I recommend backing up with the -stats flag because it makes it a lot easier to follow what is going on. Once the backup completes, you will see statistics for how much data Duplicacy sent to the server, along with the revision ID and running time.

Congratulations! You’ve successfully uploaded your first backup to Wasabi. You can see statistics about your repository by running the following:

duplicacy list

If you’re curious how Duplicacy chunked and stored your files, you can pass the -files flag to the list command.
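Both commands only work from inside an initialized repository, so a small guard can save you from running them in the wrong directory. This is a sketch only: backup_and_list is a hypothetical helper, and it assumes the duplicacy binary is on your PATH.

```shell
#!/bin/sh
# backup_and_list is a hypothetical helper, not a Duplicacy command.
# It refuses to run outside an initialized repository, then runs the
# two commands from this section in order.
backup_and_list() {
  if [ ! -d .duplicacy ]; then
    echo "no .duplicacy folder in $(pwd); run duplicacy init first" >&2
    return 1
  fi
  duplicacy backup -stats && duplicacy list
}
```

Call backup_and_list from the directory you initialized. It returns a non-zero status without contacting Wasabi if the .duplicacy folder is missing.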

Quality of Life Improvements

After you run some more commands, you will notice that you have to enter your credentials every time you run a command. There are two ways that we can make this less arduous: the preferences file and environment variables.

Preferences

Remember the .duplicacy/preferences file we mentioned when we created the repository? The “keys” key is an object of keys to use when connecting to your repository. By consulting the table of possible values, we see that we can set the following keys:

password is the encryption password used by Duplicacy
wasabi_key is your Wasabi access key
wasabi_secret is your Wasabi access secret

You can store your credentials in this file using the named keys.

Note: This is not a secure way of storing your credentials. The preferences file is a plaintext JSON file so you will be storing your credentials without any encryption. Please be aware of the security characteristics of doing so if you opt for this method.
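If you do go this route, you can at least keep other local users from reading the file. The sketch below tightens the file's permissions. It does not encrypt anything, and lock_down_prefs is a hypothetical helper rather than something Duplicacy provides.

```shell
#!/bin/sh
# Restrict a preferences file to its owner (mode 600).
# lock_down_prefs is a hypothetical helper, not part of Duplicacy.
#
# With credentials stored, the "keys" object would look roughly like:
#   "keys": {
#     "password": "<storage password>",
#     "wasabi_key": "<access key>",
#     "wasabi_secret": "<access secret>"
#   }
lock_down_prefs() {
  prefs="${1:-.duplicacy/preferences}"
  if [ ! -f "$prefs" ]; then
    echo "no preferences file at $prefs" >&2
    return 1
  fi
  chmod 600 "$prefs"
}
```

Run lock_down_prefs from the root of your repository, or pass the path to the preferences file as the first argument.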

Environment variables

Like the preferences file, we can pass in our credentials using specific environment variables for each value. Again, we can consult the table of possible values to see what we need. We find the following:

DUPLICACY_PASSWORD is the encryption password used by Duplicacy
DUPLICACY_WASABI_KEY is your Wasabi access key
DUPLICACY_WASABI_SECRET is your Wasabi access secret

Specifying these environment variables when you run a Duplicacy command will set the right credentials. You can also set them in, for example, Systemd units or various other, slightly more secure storage media.
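One way to do this without typing the values into your shell history is to keep them in a file that only you can read and load that file before running Duplicacy. This is a minimal sketch: load_duplicacy_env and the default file path are hypothetical, while the variable names are the three listed above.

```shell
#!/bin/sh
# Load Duplicacy credentials from a file and export them.
# load_duplicacy_env and the default path are hypothetical choices.
load_duplicacy_env() {
  env_file="${1:-$HOME/.config/duplicacy/blog-demo.env}"
  # Make sure "." treats a bare file name as a path, not a PATH lookup.
  case "$env_file" in
    */*) ;;
    *) env_file="./$env_file" ;;
  esac
  if [ ! -r "$env_file" ]; then
    echo "cannot read $env_file" >&2
    return 1
  fi
  # The file holds plain KEY=value lines, for example:
  #   DUPLICACY_PASSWORD=...
  #   DUPLICACY_WASABI_KEY=...
  #   DUPLICACY_WASABI_SECRET=...
  set -a          # export every variable assigned while sourcing
  . "$env_file"
  set +a
}
```

A backup then becomes load_duplicacy_env && duplicacy backup -stats. A systemd unit can read a similar KEY=value file through its EnvironmentFile= setting, though its quoting rules differ slightly from the shell's.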

What’s next?

For my current off-site backups solution, I manually run backups. Ideally, I want a set-it-and-forget-it backup solution that runs automatically. I’m happy to rely on the de-duplication system within Duplicacy. Coupled with its pruning abilities, I won’t be overpaying for my backups.

To solve this, I am working on a Systemd setup and set of scripts to make this easier. I will post my full off-site backups setup once I finish it.

Due to Wasabi’s billing characteristics, I plan to follow the recommended pruning setup, which is as follows:

# Keep all snapshots younger than 90 days by doing nothing
# Keep 1 snapshot every 7 days for snapshots older than 90 days
$ duplicacy prune -keep 7:90

# Keep 1 snapshot every 30 days for snapshots older than 180 days
$ duplicacy prune -keep 30:180

# Keep no snapshots older than 360 days
$ duplicacy prune -keep 0:360
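To get a feel for what this policy retains, here is a rough count of the snapshots it would leave behind. The one-backup-per-day schedule is my assumption for the sake of the arithmetic, and integer division rounds down, so treat the result as an estimate.

```shell
#!/bin/sh
# Rough count of snapshots kept under the policy above,
# assuming one backup per day (an assumption for illustration).
daily=90                          # every snapshot younger than 90 days
weekly=$(( (180 - 90) / 7 ))      # 1 per 7 days between 90 and 180 days old
monthly=$(( (360 - 180) / 30 ))   # 1 per 30 days between 180 and 360 days old
total=$(( daily + weekly + monthly ))

# Prints: daily=90 weekly=12 monthly=6 total=108
echo "daily=$daily weekly=$weekly monthly=$monthly total=$total"
```

As far as I can tell from the prune documentation, the three commands can also be combined into a single call with the -keep options ordered from oldest to newest: duplicacy prune -keep 0:360 -keep 30:180 -keep 7:90.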

What do you use for off-site backups? Have you used Duplicacy, Duplicati, or any other recent backup system?