Backups and things

Backups are important. They might not save your life, but they will keep your data safe.

Posted by Eddo on October 27, 2018

My data is important, so I try to have a simple workflow that keeps it safe in a lot of situations. The scenarios I’d like my archive and backup workflow to cover include my laptop and hard drives being stolen or destroyed, and my apartment being destroyed.

I am a programmer who uses version control for every project I’m working on, and each project is pushed to a remote GitHub repository, so I know those projects are safe. But what about my many photographs and other data? I have an archive on one 5TB bus-powered drive, which is sufficient capacity for the coming one or two years. Maybe three, depending on how strict I am about culling photos and cleaning up data (i.e. digital clutter). Even though this drive usually doesn’t leave my apartment, it is still encrypted to protect against theft. Using encrypted APFS might also help me with data integrity (some more reading about this).

All my data is currently on encrypted APFS drives, to protect it in case a drive is stolen. The device is then simply lost, and I don’t have to worry about my data being accessible to others.

My archive is backed up to a second 5TB bus-powered drive, which usually doesn’t live in the same physical location as my primary archive. I only bring it home on certain occasions (at least once every month). I back up the data using rsync scripts. This might change, as I’ve read that there are better applications for it that include data consistency checks (like ChronoSync).

The archive is also backed up to an Amazon S3 Glacier bucket in another country, via the macOS tool Arq. Why use the AWS Glacier setup? I don’t expect to need the data there very quickly, and for a lot of storage this is a cost-effective solution 💰 for me. I know there are other cloud-backup providers that have a one-stop solution, though they will remove data if a drive isn’t plugged in for a period of about 30 days. I don’t want that to happen when I’m on a month-long trip. Data is encrypted locally before it is transmitted to the S3 bucket, so that neither the data in transit nor the remote copy is usable by someone else.

This setup gives me three copies of my data (i.e. the original and two backups), on two separate media (i.e. bus-powered HDD and cloud storage), of which at least one is remote. This adheres to the proven 3-2-1 principle of backups.
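The 3-2-1 count above can be written down as a small check. This is only a sketch: the `BackupCopy` class, the `satisfies_3_2_1` function and the media labels `"hdd"` and `"cloud"` are names made up for illustration, not part of any tool mentioned here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    """One copy of the data (hypothetical model, for illustration only)."""
    name: str
    medium: str    # free-form label, e.g. "hdd" or "cloud"
    offsite: bool  # True if the copy normally lives away from home


def satisfies_3_2_1(copies):
    """At least 3 copies, on at least 2 kinds of media, at least 1 off-site."""
    media = {copy.medium for copy in copies}
    return (
        len(copies) >= 3
        and len(media) >= 2
        and any(copy.offsite for copy in copies)
    )


# The setup described above, expressed in this model.
copies = [
    BackupCopy("primary archive drive", "hdd", offsite=False),
    BackupCopy("second backup drive", "hdd", offsite=True),
    BackupCopy("S3 Glacier bucket via Arq", "cloud", offsite=True),
]
print(satisfies_3_2_1(copies))
```

Dropping the cloud copy from the list makes the check fail twice over: only two copies remain, and both sit on the same kind of medium.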

I like the portability of bus-powered drives, as it gives me the freedom to take one with me or use it on the sofa. I’ve always found an additional power plug a hassle, as I currently don’t have a fixed desk at home. The drawback is storage capacity and performance, though I don’t care too much about the latter as I don’t actively work off those drives.

I also back up my laptop, in a slightly different workflow. Every week I make a bootable clone of my laptop with SuperDuper to a backup drive (the same one that contains the archive backup). I also used Apple’s Time Machine as a backup, for when I might want to retrieve a file I had accidentally deleted. In the past years I’ve never had a need for it though, so I ditched it. My personal data (my /Users/eddo folder in macOS) is also backed up to an Amazon S3 Glacier bucket using Arq.

When I’m on a photography assignment with a laptop, I always bring an external 1TB Samsung T3 SSD with me as data backup for the assignment. That drive is small enough to carry on me: the SSD goes in my pocket, and the laptop in the backpack. If my pack gets lost or stolen, I at least have the data with me.

I never delete data from my memory cards until I’m back home and the data is duplicated to at least the regular off-site backup drive.
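That rule could be automated with a check like the one below before a card is erased. It is a minimal sketch, not something this workflow actually runs: the function names are hypothetical, and it assumes the backup keeps the same folder layout as the card.

```python
import filecmp
from pathlib import Path


def missing_or_different(card_dir, backup_dir):
    """Return relative paths of card files that are absent from, or differ in, the backup."""
    card, backup = Path(card_dir), Path(backup_dir)
    problems = []
    for src in sorted(card.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(card)
        dst = backup / rel
        # shallow=False compares file contents instead of trusting size and mtime.
        if not dst.is_file() or not filecmp.cmp(src, dst, shallow=False):
            problems.append(rel)
    return problems


def safe_to_erase(card_dir, backup_dirs):
    """True only if there is at least one backup and every backup holds every card file."""
    backups = list(backup_dirs)
    return bool(backups) and all(
        not missing_or_different(card_dir, backup) for backup in backups
    )
```

Calling `safe_to_erase` with the card's mount point and a list of backup folders returns `False` as soon as a single file is missing or differs in any of them.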

I used to travel without a laptop, and didn’t have a good way to back up photos then. I recently backed the Gnarbox 2.0 SSD Kickstarter campaign, so I can use that as a backup solution. It is a small device that lets me plug in a memory card, and automatically copies the data. I’ll copy the data from a memory card to the Gnarbox at the end of the day. Another improvement I’ll make is that my next camera will have two card slots, so I can let the camera write to both cards. It hasn’t happened yet that a memory card failed, though when I’m getting paid for creating photographs, I’d better be able to deliver. In that situation I will also have three copies of my data, on two different media (memory cards and Gnarbox), of which one is always with me and the other inside my photo pack.

Recovery

Having your data backed up is one thing; being able to recover it from your backups is another. Your backups might as well not exist if you can’t recover your data from them. In order to test my backups, I’ll boot the bootable backup of my laptop every so often to see if that runs. At times I’ll also randomly check files in the archive backup. I haven’t yet tried to recover data from my Amazon S3 Glacier buckets, though I should actually try that. What if my apartment is destroyed while I am backing up the data to the off-site drive? I’d like to know that I can recover from that. Is that a likely scenario? Probably not, though you never know 🤷‍♂️.
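The random check of files in the archive backup could be scripted roughly like this. It is a sketch under assumptions: `spot_check` and `sha256_of` are hypothetical helper names, both drives are assumed to be mounted, and the backup is assumed to mirror the archive's folder structure. It only covers the local drives, not the Glacier copy.

```python
import hashlib
import random
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large photo files are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def spot_check(archive_dir, backup_dir, sample_size=20, seed=None):
    """Compare a random sample of archive files against the backup.

    Returns the relative paths that are missing from the backup or whose
    contents differ from the archive copy.
    """
    archive, backup = Path(archive_dir), Path(backup_dir)
    files = sorted(p for p in archive.rglob("*") if p.is_file())
    rng = random.Random(seed)  # pass a seed to make a run repeatable
    sample = rng.sample(files, min(sample_size, len(files)))
    failures = []
    for src in sample:
        rel = src.relative_to(archive)
        dst = backup / rel
        if not dst.is_file() or sha256_of(src) != sha256_of(dst):
            failures.append(rel)
    return failures
```

An empty result only says that the sampled files match; a larger `sample_size` gives more confidence at the cost of a longer run.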

Final thoughts

This is the workflow that works for me at the moment, and I’m always open to improvements. I’ve been thinking about a NAS, though I don’t have a need for it as I don’t need always-on availability of my archive, and I’ve moved away from desktop drives that require an additional power plug.

P.S. If you’ve enjoyed this article or found it helpful, please share it, or check out my other articles. I’m on Instagram and Twitter too if you’d like to follow along on my adventures and other writings.