How I Manage My Data


In this article I share a bit of what I’ve learned in putting together a backup and data synchronization system for myself and my family. My goal is simple enough to state generally: I want to make sure all of my notes, documents, photos, and videos are backed up and available from anywhere. Diving into the details of this goal is where things get complex. Happily, I think the end result is simple enough for others to emulate.

Gathering the requirements

Teasing apart my goal a bit more, I refined it to the following requirements:

  • Every document from every device should be backed up to redundant cloud storage. Losing a device should not mean losing data, nor should losing all of my devices at the same time.
  • I’d like immediate access to some documents from all devices with transparent synchronization. Some documents, like in-progress notes, drafts, and to-do lists, need to be up to date at all times on all devices.
  • Photos are a bit special. I’d like the ability to view and search through every photo and video in my entire library from desktop and mobile devices. In addition, many photos need to be shared with others.
  • I’d like for my personal data to be encrypted at rest and in transport.
  • I’m lazy and forgetful: Whatever system I put together should be easy to maintain and mostly invisible.
  • I need to store about 100GB of documents and 500GB of media.

Media is special

The special case of photos caught my attention, and I thought a bit more about how I take, organize, access, and share pictures. This led to a few more requirements:

  • An easy, consistent workflow is important to me. Everything captured from either my phone or DSLR should end up in the same places.
  • Edits to photos and metadata on any platform should be synchronized and backed up.
  • As mentioned above, easily sharing photos with friends and family is important to me, but in general my pictures are private.
  • I favor searching over ongoing organization. Searching for places, dates, people, and tags should be easy to do.
  • I usually take pictures in RAW format and edit them to produce variants of the original. Having a photo system that understands this is important to me.

Finally, I am our family’s IT administrator. Some of this project will support their needs, so the solutions need to be relatively simple and inexpensive.

Considering the options

Now that I understand the problem I’m trying to solve, it’s time to consider the tools available to construct a solution.

Assets

I have some assets already available that could potentially form part of a solution:

  • A Mac mini running at my home that is always on.
  • An older but perfectly good Drobo NAS at home.
  • An Amazon Prime account.
  • An offsite, small virtual private server at buyvm.net.

(spoiler: my end solution uses only the first of these).

Services

For keeping a set of folders in sync across all of my devices, I looked at Google Drive, Dropbox, and Syncthing. For backup software and services, I evaluated Backblaze, Carbonite, and Arq Backup using Backblaze B2 cloud storage. Finally, I looked at various photo services, including Flickr, Google Photos, SmugMug, and Amazon Prime Photos.

Data Synchronization

Both Google Drive and Dropbox are easy to set up and use. I already use Dropbox for work, so adding it for my personal documents is trivial. The same can generally be said for Google Drive. Both of these services also have excellent mobile applications for accessing the files. However, both fail my encrypted-at-rest requirement. There are tools available to build encrypted filesystems on top of these services, but they complicate the final product. In the end, I chose to use Syncthing and not store my documents in a cloud (though they are backed up to the cloud). Syncthing lets me control the storage: I use my Mac mini server as the “cloud,” and its contents are backed up continuously to a cloud service. Syncthing can be a bit tricky to set up and manage, certainly more so than Dropbox or Google Drive. If it proves too complex, and encrypted storage is still important, I would look to SpiderOak.

Backups

Backblaze’s backup solution seemed great at first analysis, but the costs become prohibitive when scaled to my entire family. However, Backblaze also offers a service called “B2,” which is similar to Amazon S3 but cheaper. I selected Arq Backup with Backblaze B2 as the offsite, geo-redundant storage. B2 is cheap, fast, and easy to use, and Arq is cheap enough, lightweight, and easy to configure and manage on all my systems. Most importantly, it is easy to manage on all of my family’s computers and can be configured to be bandwidth-friendly.

My annual bill for offsite backup of all of my data went from $120 with CrashPlan to less than $50 with Backblaze B2.
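
For anyone sizing a similar setup, here is a rough back-of-the-envelope sketch of the storage math. The data sizes come from my requirements above; the per-GB rate is an assumed, illustrative figure rather than a quoted price, so check the provider's current pricing before relying on it. It also ignores download and transaction fees and any savings from compression or deduplication.

```python
# Rough annual storage cost estimate for a flat per-GB monthly rate.
# The rate below is an ASSUMPTION for illustration, not a quoted price.

DOCUMENTS_GB = 100  # from the requirements list
MEDIA_GB = 500      # from the requirements list
ASSUMED_USD_PER_GB_MONTH = 0.005  # hypothetical rate; check current pricing


def annual_storage_cost(total_gb: float, usd_per_gb_month: float) -> float:
    """Return the yearly cost of storing total_gb at a flat monthly per-GB rate."""
    return total_gb * usd_per_gb_month * 12


total_gb = DOCUMENTS_GB + MEDIA_GB
cost = annual_storage_cost(total_gb, ASSUMED_USD_PER_GB_MONTH)
print(f"{total_gb} GB at ${ASSUMED_USD_PER_GB_MONTH}/GB/month is about ${cost:.2f} per year")
```

Under that assumed rate, storage alone lands comfortably under the $50 mark, which is in line with what I see on my bill.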

Media

All of my media is stored on my Drobo NAS at home, managed by Mylio, which I like but don’t love as a photo management application. The photos and metadata are completely backed up to Backblaze B2 as described above. Mylio is installed on all of my devices (including my phone) and keeps the various systems in sync: all photos end up in original quality on my server and are accessible from my laptop and phone as needed.

Mylio is adequate for face detection and geolocation cataloging, but Google Photos is just too impressive to ignore. I’m using the Google Photos Backup tool on my Mac mini server to upload all of my photos to a private Google account. This violates my encrypted-at-rest requirement, but I can live with that for now. Using Google Photos, I get amazing search and good limited sharing.

Putting it all together

My daughter was kind enough to draw a diagram of the whole system:

[Diagram of the backup and sync system]

Mylio takes care of photo syncing, and Syncthing handles all other documents. Arq and Backblaze B2 handle offsite backups, while Google Photos enables media sharing and searching. I’m not sure what role the duck plays, but it seems to be important.
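
For readers who prefer a table to a drawing, the same division of labor can be written down as a small lookup structure. This is only an illustrative sketch: the names `SYSTEM`, `Tools`, and `describe` are hypothetical, and the structure is just a summary of the tools named in this article, not configuration for any of them.

```python
from typing import Optional, TypedDict


class Tools(TypedDict):
    sync: str                          # tool that keeps devices in sync
    offsite_backup: str                # tool and storage used for offsite backup
    search_and_sharing: Optional[str]  # extra service for search and sharing, if any


# Hypothetical summary of the system described in this article.
SYSTEM: dict[str, Tools] = {
    "documents": {
        "sync": "Syncthing",
        "offsite_backup": "Arq + Backblaze B2",
        "search_and_sharing": None,
    },
    "photos and videos": {
        "sync": "Mylio",
        "offsite_backup": "Arq + Backblaze B2",
        "search_and_sharing": "Google Photos",
    },
}


def describe(kind: str) -> str:
    """Return a one-line summary of which tools handle the given kind of data."""
    tools = SYSTEM[kind]
    extra = tools["search_and_sharing"] or "none"
    return (
        f"{kind}: synced by {tools['sync']}, "
        f"backed up by {tools['offsite_backup']}, "
        f"search/sharing via {extra}"
    )


for kind in SYSTEM:
    print(describe(kind))
```

The duck is left out of the sketch, since I still don't know what it does.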
