Dropbox is great, but it's too small and not very private.
To improve the situation I finally found the perfectly reliable tool I was waiting for: Unison. As so often, it was not so much a new tool that solved my problem but a new mindset. What put me off Unison for a while is that you have to trigger the sync yourself. It's not fully automatic.
After tinkering with various automatic solutions I realized that manual sync is actually a feature. No notifications to distract you. No big sync hogging bandwidth when you want to watch Netflix. Make a mistake on one device? You can still make safety copies of files on another device and then sync. File not there? Just sync and you get instant confirmation whether there is something wrong. And, best of all: Unison is very, very stable. And Open Source. For private data, that's just mandatory.
Setup pi server
So to get started with unison you set up your raspberry pi with no-ip.org or another dynamic DNS provider and open the SSH port. Then install unison; it's right in the archives.
Also, set up passwordless SSH logins (beyond the scope of this post).
# on your pi
sudo apt-get install unison
Then designate one folder in your home directory to syncing. I just called it "me", but you can of course call it anything, including "douchebox" or "sync-this-shit". With a DIY dropbox, the possibilities for expressing yourself are endless.
Then hook up some kind of permanent storage medium on your pi. I believe an SSD is best for reliability, even though you will never ever use its speed advantages. I find 128G are almost the same price for SSD or USB stick. But anything without moving parts will likely work pretty well for home online storage. Then mount it, and create your sync directory.
# on your pi
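sudo mkdir /path/to/storage/me
# hand the directory to the user you SSH in as, otherwise unison cannot write to it
sudo chown "$USER": /path/to/storage/me
# not strictly required but sensible
chmod 700 /path/to/storage/me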
Setup your computers
Next, you install unison on every computer you want to sync, creating a star topology with your pi at the center. Again, just like dropbox.
# on each of your linux computers
# windows instructions below
sudo apt-get install unison
Well, you're good to go! Sync!
# on each of your linux computers
unison ~/me ssh://subdomain-of-my-pi.no-ip.org//path/to/storage/me
... and that's all there is to it.
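If retyping both roots gets old, the same pair can live in a Unison profile. The following is a minimal sketch, assuming a profile called "me" and the paths used in this post; write_unison_profile is a made-up helper name, not something Unison ships. Unison looks for profiles in ~/.unison unless the UNISON environment variable points somewhere else.

```shell
# Write a Unison profile so that `unison me` syncs the two roots.
# write_unison_profile is a hypothetical helper name, not part of Unison.
write_unison_profile() {
    name=$1; local_root=$2; remote_root=$3
    dir=${UNISON:-$HOME/.unison}
    mkdir -p "$dir"
    cat > "$dir/$name.prf" <<EOF
# local copy
root = $local_root
# copy on the pi, reached over SSH
root = $remote_root
EOF
}

# Example call (uses the placeholder host and path from this post):
#   write_unison_profile me "$HOME/me" ssh://subdomain-of-my-pi.no-ip.org//path/to/storage/me
#   unison me
```

After that, "unison me" should behave like the long command above.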
Unison will ask you to compare everything the first time, so it helps when one of your sync directories is empty. But if you're migrating from some other sync, you can just do the same thing, and unison will tell you what it's going to do so you can confirm.
When unison is not sure, you will be prompted for every file what to do: with the local directory given first as in the command above, press dot (or >) to push your local version to the remote, comma (or <) to copy the remote version to your machine. This is much less tedious than it sounds, since it will likely only happen the first time you run things, and after that in the rare event that you update a file on more than one end. In any case, you see exactly what changes are being synced, so you can cancel and make backups if something looks fishy. You can also ditch this safety net and go with what unison thinks is best with the -batch switch. I personally prefer interactive, after long winding uncertainty brought on by flaky automatic sync solutions.
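To make the interactive/batch choice concrete, here is a small wrapper sketch. sync_me and the DRY_RUN switch are my own inventions for illustration; the only Unison bits are the two roots from this post and the -batch switch. In batch mode Unison asks no questions and leaves real conflicts alone, so an occasional interactive run is still worthwhile.

```shell
# Run the sync interactively by default; pass "batch" to skip the prompts.
# sync_me is a hypothetical wrapper; DRY_RUN=1 prints the command instead of running it.
sync_me() {
    mode=${1:-interactive}
    set -- unison "$HOME/me" "ssh://subdomain-of-my-pi.no-ip.org//path/to/storage/me"
    if [ "$mode" = "batch" ]; then
        set -- "$@" -batch
    fi
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Usage:
#   sync_me          # asks about anything unison is unsure of
#   sync_me batch    # no questions asked
```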
Windows (ugh) computers
If you have the ill fortune to use Windows you can use unison anyway, but it's a bitch to install. On the upside, it will give you Cygwin, which will instantly upgrade your Windows experience from terrible to bearable, so that's something. It will also stay out of your way if you don't need it.
(1) Install Cygwin - Tell the installer to keep its packages in c:\users\my-user\Downloads\cygwin and install things to c:\cygwin for the most sane setup.
(2) Add cygwin to your path; right click "My Computer", go to "Advanced Settings", and edit the "Environment Variables". Add c:\cygwin\bin to your path, to have the full cygwin toolchain in your windows shell.
(3) Open a windows shell, generate an SSH key and set up passwordless SSH to your server (beyond the scope of this post, but works just like on linux or osx, use ssh-keygen in the windows shell)
(4) Install the GTK runtime from Pidgin.
(5) Install Unison for Windows
(6) Launch the Unison GUI and create a profile to sync (involves mainly typing in the local and remote directory), or use the command line the same way as on linux
Yep, quite a few steps! No wonder unison isn't more widely used. But what's strange is that there is no easy installer tool. But then, the Windows tools that do use unison probably don't say so in order to tie you to their backup space.
Having a cloud server at home is great, until your home goes up in smoke. This shouldn't happen, but it could, and if it does, you will need your private data to get your life back on track.
The cheapest way by far to get an off site backup is Amazon Glacier. It has kind of funky restore rules that could turn out super expensive, but if you know what you're doing you can restore cheap enough. Having convinced myself that a cheap restore is possible, I decided to simply use glacier and work out exactly how to do it when/if I need it.
From what I gather glacier is made from robotic tape drives. Excellent, just what I need. Also glacier is supposed to handle bit rot for me. Even more excellent.
The best tool I could find to backup to glacier is called mtglacier. It is a joy to use. It complements unison perfectly because it also works on the sync principle: It will only push changed files to glacier. Caveat: Unlike unison it only uploads whole files, so if you have big files you change a lot you might want to consider the efficiency. OTOH, inbound glacier traffic is free, but I try not to create incentive for that to change.
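A quick way to spot candidates for the whole-file caveat is to list large files that changed recently. This is only a sketch: big_recent_files is a hypothetical helper, and the default thresholds (100 MiB, 7 days) are arbitrary numbers I picked, not anything mtglacier prescribes.

```shell
# List files larger than min_kb KiB that were modified within the last `days` days.
# big_recent_files is a hypothetical helper; the default thresholds are arbitrary.
big_recent_files() {
    dir=$1; min_kb=${2:-102400}; days=${3:-7}
    find "$dir" -type f -size +"${min_kb}k" -mtime -"$days"
}

# Usage on the pi:
#   big_recent_files /path/to/storage/me
```

Anything that shows up here every week gets re-uploaded in full every time it changes.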
mtglacier comes with a nice Debian repository (and lots of others, check out the readme).
# on your pi
wget -O - http://mt-aws.com/vsespb.gpg.key | sudo apt-key add -
echo "deb http://dl.mt-aws.com/debian/current wheezy main"|sudo tee /etc/apt/sources.list.d/mt-aws.list
sudo apt-get update
sudo apt-get install libapp-mtaws-perl
After that, you need to create a glacier vault. I use one vault per sync directory in order to minimize damage in case a script nukes a vault by mistake (this has never happened before but robustness is still nice).
I created mine directly in the Amazon Web Services console. I created a new IAM user and assigned full Glacier privileges for that vault to that user with the policy generator. Then I created an mtglacier config file.
# region: eu-west-1, us-east-1 etc
# protocol=http (default) or https
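For reference, a complete config file could be written like this. It is a sketch with placeholder credentials: the key, secret, region and protocol settings are the ones I know from the mtglacier README, while write_glacier_config and the placeholder values are made up for illustration. Put in the access key of the IAM user created for the vault.

```shell
# Write an mtglacier config file with placeholder credentials.
# write_glacier_config is a hypothetical helper; replace the placeholders
# with the access key and secret of your own IAM user.
write_glacier_config() {
    cfg=$1; region=$2
    mkdir -p "$(dirname "$cfg")"
    cat > "$cfg" <<EOF
key=YOUR_ACCESS_KEY_ID
secret=YOUR_SECRET_ACCESS_KEY
# region: eu-west-1, us-east-1 etc
region=$region
# protocol=http (default) or https
protocol=https
EOF
    # the file holds a secret, so keep it private
    chmod 600 "$cfg"
}

# Usage on the pi:
#   write_glacier_config /path/to/storage/glacier/me.cfg eu-west-1
```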
Note I went for the https protocol to prevent eavesdropping in transit, but I am trusting Amazon with my private data. Super sensitive data such as passwords goes into an encfs encrypted folder; for all semi-private stuff such as insurance files, I consider Amazon's server side encryption good enough. I don't think the hassle of encrypting everything with the pi's CPU power and the added threat of losing my keys are worth the extra security for ordinarily private files like bills or even contracts that are not particularly attractive for others to steal. They're encrypted with keys stored separately in a darn robotic tape archive for chrissake, so there are the limits of my paranoia. Really sensitive stuff should go into an encrypted file system on your computer anyway, because your laptop being stolen is a far more dangerous threat than some rogue amazon employee targeting your particular slice of enormously huge glacier datacenter. So encfs for the sensitive stuff, ordinary protection for the rest. encfs happens to sync just fine with unison, so I have my sensitive stuff ready and mountable on every computer I sync to.
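The encfs arrangement could look roughly like this: the encrypted folder lives inside the sync directory, and the decrypted view is mounted somewhere outside it. This is a sketch under my own assumptions; the two paths and the names vault_open, vault_close and run are hypothetical, and fusermount is the Linux way to unmount.

```shell
# Mount and unmount an encfs folder that lives inside the sync directory.
# vault_open/vault_close and both paths are hypothetical; DRY_RUN=1 only prints.
VAULT_CIPHER=${VAULT_CIPHER:-$HOME/me/private.enc}   # encrypted files, synced by unison
VAULT_PLAIN=${VAULT_PLAIN:-$HOME/private}            # decrypted view, never synced

run() {
    if [ -n "${DRY_RUN:-}" ]; then printf '%s\n' "$*"; else "$@"; fi
}

vault_open() {
    # encfs wants absolute paths; on first use it offers to create the volume
    run mkdir -p "$VAULT_CIPHER" "$VAULT_PLAIN" && run encfs "$VAULT_CIPHER" "$VAULT_PLAIN"
}

vault_close() {
    run fusermount -u "$VAULT_PLAIN"
}
```

Only the encrypted directory ever reaches the pi or glacier; the decrypted mount point stays local.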
So, with that out of the way, let the syncing commence!
cd /path/to/storage ; /usr/bin/mtglacier sync --config=glacier/me.cfg --dir=me --vault=me-vault-name --journal=glacier/me.log --new --replace-modified
The journal (glacier/me.log in the command above) is where mtglacier keeps track of what's in the vault so it knows what to sync. You can restore without it.
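If the journal ever goes missing, it can be rebuilt from the vault's inventory. The sketch below uses the retrieve-inventory and download-inventory commands as I understand them from the mtglacier README, so double-check against the README before relying on it. glacier_run, the function names and the output file name me.rebuilt.log are hypothetical. Glacier inventory jobs typically take hours, so the two steps cannot run back to back.

```shell
# Rebuild a lost journal from the vault inventory (two steps, hours apart).
# Command names follow the mtglacier README as I understand it; verify before use.
# glacier_run and the function names are hypothetical; DRY_RUN=1 only prints.
glacier_run() {
    if [ -n "${DRY_RUN:-}" ]; then printf '%s\n' "$*"; else "$@"; fi
}

# Step 1: ask Glacier to prepare an inventory of the vault.
journal_request() {
    glacier_run mtglacier retrieve-inventory --config=glacier/me.cfg --vault=me-vault-name
}

# Step 2 (once Glacier has finished the job): write a fresh journal file.
journal_rebuild() {
    glacier_run mtglacier download-inventory --config=glacier/me.cfg --vault=me-vault-name --new-journal=glacier/me.rebuilt.log
}
```

Both functions expect to be run from /path/to/storage, like the sync command.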
The command above goes into a daily cron job on my pi.
# on the pi
# crontab for user
30 4 * * * cd /path/to/storage ; /usr/bin/mtglacier sync --config=glacier/me.cfg --dir=me --vault=me-vault-name --journal=glacier/me.log --new --replace-modified
This backs up to glacier every morning at 4:30.
So, there we are! Secure synced directory with Open Source software only, and the cheapest commercial backup service around. I'm fine with commercial backup services as long as they have no closed-source clients running on my hardware.
So far we only have one directory. I have found that this is fine in most cases, for example I can just have my phone rsync my pictures and voice messages right into my sync folder on the pi, and the changes will go everywhere on the next sync. But when sharing with others, I find it nice to set up one user on the pi per synced directory, and then administer who has access to the share by adding and removing those people's public keys from the authorized_keys file of that user (beyond the scope). Having one pi user per participant doesn't scale well and leaves you in a permissions jungle. Then, an ssh chroot can be used to restrict that user's access beyond the home directory. A separate glacier vault, config and cron job should be used for each shared synced directory so they can't get in each other's way.
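Adding and removing keys for a share boils down to editing one file, which is easy to script. This is a minimal sketch, assuming one public key per line with no per-key options; share_grant and share_revoke are hypothetical helper names, and the file path is whatever authorized_keys file belongs to the share's user.

```shell
# Grant or revoke access to a share by editing the share user's authorized_keys.
# share_grant/share_revoke are hypothetical helpers; run them as the share user.
share_grant() {
    keys=$1; pubkey=$2
    mkdir -p "$(dirname "$keys")"
    touch "$keys"
    chmod 600 "$keys"
    # append only if this exact key line is not present yet
    grep -qxF -e "$pubkey" "$keys" || printf '%s\n' "$pubkey" >> "$keys"
}

share_revoke() {
    keys=$1; pubkey=$2
    tmp=$(mktemp)
    # keep every line except the exact key line being revoked
    grep -vxF -e "$pubkey" "$keys" > "$tmp" || true
    # overwrite in place so ownership and permissions stay as they were
    cat "$tmp" > "$keys"
    rm -f "$tmp"
}

# Usage (the key string is whatever the other person's .pub file contains):
#   share_grant  ~/.ssh/authorized_keys "ssh-ed25519 AAAA... alice@laptop"
#   share_revoke ~/.ssh/authorized_keys "ssh-ed25519 AAAA... alice@laptop"
```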