Decentralized backup to a Raspberry Pi grid with duply & tahoe

December 05, 2016 in #backup #duply #tahoe-lafs #raspberry pi

duply-tahoe-lafs-raspberry-pi-grid-docker

Cheap, easy, incremental, decentralized & encrypted backup for your infra

This setup gives your backups the following properties:

  • Cheap: the storage is a bunch of Raspberry Pi 3 boards, you can't get more discounted hosts ;-)
  • Easy: everything runs in docker, quick to deploy & redeploy
  • Incremental: duply takes care of the incremental backup jobs
  • Decentralized & encrypted: tahoe-lafs manages the encoding and the redundancy of the data on the Raspberry Pi grid

1. How they all work together

duply-tahoe-lafs-infra

Duply: the backup software

  • runs on the backup server, which holds the SSH keys to access the other servers
  • uses duplicity/rsync to incrementally back up local or remote folders
  • can encrypt the backup itself (or leave that task to the backend)
  • then sends the files to a backend:
    • Folder on the network
    • AWS S3 bucket (a few steps will be described)
    • Tahoe-lafs: a decentralized, encrypted storage grid made of Raspberry Pis, home PCs or servers (full doc below)

Tahoe-lafs: the backend storage

tahoe-lafs-admin-gui

Tahoe is a great and powerful open source project that lets you encrypt your data and send it to many different targets, where nobody can read any backup without the special key held on the backup server. You can think of it like the cool crypto storage projects such as Storj or Sia, or like a kind of torrent where you couldn't make sense of the local files on your PC.
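
To get a feel for what this redundancy costs, here is a minimal Python sketch of the arithmetic. It assumes Tahoe's usual "k-of-N" erasure coding with the commonly documented default of 3-of-10 (any 3 shares out of 10 rebuild a file) and a hypothetical grid where each node holds one share. The function names are made up for illustration and are not part of Tahoe.

```python
from fractions import Fraction


def expansion_factor(needed: int, total: int) -> Fraction:
    """Raw grid storage consumed per byte of backup data (N / k)."""
    if not 1 <= needed <= total:
        raise ValueError("expected 1 <= needed <= total")
    return Fraction(total, needed)


def tolerated_losses(needed: int, total: int) -> int:
    """Shares that can disappear while the file stays recoverable (N - k)."""
    if not 1 <= needed <= total:
        raise ValueError("expected 1 <= needed <= total")
    return total - needed


def grid_usage_gb(backup_gb: float, needed: int = 3, total: int = 10) -> float:
    """Total space used across the whole grid for a backup of backup_gb."""
    return backup_gb * float(expansion_factor(needed, total))


if __name__ == "__main__":
    # Assumed encoding: 3 shares needed out of 10 stored.
    print("expansion factor:", expansion_factor(3, 10))
    print("shares that may be lost:", tolerated_losses(3, 10))
    print("grid space for a 30 GB backup:", round(grid_usage_gb(30), 1), "GB")
```

Under these assumptions the grid stores a bit more than three times the size of the backup, and in exchange most of the nodes can be offline while the data stays recoverable.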

Backup server

As in every infra setup, you will need a way for duply to pick up the files. You can either let each server push its files to a folder on the duply server, or give duply the keys so it can pull the files remotely. While this last solution gives all the rights to the duply server, it makes backups easier to manage because:

  • All backup jobs are centralized and easy to control
  • As this server holds all the keys, we know it is the single point of failure for security, so it should be locked down and monitored. It could be the same server as the deployment server, for example (cf. bastion), which already has all the access.
  • Anyway, had we chosen the first solution, where it doesn't hold the keys, it would still have access to all the backups.
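
As an illustration of the pull model, here is a minimal Python sketch that builds the rsync command the backup server could run to fetch one remote folder over SSH into a local staging folder, which duply would then back up. It is not taken from the linked repository: the `backup` user, the host name, the paths and the function name are all hypothetical.

```python
import shlex
from typing import List


def build_pull_command(host: str, remote_dir: str,
                       staging_root: str, ssh_key: str) -> List[str]:
    """Return an rsync command that pulls remote_dir from host into staging."""
    local_dir = f"{staging_root.rstrip('/')}/{host}/"
    return [
        "rsync",
        "-az",       # archive mode, compress data in transit
        "--delete",  # mirror remote deletions in the staging copy
        "-e", f"ssh -i {shlex.quote(ssh_key)}",  # key held only by the backup server
        f"backup@{host}:{remote_dir.rstrip('/')}/",  # hypothetical 'backup' user
        local_dir,
    ]


if __name__ == "__main__":
    cmd = build_pull_command(
        host="web1.example.com",         # hypothetical server
        remote_dir="/var/www",           # hypothetical folder to back up
        staging_root="/backup/staging",  # hypothetical staging area
        ssh_key="/root/.ssh/id_backup",  # hypothetical key path
    )
    # Print instead of executing, so the sketch is safe to run anywhere.
    print(" ".join(shlex.quote(part) for part in cmd))
```

Keeping the command as a list makes it easy to pass to `subprocess.run` later without shell quoting issues.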

Raspberry pi 3: backend storage target

Raspberry-pi-3

While you can store the backup on virtually any host (tahoe doesn't really care about the hardware, as long as the tahoe storage node can run and detect some free space), the Raspberry Pi is a perfect solution:

  • Cheap hardware to acquire: starting price $35 (without storage)
  • Easy to set up: basically you pre-configure it in a few steps with the guide below
  • Easy to deploy: this small device can then be distributed to employees, so many people can join the network and make it stronger. The more decentralized, the better the security, redundancy and total capacity.
  • Connect it at home (1 power cable, 1 network cable) and forget it (the admin GUI can monitor the nodes)

2. Deploy

Follow the GitHub link to get this full backup solution running:
https://github.com/gregbkr/duply-tahoe-docker

& Enjoy secure backup!
