Decentralised backup to Raspberry Pi grid with Duply & Tahoe

December 05, 2016 in #backup #duply #tahoe-lafs #raspberry pi

Cheap, simple, incremental, decentralised & encrypted backup for your infra

This setup gives your backups the following properties:

  • Cheap: the storage is a bunch of Raspberry Pi 3 boards, you can't get more discounted data hosts ;-)
  • Simple: everything runs in Docker, quick to deploy & redeploy
  • Incremental: Duply takes care of the incremental backup jobs
  • Decentralised & encrypted: Tahoe-LAFS will manage the encoding and the redundancy of the data on the Raspberry Pi grid

1. How they all work together

[Figure: Duply and Tahoe-LAFS infrastructure diagram]

Duply: the backup software

  • Runs on the backup server which holds the SSH key to access the Raspberry grid
  • Uses Duplicity (built on librsync) to back up files incrementally
  • Can encrypt the backup (or leave the task to the backend)
  • Then sends the files to the backend, which can be:
    • A shared folder on the network
    • An AWS S3 bucket (a few steps will be described)
    • A Tahoe-LAFS instance: a decentralised encrypted storage grid made of Raspberry Pis, or even home computers or cloud servers (full doc below)
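
To make the incremental idea concrete, here is a minimal Python sketch. It is not Duply's or Duplicity's actual code: it assumes a simplified scheme where a whole file is re-sent whenever its SHA-256 digest differs from the one recorded on the previous run, whereas Duplicity works with rsync-style signatures and can send only the changed parts of a file. The names `snapshot` and `changed_files` are made up for this illustration.

```python
import hashlib
from pathlib import Path


def snapshot(root):
    """Map every file under root (relative POSIX path) to its SHA-256 digest."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            name = path.relative_to(root).as_posix()
            digests[name] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def changed_files(previous, current):
    """Return the files that are new or modified since the previous snapshot."""
    return sorted(
        name for name, digest in current.items()
        if previous.get(name) != digest
    )
```

On the first run `previous` is empty, so every file is selected (a full backup); later runs only pick up what changed. Deleted files are ignored in this sketch.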

Tahoe-LAFS: the backend storage

Tahoe-LAFS is a powerful open source project that encrypts your data and spreads it across many storage nodes, none of which can read it without the keys. It is similar to Storj or Sia (other cool crypto storage projects), or to a kind of torrent in which the machines hosting the pieces cannot read the files they are sharing.
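
The cost of that redundancy can be estimated with a little arithmetic. Tahoe-LAFS erasure-codes each encrypted file into `total` shares, any `needed` of which are enough to rebuild it; its default is 3-of-10. The sketch below is a rough estimate only: the function name `erasure_stats` is made up for this post, and it ignores the small per-share overhead (hashes and metadata) that a real grid adds.

```python
def erasure_stats(file_size_bytes, needed=3, total=10):
    """Estimate the storage cost of k-of-N erasure coding (overhead ignored)."""
    if not 1 <= needed <= total:
        raise ValueError("needed must be between 1 and total")
    share_size = file_size_bytes / needed
    return {
        "share_size": share_size,            # bytes held by each share
        "stored_bytes": share_size * total,  # bytes used across the grid
        "expansion": total / needed,         # grid usage relative to the file
        "tolerated_losses": total - needed,  # shares that can vanish safely
    }
```

With the default 3-of-10 encoding, a backup volume takes a bit more than three times its size on the grid, and it stays recoverable after losing any 7 of its 10 shares.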

Backup server

As in every infra setup, Duply needs a way to pick up the files. You can either let each server push its files to a folder on the Duply server, or give Duply the keys to pull the remote files. Although this last solution grants all rights to the Duply server, we will use it because:

  • All backup jobs are centralized and easy to monitor
  • Since this server holds all the keys, it is a single point of failure for security and should be well secured and monitored. The deployment server could be a good candidate for this role (cf. bastion), because it already holds the SSH keys to access the other servers.
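
As an illustration of the pull model, here is a small Python sketch that assembles an rsync-over-SSH command for the backup server to run. This is an assumption about how the pull could be done, not necessarily what this setup uses; the helper `build_pull_command`, the user, host names and paths are all hypothetical.

```python
import shlex


def build_pull_command(host, remote_path, local_dir, key_file, user="backup"):
    """Build an rsync command that pulls remote_path from host into local_dir."""
    # -a keeps permissions and timestamps, -z compresses data in transit,
    # -e tells rsync to use ssh with the dedicated backup key.
    ssh = "ssh -i " + shlex.quote(key_file)
    return ["rsync", "-az", "-e", ssh, f"{user}@{host}:{remote_path}", local_dir]


# Hypothetical example; pass the list to subprocess.run(cmd, check=True) to run it.
cmd = build_pull_command("web1", "/var/www/", "/backup/web1/", "/root/.ssh/id_backup")
```

Because only the backup server holds the key, the other servers never need credentials for the backup storage.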

Raspberry Pi 3 grid: the backend storage target

While you can store the backup on virtually any host (Tahoe does not really care about the hardware, as long as a Tahoe storage node can run and detect some free space), a Raspberry Pi is a perfect solution:

  • Cheap hardware to purchase: starting price $35 (without storage)
  • Easy to set up: basically, you pre-configure it in a few steps with the guide below
  • Easy to deploy: the small device can then be handed out to employees, who can plug it in at home and join the network, making the grid bigger. The more decentralised the grid, the better its security, redundancy and total capacity.
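
Home-hosted devices will not always be online, so it is worth estimating how often a backup can actually be restored. Tahoe-LAFS needs any `needed` of a file's `total` shares to rebuild it. The sketch below is a back-of-the-envelope model, not something Tahoe computes for you: it assumes one share per device and that each device is online independently with the same probability, and the function name `recovery_probability` is made up for this post.

```python
from math import comb


def recovery_probability(p_online, needed=3, total=10):
    """Probability that at least `needed` of `total` devices are reachable."""
    if not 0.0 <= p_online <= 1.0:
        raise ValueError("p_online must be between 0 and 1")
    # Binomial tail: sum the chances of exactly i devices being online.
    return sum(
        comb(total, i) * p_online**i * (1 - p_online) ** (total - i)
        for i in range(needed, total + 1)
    )
```

Calling it with a realistic uptime for a device sitting in someone's living room gives a feel for how many devices and which encoding parameters the grid needs.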

2. Deploy

Follow the GitHub repository to get this full backup solution running and enjoy secure backups!

Thank you for reading :-) See you in the next post! Greg
