We are currently evaluating a few Free Software backup systems. BackupPC is the one we’ve decided to put through its paces with a few weeks of real-world work, parallel to our existing system. So far, we’ve been using aging home-grown Perl scripts and the (quite excellent but sometimes dogged) rsnapshot.
The most interesting feature I’ve seen in BackupPC so far is file pooling. The project page puts it best:
- A clever pooling scheme minimizes disk storage and disk I/O. Identical files across multiple backups of the same or different PCs are stored only once resulting in substantial savings in disk storage and disk I/O.
- One example of disk use: 95 laptops with each full backup averaging 3.6GB each, and each incremental averaging about 0.3GB. Storing three weekly full backups and six incremental backups per laptop is around 1200GB of raw data, but because of pooling and compression only 150GB is needed.
Intwestin’! This should save us a lot of space, especially once we’ve finished virtualizing. Virtualized servers, shared LAMP servers and the like can profit massively from file pooling, as their identical files will share the same physical space on the backup server. If 200 virtual servers each carry the same 300 MB of system files, for example, all of their backups together will only take up a bit more than 300 MB instead of almost 60 GB. Even if a file’s attributes (owner, mtime, ctime etc.) differ between machines, the physical space it takes up remains the same, because BackupPC tracks file metadata separately from file data.
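To convince myself of the arithmetic, here is a toy model of the pooling idea in Python. It is only a sketch under my own assumptions, not BackupPC's actual on-disk format or hashing scheme: the `ToyPool` class, the SHA-256 digest, the host names and the scaled-down file sizes (3000 bytes standing in for 300 MB) are all made up for illustration. Content is stored once per digest, while each host keeps its own metadata entry pointing at that digest.

```python
import hashlib


class ToyPool:
    """Toy content-addressed pool; NOT BackupPC's real storage layout."""

    def __init__(self):
        self.blobs = {}    # digest -> file content, stored exactly once
        self.backups = {}  # (host, path) -> (digest, metadata)

    def store(self, host, path, content, metadata):
        # Identical content always yields the same digest, so it is pooled.
        digest = hashlib.sha256(content).hexdigest()
        self.blobs.setdefault(digest, content)
        # Metadata lives next to the reference, not inside the pooled blob.
        self.backups[(host, path)] = (digest, metadata)

    def raw_bytes(self):
        # Space needed if every backed-up file were stored separately.
        return sum(len(self.blobs[d]) for d, _ in self.backups.values())

    def pooled_bytes(self):
        # Space actually used by the pool.
        return sum(len(c) for c in self.blobs.values())


pool = ToyPool()
system_file = b"x" * 3000  # scaled-down stand-in for 300 MB of system files

for n in range(200):
    host = f"vm{n:03d}"
    # Same content on every host, but different owner and mtime.
    pool.store(host, "/usr/lib/shared.so", system_file,
               {"uid": n, "mtime": 1_000_000 + n})
    # A small file that really is unique per host.
    pool.store(host, "/etc/hostname", f"{host}\n".encode(),
               {"uid": 0, "mtime": 0})

print("raw bytes:   ", pool.raw_bytes())
print("pooled bytes:", pool.pooled_bytes())
```

In this model the shared file costs its size once, no matter how many hosts carry it or how their metadata differs; only the per-host unique files add to the pool.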
In addition to giving us more flexibility, more efficient use of our hardware, easier configuration and a better visual representation of our network’s backup status, BackupPC could theoretically also back up our users’ laptops. It doesn’t care which address the clients use or where they connect from, as it can pull the files incrementally via rsync over an encrypted SSH connection, for example. It even sends out reminders to lazy users who haven’t backed up their machine in a while.
I’m still sceptical of the claims of being a full “enterprise-grade system”, but there are certainly enough features here to warrant investing some time in a more or less formal evaluation. Fabian already went ahead and put the thing into service on our new backup system. I’m trailing behind on reading the documentation, but once I’ve caught up I’ll be sure to post about my experience with integrating BackupPC into an existing infrastructure.