As of 2013-10, we have a CrashPlan cloud backup solution through the UK hosting provider CeeJay. We currently have slots for 10 servers of about 100 GB each, with some float allowed between them, so it is OK if one server uses 250 GB and another 30 GB. 5 servers/virtual machines are currently backed up, but it
Recently, however, the price from CeeJay has increased because upstream CrashPlan raised its prices, so we are going to consider other providers.
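The slot arithmetic above can be sketched as a small check. This is only an illustration: it assumes the "float" means that usage is pooled across all slots, which is our reading rather than a stated contract term, and the server names and per-server figures are hypothetical.

```python
SLOT_GB = 100  # per-slot allowance, from the text above
SLOTS = 10     # number of slots, from the text above


def pool_status(usage_gb):
    """Return (used, allowance, fits) for a pooled quota.

    Assumes the whole allowance is shared between servers, so a single
    server may exceed SLOT_GB as long as the total stays within the pool.
    """
    if len(usage_gb) > SLOTS:
        raise ValueError("more servers than slots")
    used = sum(usage_gb.values())
    allowance = SLOT_GB * SLOTS
    return used, allowance, used <= allowance


# Hypothetical figures: one server at 250 GB, one at 30 GB, three small ones.
usage = {"vm1": 250, "vm2": 30, "vm3": 10, "vm4": 5, "vm5": 5}
used, allowance, fits = pool_status(usage)
print(f"{used} GB of {allowance} GB used; within pool: {fits}")
```

Under this reading, five servers totalling 300 GB sit well inside the 1000 GB pool even though one of them is over its nominal 100 GB slot.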
Among the features that are essential for myGrid's backup needs:
- Off-site storage of backup data
- Linux compatibility – all our servers are running Ubuntu LTS or CentOS
- Backup daemon must run unattended on server (no GUI for daily operation!)
- Local strong "personal" encryption (e.g. passphrase or local key) – the provider or an intruder at the provider should not be able to gain access to the files, e.g. no "I forgot my password" mechanism.
- Data centre in EU – EU data protection rules are not compatible with the Patriot Act (perhaps a bit daft given the above, but there is now even more reason to require this, as backdoor injection remains possible if the company is US-based)
- Access to snapshots < 30 days – e.g. restore a file as it was 20 days ago
- Up to 100 GB per server
- 5-20 servers, ideally 10
- Current total of about 300 GB - desired up to 500
- Low maintenance – sensible defaults, easy to set up, easy to restore, and even more importantly, easy to monitor
- Email notification when backup is not working / incomplete
- Upfront payment per year – our project-based budgets and central finance department are not too keen on monthly charges
- Up to 300 GB per server – just one or two of our servers hit this
- Additional local backup target (e.g. server to server) – avoid bandwidth limitations in case of a big restore
- Open source software
- Continuous backup (copies files as they change rather than checking every night)
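The monitoring and notification items in the list above could be approximated, whatever provider is chosen, by a small check run from cron. The following is a minimal sketch and not tied to any provider's API: the 36-hour threshold, the server names and the way the last-success times are obtained are all assumptions, and actually sending the email is left out.

```python
from datetime import datetime, timedelta

MAX_AGE = timedelta(hours=36)  # assumed threshold, not from the requirements


def stale_servers(last_success, now, max_age=MAX_AGE):
    """Return sorted names of servers with no successful backup within max_age.

    A server that has never completed a backup (value None) counts as stale.
    """
    return sorted(
        name
        for name, finished in last_success.items()
        if finished is None or now - finished > max_age
    )


def alert_body(stale):
    """Build the text of a notification email; returns "" if nothing is stale."""
    if not stale:
        return ""
    return "Backup missing or overdue on: " + ", ".join(stale)


# Hypothetical data: in practice these times would come from the backup logs.
now = datetime(2013, 10, 15, 9, 0)
last_success = {
    "web": datetime(2013, 10, 15, 2, 0),
    "db": datetime(2013, 10, 12, 2, 0),
    "build": None,
}
print(alert_body(stale_servers(last_success, now)))
```

Treating "never backed up" as stale matters here, since a misconfigured new server is exactly the case that tends to go unnoticed.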
The following review is based on providers found through Google searches, Wikipedia and word of mouth.
Out of scope
These kinds of p2p solutions allow you to get disk space by contributing your own disk space, so you can reduce the cost by committing local bandwidth and extra disk space, both of which we have in excess.
Make your own
- http://www.bacula.org/en/ - open source, pluggable to any backend storage. Requires an elaborate server setup with a hand-made configuration. GUIs available. Unclear if commercial offsite storage is available.
- http://dar.linux.free.fr/ - open source. Supports many kinds of backends. Seems quite dated - the documentation refers to ZIP drives!
The University of Manchester has both Research Data Storage (redundant file store mountable using CIFS or NFS) and a shared area. Usage of both of these would require additional local software, e.g. rsync, rdiff-backup, bacula. While the RDS has two sites, one of these is in the same building as our servers, and so it would seem to be more vulnerable.
A variant of the above, but just server-to-server backup. The problem is that all our servers are in the same building. Need to pair up with someone outside.
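For the rsync-based options above (a mounted file store, or another server's disk), dated snapshots can be kept cheaply with rsync's `--link-dest` option, which hard-links unchanged files to the previous snapshot. The sketch below only builds the command line; the paths are hypothetical, whether hard links work depends on the mount type, and nothing here provides the local encryption listed among the essential features.

```python
from datetime import date
from pathlib import PurePosixPath


def snapshot_command(source, target_root, today, previous=None):
    """Build an rsync argv list for a dated snapshot under target_root."""
    dest = PurePosixPath(target_root) / today.isoformat()
    cmd = ["rsync", "-a", "--delete"]
    if previous is not None:
        # Unchanged files become hard links into the previous snapshot,
        # so each extra day only costs the space of what changed.
        prev = PurePosixPath(target_root) / previous.isoformat()
        cmd.append(f"--link-dest={prev}")
    # Trailing slash on the source copies its contents, not the directory itself.
    cmd += [f"{source.rstrip('/')}/", f"{dest}/"]
    return cmd


# Hypothetical paths: /srv/data on the server, RDS mounted at /mnt/rds.
cmd = snapshot_command(
    "/srv/data", "/mnt/rds/backup", date(2013, 10, 15), date(2013, 10, 14)
)
print(" ".join(cmd))
```

Keeping one such directory per day and pruning those older than 30 days would cover the snapshot requirement, at the cost of maintaining the scripts ourselves.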
The conclusion is not yet clear. Providers to consider further:
- in-house + open source