Backup

From Freehackers

Links

Note that rsync is a very good synchronization tool, but it is not a backup solution. It cannot roll back to a previous version of a file and provides no encryption.


Computer/Server Solutions

Our criteria are:

  1. network based (as opposed to cartridges, disks...)
  2. data encrypted on the source (so that the remote storage only contains encrypted data). "No trust in the remote storage".
  3. secure transport (ssh) (less important if the previous point is met)
  4. support incremental backup
  5. easy to include/select files/dirs to be backed up, and to know where the gigabytes come from. GUI?
  6. can be used without GUI (for servers)
  7. free software
  8. provide "point-in-time" recovery
  9. the client (server being backed up) connects to the storage server, not the other way around
  10. can handle rename / moves
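
Criterion 8 can be made concrete with a small sketch: given the timestamps of the available backups, "point-in-time" recovery means picking the most recent one taken at or before the requested moment. The function name and the sample dates below are hypothetical and only serve as an illustration.

```python
from bisect import bisect_right
from datetime import datetime


def snapshot_for(snapshots, when):
    """Return the latest snapshot taken at or before `when`, or None."""
    ordered = sorted(snapshots)
    # bisect_right gives the number of snapshots with a timestamp <= when.
    index = bisect_right(ordered, when)
    return ordered[index - 1] if index else None


# Hypothetical backup history: one snapshot per night, one night missed.
history = [datetime(2017, 12, day, 3, 0) for day in (1, 2, 3, 5)]
restore_point = snapshot_for(history, datetime(2017, 12, 4, 12, 0))
```

A tool without this property can only give back "the latest version", which is exactly the rsync limitation mentioned at the top of the page.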

Some useful links:

rsync + filesystem snapshot

Used by orzel for backing up user data to a local server. Can't be used for servers because of the lack of encryption.

A simple way to overcome the main rsync shortcoming (no history) is to use filesystem snapshots. The solution is very simple and easy to put in place. Regarding the criteria, we lose "data encrypted on the source" and "can handle moves/renames", but we get:

  • network based (not designed around tape or whatever)
  • secure transport (rsync over ssh)
  • incremental backup for free
  • selecting dirs/files to restore and checking disk usage is just a matter of using a normal filesystem (du, cp...)
  • the client connects to the backup server (rsync would even allow the other way around)

cons:

  • no encryption

Snapshotting is supported by:

  • ZFS: licensing problem, requires a solaris compatibility layer
  • btrfs: stable enough?
  • LVM
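
As a rough sketch of how the two steps fit together, the helper below builds the two command lines: an rsync push over ssh from the client, then a read-only btrfs snapshot taken on the backup server. The host name, the paths and the function names are assumptions made up for this example, the destination is assumed to be a btrfs subvolume, and the commands are only printed, not executed.

```python
import shlex
from datetime import date


def push_command(source_dir, server, dest_dir):
    """rsync over ssh, run on the client; --delete mirrors removals."""
    return ["rsync", "-a", "--delete", "-e", "ssh",
            source_dir, f"{server}:{dest_dir}"]


def snapshot_command(subvolume, snapshot_root, day):
    """Read-only btrfs snapshot of the synced subvolume, run on the server."""
    target = f"{snapshot_root}/{day.isoformat()}"
    return ["btrfs", "subvolume", "snapshot", "-r", subvolume, target]


# Hypothetical host and paths.
push = push_command("/home/", "backup.example.org", "/srv/backup/current/")
snap = snapshot_command("/srv/backup/current", "/srv/backup/snapshots",
                        date(2017, 12, 1))
print(shlex.join(push))
print(shlex.join(snap))
```

Each dated snapshot is then a normal directory tree, which is why du and cp are enough for checking sizes and restoring.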

rsync + hardlinking

This needs some scripting, but rsync can be told about a previous snapshot and will hard-link unchanged files against it instead of copying them again.
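
The scripting part can be sketched as follows, using rsync's --link-dest option. The directory layout (one dated directory per snapshot under a common root) and the function name are assumptions for the example; the command is only built and printed here.

```python
import shlex


def hardlink_backup_command(source, backup_root, previous, current):
    """Build an rsync call that hard-links unchanged files to `previous`."""
    return [
        "rsync", "-a", "--delete",
        # Files identical to those in the previous snapshot are hard-linked
        # instead of being copied, so they use no extra space.
        f"--link-dest={backup_root}/{previous}",
        source,
        f"{backup_root}/{current}/",
    ]


# Hypothetical paths and snapshot names.
cmd = hardlink_backup_command("/home/", "/srv/backup", "2017-12-01", "2017-12-02")
print(shlex.join(cmd))
```

Since hard links cannot cross filesystems, all snapshots have to live on the same partition.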

cons:

  • no encryption


BorgBackup

"Deduplicating backup program", written in python 3 + core in C. Remote storage is done through ssh. Requires borgbackup on the other side. The main promise is that the deduplication algorithm is very good. (for the record, it's a fork of almost unknown attic backup).

It fulfills all our criteria except "5. easy to include/select files/dir to be backed up, and to know where gigabytes come from", which is pretty unique on this page.


Pros:

  • encrypts on client
  • can fuse-mount backups, like obnam
  • web frontend
  • can be used as a standalone binary (including python)
  • in gentoo: app-backup/borgbackup
  • python (even better: python3)

It fixes the main concerns with attic (data corruption on large repositories and not handling sparse/special files).

As of December 2017, I (orzel) am migrating most of my backups to borg.

Ugarit

New kid on the block as of January 2013

Good

  • can be encrypted from the start, so we can use untrusted storage

Bad

  • written in .. ? Scheme ?
  • you don't get much when downloading the stuff

Duplicity

Once used by orzel.

Pros:

  • does encryption from the client (gnupg)
  • incremental (uses librsync)
  • transport through ssh (encryption + convenience of keys)

Cons:

  • It's hard to specify the directories to exclude/include
  • even harder to check for the amount of data the current setting corresponds to

backuppc

(Used by Benoit.) It aims at backing up a lot of similar PCs on a local network. Everything is done FROM the backup PC, only requiring ssh/rsync to be installed on the backed-up PCs. I guess the main idea is to save space using deduplication, hoping the PCs have a lot in common.

Pros:

  • there is a web user interface, a lot better than bacula
  • same files present on different PCs are stored only once, with hardlinks
  • rsync-like optimization

Cons:

  • written in perl
  • the backup server keeps ssh keys for all PCs -> high security concern for me (orzel)
  • there is only one place where backups are stored, and it needs to be a single partition (hard links)
  • no encryption
  • the server needs to be able to reach every client; the communication is not started from the client. This doesn't work (easily) when clients are behind firewalls, and poses severe security issues (the storage server is a big point of failure).

rsnapshot

User-land way of creating snapshots, based on rsync + hard links. Snapshots are available as normal directories, and as such it is very simple to restore.

(rdiff-backup has a similar approach)

Pros

  • transport on ssh

Cons:

  • written in perl
  • no encryption on remote device
  • no easy include/select

rdiff-backup

Similar to rsnapshot. This is a script based on librsync, but it stores deltas, so you can actually restore deleted/old content.

Pros

  • transport on ssh
  • written in python
  • some people mentioned that its best advantage is high compatibility, even with very old versions.

Cons:

  • no encryption on remote device
  • no easy include/select
  • as of 2016, the last release was in 2009, the project is dead

Amanda (wikipedia)

Amanda is specifically designed to back up a lot of clients onto one big server ("single master backup server to back up multiple hosts over network").

The documentation mentions 'excluding' and 'server-side-gpg-encrypted backups'.

Seems oriented more toward storage device backup, less toward network, although the website is titled "Amanda Network Backup". Contrary to the outdated boxbackup comparison (linked at the beginning of the page), amanda can now handle encryption.

Pros:

  • has graphics

cons:

  • You need (free) registration for some of the doc!
  • very complicated setup

rdup

Heavily based on other (unix-)tools to do the work. It's not ready-to-use, but can provide a good basis to a custom-made solution.
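
The central idea of rdup, listing only the files that changed or disappeared since the previous run, can be sketched in a few lines. This is not how rdup itself is implemented: the function names and the choice of comparing modification times are assumptions made for the illustration.

```python
import os


def scan(root):
    """Map each file path under `root` (relative) to its modification time."""
    state = {}
    for directory, _subdirs, files in os.walk(root):
        for name in files:
            path = os.path.join(directory, name)
            # lstat does not follow symlinks, so broken links do not raise.
            state[os.path.relpath(path, root)] = os.lstat(path).st_mtime_ns
    return state


def changes(previous, current):
    """Return (changed_or_new, removed) paths between two scans."""
    changed = sorted(p for p, mtime in current.items()
                     if previous.get(p) != mtime)
    removed = sorted(p for p in previous if p not in current)
    return changed, removed


# Hypothetical scan results from two consecutive runs.
last_run = {"a.txt": 1, "b.txt": 2, "c.txt": 3}
this_run = {"a.txt": 1, "b.txt": 5, "d.txt": 7}
changed, removed = changes(last_run, this_run)
```

The resulting list is what the downstream tools (tar, compression, encryption, copy) would then consume.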

"rdup will only print a list of files that are changed/removed since the last time rdup ran.". Several (example) scripts are provided to do something useful with this list, such as tar/compress/encrypt and copy on a remote location.

Bareos

Fork of Bacula, with new features added. In 2016 it was said to be actively developed.

There's a daemon running on the system, talking with the master/orchestrator.

No useful web frontend, but the cli recovery tool seems ok. You can browse your backup and select dir/files to be restored.

(recommended in French SILL2017)

Bacula

  • It has been tested by orzel for more than a year with two file daemons (fd), three storage daemons (sd) and one director (dir). I'm really disappointed: it's really difficult to configure, to maintain, to encrypt, to know what happens...
  • Does support encryption
  • there are several GUIs (console, wx, gnome, web, even qt?)
  • the web interface is not available in gentoo, although it has been requested since the beginning of 2006
  • as of 2016, even though there are still some releases, not much happens. BareOS seems to be the new path.

Pro:

  • powerful
  • great pdf/html documentation
  • nice separation of master, storage daemon (where to store) and file daemon (where to back up from)

Cons:

  • really designed for tapes, not network
  • very hard to configure and maintain
  • I've never had any answer on the irc channel
  • can't do 'simple' data encryption (symmetric password for ex.). Configuration of this is hard.
  • (gui/text) interface hard to understand
  • only "Most of the Bacula source code is released under the GPL version 2 license."
  • bat (qt-based gui) sucks and is only a (bad) frontend on top of the text-based command line interface

Brackup

I haven't had time yet to review this one carefully

Pros:

  • it does encrypt before sending
  • alive

Cons:

  • written in perl
  • the homepage is sparse; it looks like a very early snapshot.

cumulus

Not really mainstream. It aims at backing up to cloud services, so it requires only the bare minimum from the storage (push, pull). It explicitly mentions that it is like duplicity, boxbackup and brackup.

Pros

  • it does encrypt before sending
  • rsync-like optimization for storage/bandwidth.

cons

Bup

Can't delete old data. Slow (according to this link)

zbackup

Globally deduplicating, c++, not yet in gentoo. The input is supposed to be a tar, not a directory. You can't restore a single file, only the whole archive. Doesn't handle network / remote storage. As a result I guess it validates the point "encrypt on client".

Website says it's similar to attic, which is where borgbackup comes from, similar to obnam.

Syncany

Rather a sync tool, but can be used for backups. It doesn't handle/store unix attributes (uid, gid, hard/soft links). It uses gradle, java, readthedocs, yaml.

Databases

Holland

Done in python, available in gentoo, has different backends/plugins, among them xtrabackup (next entry), mysqldump and lvm. They intend to do more than databases one day.

xtrabackup

It's only half-open-source though. Seems to be the preferred way of doing hot-backup with mysql/innodb. You really should not do filesystem based backup of innodb files.

Partition/filesystem

Partimage

Partimage is a 'disk cloning' utility, not only geared toward backups. It needs to know the filesystem structure, though.

Mondo Rescue

Backs up whole partitions, but needs to be aware of the filesystem.

CloneZilla

"bare metal": works at the block level of the hard disk, kinda similar to online/continue ghosts.


Online Solutions

We keep the main criteria such as encrypted transfer and storage, but require that the encryption is done by the client, with keys generated on the client, so that the online service has no way of knowing what's stored there. I don't want 'online browsing' of my data or anything that requires the provider to be able to access what I'm storing.


Dropbox

http://www.dropbox.com/

Often cited among online backup solutions, but it's rather an online storage solution. It doesn't encrypt data on their server, so you have to trust them.

Owncloud

https://owncloud.org/

It's mostly a dropbox/googledrive clone. Not designed for backup, but it can synchronize a directory with the server, including keeping old copies of files ('history'). See also Document Sharing#Owncloud


Written in php, free software you can install on your own server, although the company behind it is becoming more and more closed. They can somehow do encryption but only on external storage, and the whole thing is cumbersome: very hard to configure, highly inefficient (files are sent to nginx, then again to php/owncloud, encrypted, then again to the external device, all of this eating ram/harddisk/bandwidth).

Pro:

  • network-based
  • secure transport (can do https)
  • supports history of files, but you have no guarantee of how long / how many versions are kept.

cons:

  • no encryption at source
  • unreliable, lots of reports of the sync client removing files at random.
  • no point-in-time-recovery

unknown:

  • include/exclude ?
  • sync client for server (headless)

Crashplan

http://www.crashplan.com/

Works with java, so probably not meeting criterion 6

http://www.linuxlinks.com/article/20101023105127302/CrashPlan.html

Has an interesting feature: you can use it between your own computers, without using their storage space.

Spideroak

https://spideroak.com

"zero knowledge policy": they explicitly say that "not even SpiderOak employees can access the data". However, the demo videos show that file/directory names are not encrypted, probably only the content.

The gui provides an include/exclude tree. Probably not usable on a server because of the gui.

Wuala

http://www.wuala.com/

Works with java, so probably not meeting criterion 6

They explicitly say that "Even the employees at Wuala or LaCie do not have access to your private data".


Ubuntu One

https://one.ubuntu.com/

The data is not encrypted on the storage, so it does not meet criterion 2.

Local personal backup

These solutions are mainly for a single user, and are often highly influenced by Apple's Time Machine. Our criteria are:

  1. possible to store backup remotely
  2. encrypted stream
  3. nice GUI
  4. free software

Duplicati (wikipedia)

It's written in C# and is supposed to bring the "Unix-only" duplicity to all platforms. It also brings a GUI, and I guess that it's not usable from the command line as a result. The restore interface is said to be great. It's "similar but not compatible" with duplicity. Everything is done from the client; the server can be any kind of storage (ftp/ssh/S3...).

Pro

  • encryption/signing as duplicity
  • very nice demo on youtube

Cons:

areca

It supports encryption and a lot of other interesting stuff. It is written in java and has a GUI. As of November 2010 it's still not in gentoo (though there's an issue about it), nor in any other distribution but ubuntu.

Pros:

  • nice file browser with directory sizes
  • nice simulation mode
  • nice 'filter' thing, you can exclude files, directories, regexp..
  • support storing backups on a remote computer through ftp/ftps

Cons:

  • labelled "personal"
  • the gentoo bug ticket mentions it's very difficult to install/compile.
  • java

kbackup

No encryption, full gui (can't be used with cron on servers). Using version 0.7, I could not get it to make an incremental backup, though it's supposed to be possible.

The GUI is great for adding/removing parts of the tree and creating an include/exclude list such as:

 M /home/save/tests
 P 
 S 0
 R 0
 F 1
 C 0
 Z 0
 I /home/orzel/.kde4
 I /home/orzel/bin
 I /home/orzel/fac/These
 I /home/orzel/share
 E /home/orzel/.kde4/share/apps
 E /home/orzel/.kde4/share/config/session
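
To illustrate how such a list could be evaluated, here is a small sketch that reads the I and E lines and decides whether a given path would be backed up. It assumes that I means "include this tree" and E means "exclude this tree"; the meaning of the other keys (M, P, S, ...) is not covered here, so they are ignored, and the function names are made up for the example.

```python
from pathlib import PurePosixPath


def parse_profile(text):
    """Collect the I (include) and E (exclude) entries of a profile."""
    includes, excludes = [], []
    for line in text.splitlines():
        key, _, value = line.strip().partition(" ")
        if key == "I":
            includes.append(PurePosixPath(value))
        elif key == "E":
            excludes.append(PurePosixPath(value))
        # Other keys (M, P, S, ...) are ignored in this sketch.
    return includes, excludes


def is_selected(path, includes, excludes):
    """True if `path` is under an included tree and under no excluded one."""
    path = PurePosixPath(path)

    def under(root):
        return path == root or root in path.parents

    return any(map(under, includes)) and not any(map(under, excludes))


profile = """\
I /home/orzel/.kde4
I /home/orzel/bin
E /home/orzel/.kde4/share/apps
"""
includes, excludes = parse_profile(profile)
```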

Backup manager

No encryption. CLI, cleans up old backups after some time, can do incremental backups.

Handles ssh/ftp/rsync for upload, and handles svn/mysql in a dedicated way.

http://www.backup-manager.org (It seems they lost the domain name in 2012; it's something else in Japanese now.) Still available (in 2015) in gentoo though (app-backup/backup-manager).

Once cited in linuxfr

Cons:

  • No encryption :-(
  • Written in bash and perl

TimeVault

Typical 'local personal backup' solution based on hard links stuff. Present in gentoo (app-backup/timevault).

Flyback

Yet another 'local personal backup' solution based on rsync-style stuff. Not present in gentoo.

DirSync Pro

Doesn't do network. Not present in gentoo.

luckyBackup

Can do network, but does not encrypt data on the remote location. Present in gentoo (app-backup/luckybackup). Provides a useful "dry-run" feature to check what will be done.

pybackpack

Frontend for rdiff-backup based on python/gtk. Said to provide a "nice" GUI to exclude files. Untested. I don't know about network or encryption. Present in gentoo (app-backup/pybackpack).


fwbackups

Based on python; I don't know about the backend. The next version will be done using c++/Qt4/cmake. Can store remotely, and it uses pycrypto so there must be some crypto somewhere. Not present in gentoo.

Déjà Dup

Frontend for duplicity. Not present in gentoo. One of the few 'personal backup' tools providing encryption.

Back in Time

app-backup/backintime in gentoo.

It's a frontend for rsync, written in python, with a kde or gnome interface. Uses rsync + hardlinks for "snapshots". Has an exclude system, which I guess is the one from rsync.

Old/obsolete/unmaintained ones

FlexBackup

Presents itself as something between 'tar' and 'amanda'. Can do incremental backups.

As of February 2016, the last release (1.2.1) is still from Oct 10, 2003.

Keep(KDE)

KDE graphical interface based on rdiff-backup. Only handles local directories. It is small and only for 'local personal backup'.

Officially unmaintained.

hdup, hdup2, hdup16 and others

cron based, support for .nobackup

According to boxbackup, does not handle rsync-kind optimizations.

Officially unmaintained; the author switched to developing and using rdup.

Distributed Internet Backup System (DIBS)

Python tool. Uses a fun algorithm to spread data among peers so that only a fraction of it is needed for recovery. Uses python for transport, not ssh.
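
To give an idea of how data can be spread so that not every piece is needed for recovery, here is a deliberately simple stand-in: the data is split into k blocks plus one XOR parity block, and any k of the k+1 pieces are enough to rebuild it. This is plain XOR parity, chosen for brevity; it is not claimed to be the algorithm DIBS uses, and all names are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode(data, k):
    """Split `data` into k padded blocks and append one parity block."""
    size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(size * k, b"\0")
    blocks = [padded[i * size:(i + 1) * size] for i in range(k)]
    return blocks + [xor_blocks(blocks)]


def decode(shares, length):
    """Rebuild the data from k+1 shares, at most one of which is None."""
    missing = [i for i, share in enumerate(shares) if share is None]
    if len(missing) > 1:
        raise ValueError("XOR parity can only recover one lost share")
    if missing:
        present = [s for s in shares if s is not None]
        shares = list(shares)
        # The XOR of all k+1 shares is zero, so the lost one is the XOR
        # of the remaining ones.
        shares[missing[0]] = xor_blocks(present)
    return b"".join(shares[:-1])[:length]


message = b"backup me"
shares = encode(message, 3)   # 4 shares, one per peer
shares[1] = None              # one peer is unreachable
recovered = decode(shares, len(message))
```

Tolerating more than one lost peer requires a proper erasure code, which is beyond this sketch.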

Pros

  • specifically aimed at network, not tapes
  • data is encrypted on the local computer (gpg) before sending/storing.
  • written in python

Cons:

  • last release in May 2006
  • uses CVS
  • no rsync-like optimization


Boxbackup

http://boxbackup.org (wikipedia)

Dead, though not officially. They haven't released or fixed bugs for years; even gentoo has removed it.

Was once used by orzel on several systems.

It can use some kind of userspace raid thingy to do the same as raid but at a higher level, on top of the filesystem.

There was a gui, using wxWidgets, not in gentoo, which seems abandoned. There's also a web gui written in python.

good:

  • Designed for network (not tapes) and with encryption from the start.
  • documentation seems ok
  • minimalist configuration and changes on every computer.
  • nice exclude stuff based on regexp
  • rsync-like optimization, but keeps history.
  • for big enough files, it tracks them and notices when they are moved, so that 'mv big big2' will not trigger the whole file to be sent again.
  • continuous backup

bad:

  • no irc
  • the project is not very alive. As of 2015, the last release was >2.5 years ago, and even then, it was only a dot one (still no 1.0)
  • it is either difficult to install or badly supported by gentoo
  • retrieving files, especially old or deleted ones is difficult, done through a very rough text interface.
  • no 'point-in-time' recovery. They are aware of this feature missing and more or less working on it
  • hard to know what is included/excluded, or where the gigabytes used come from

http://boxbackup.org/trac/wiki/BoxComparison


Obnam

Officially stopped in August 2017.

Command line, written in python. Sends data over ssh/paramiko, in either direction (push/pull). Uses deduplication between files and between snapshots, which satisfies my criterion "can handle moves/renames".
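
Why deduplication takes care of moves and renames can be shown with a minimal content-addressed store: chunks are keyed by their hash, so a file that reappears under a new name adds nothing new. This is a sketch with fixed-size chunks and SHA-256, both assumptions chosen for simplicity; it does not reflect Obnam's actual repository format.

```python
import hashlib


class ChunkStore:
    """Content-addressed store: each distinct chunk is kept only once."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = {}  # SHA-256 hex digest -> chunk bytes

    def add_file(self, data):
        """Store `data`; return its chunk digests and the bytes newly stored."""
        digests, new_bytes = [], 0
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.chunks:
                self.chunks[digest] = chunk
                new_bytes += len(chunk)
            digests.append(digest)
        return digests, new_bytes


store = ChunkStore()
# 16 KiB of sample data in which every 4 KiB chunk is different.
content = b"".join(i.to_bytes(4, "big") for i in range(4096))
# First generation: the file is stored under its original name.
first_digests, first_upload = store.add_file(content)
gen1 = {"big": first_digests}
# Second generation after 'mv big big2': same content, new name.
second_digests, second_upload = store.add_file(content)
gen2 = {"big2": second_digests}
```

Only the mapping from file name to chunk list changes between the two generations; no chunk has to be sent again.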

Used by orzel on several servers. It worked well. obnam mount is very very slow, but damn useful.


Pros:

  • you can (fuse-) mount each generation for each backup. Easy to see, check, restore.
  • handles interruptions: the next try reuses the data already transferred
  • there's a command to verify/check existing backups
  • encrypted using gnupg (on the origin)
  • good documentation (man page)
  • can output a file readable by kdirstat, which validates my point "easy to know where data comes from". I haven't managed to read this file with k4dirstat though.

cons:

  • You can use regexp for the "exclude" setting, but not for the "root" setting (list of things to backup)
  • "root" can only be directories, there's no (easy?) way to explicitly back up some given files
  • checks against "root" are not done at start, so in case of error, it will fail long after start
  • it's slow
  • encryption is an after-thought and relies on 3rd party(pgp), not really integrated

Non-Free

Arkeia : some friends provided the feedback "doesn't work, we even got a refund"