Thursday, October 16, 2008

DAR backup

Well, I started looking at the different file-based alternatives, but relatively quickly put many of them on hold. They each have their benefits, and I'll try to get back to them soon - but here, in bugwhine style, I'll just explain why I put them on hold.

unison
Interesting file synchronizing tool.
Unfortunately on hold: it's written in Objective Caml, so building it for Linpus may require a bunch of work, and I don't have time to follow this path right now.

rdiff-backup
In brief - file-based backup, directory -> directory. Hard links, special files, permissions etc. are preserved.
Incremental diffs are preserved in a special directory.
rpm available at livna (rdiff-backup.i386)

Why on hold -
- requires own server on NAS
- backup to/from cifs/samba mount not recommended
- file based

rsnapshot
A filesystem snapshot tool, using rsync.
rpm available at livna - rsnapshot.noarch
Why on hold -
- snapshots may become 'messy' on the backup filesystem
- written in perl (*)

duplicity
Uploads encrypted tar volumes to remote system. Uses librsync.
rpm available at livna - duplicity.i386

Why on hold -
- doesn't handle hard links
- written in python (*)

DAR - Disk Archive
In the end, DAR was the first utility I made a successful full and incremental backup with (i.e. without putting it on hold). It's available at livna -

[root@localhost ~]# yum install dar.i386

My first impression from the documentation and the command line switches was that it looked a bit primitive, and that maybe I should not have given up so easily on the other tools.

I tried a complete backup first as follows -

[root@localhost ~]# dar -v -z -R / -P dev -P proc -P mnt -P sys -P tmp -c /mnt/nas/aspire/full

and remained skeptical. It took quite some time, but the resulting file was a compact 915MB.
To put it further to the test, I tried an incremental backup -

[root@localhost ~]# dar -v -z -R / -P dev -P proc -P mnt -P sys -P tmp -A /mnt/nas/aspire/full -c /mnt/nas/aspire/diff1
...
--------------------------------------------
59 inode(s) saved
with 2324 hard link(s) recorded
0 inode(s) changed at the moment of the backup
80778 inode(s) not saved (no inode/file change)
0 inode(s) failed to save (filesystem error)
5 inode(s) ignored (excluded by filters)
0 inode(s) recorded as deleted from reference backup
--------------------------------------------
Total number of inode considered: 80842
--------------------------------------------

At this point my skepticism had almost completely died. The incremental backup was very fast and efficient. Being 'tar' style (a single file on the destination device), it has no issues with cifs/samba permissions. A clean result, but still with the possibility to quickly restore individual files.

[root@localhost ~]# du -h /mnt/nas/aspire/*
3.0M /mnt/nas/aspire/diff1.1.dar
915M /mnt/nas/aspire/full.1.dar
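
Restoring a single file from one of these archives might look something like the sketch below. It relies on dar's -x (extract), -R (root directory to restore into) and -g (path to include) options; the file etc/fstab and the target directory /tmp/restore are just examples I picked, and the restore_cmd helper is hypothetical. It only prints the command so it can be checked before running it.

```shell
#!/bin/sh
# Hypothetical helper: print a dar command that restores one path.
#   $1 = archive basename (without the .1.dar suffix)
#   $2 = path to restore, relative to the root that was backed up
#   $3 = directory to restore into (assumed to exist already)
restore_cmd() {
    echo "dar -x $1 -R $3 -g $2"
}

# Example values only: pull etc/fstab out of the full backup.
restore_cmd /mnt/nas/aspire/full etc/fstab /tmp/restore
```

A file that changed after the full backup would be in diff1 rather than full, so the archive name needs to match where the wanted version lives.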

The remaining thing bothering me is having to manually specify the paths to exclude from the backup. Perhaps a small thing, and perhaps something I should tune further.

I need to look further into a backup schedule, recovery methods and management of the archives with dar_manager - but this feels like a good start.
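
As a starting point for a schedule, a script run from cron could take a full backup when none exists and otherwise an incremental against the full archive. The sketch below assumes the archive names used above plus a date-stamped name for each incremental (my own naming scheme, not anything dar requires), and the backup_cmd helper is hypothetical. Again it only prints the command.

```shell
#!/bin/sh
# Destination directory; overridable from the environment.
DEST="${DEST:-/mnt/nas/aspire}"
# Same switches as the manual runs above.
OPTS="-v -z -R / -P dev -P proc -P mnt -P sys -P tmp"

# Hypothetical helper: choose between a full and an incremental backup.
#   $1 = destination directory, $2 = date stamp used in the incremental's name
backup_cmd() {
    if [ -e "$1/full.1.dar" ]; then
        # A full archive exists: save only what changed relative to it.
        echo "dar $OPTS -A $1/full -c $1/diff-$2"
    else
        # No reference archive yet: start with a full backup.
        echo "dar $OPTS -c $1/full"
    fi
}

backup_cmd "$DEST" "$(date +%Y%m%d)"
```

Every incremental here references the same full archive, so each one grows over time; deciding when to start a fresh full backup is part of what still needs working out.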

Thumbs up DAR.

(*) Perl and Python fans - I'm not against these languages at all. I just find that scripted solutions tend to be less well integrated than compiled solutions. Output messages, exception handling etc. are often unprofessional in solutions based on scripting languages - and it was fairly clear that this is true also for rsnapshot and duplicity.
