Backup via rsync

James Young · October 20, 2011

Since my Microserver keeled over and died and I didn’t have a good backup in place, I was in search of a quick and easy way to back up the USB key on a regular basis so I could recover easily.

The first thing I did was to run a daily cron job to tar up the whole USB key and shove it onto rust somewhere.  This went OK, but it wasn’t very elegant.

So, enter the script below.  I got this from a work colleague of mine, and adapted it slightly to suit my purposes…

#!/bin/sh
#
# Script adapted from backup script written by David Monro
#
SCRIPTDIR=/data/backups
BACKUPDIR=/data/backups
SOURCE="/"
COMPLETE=yes
clonemode=no
while getopts "d:c" opt
do
        case $opt in
                d) date=$OPTARG
                ;;
                c) clonemode=yes
                ;;
        esac
done

echo $date
echo $clonemode
if [ "$clonemode" = "yes" ]
then
        SOURCE="$BACKUPDIR/current/"
        COMPLETE=no
fi

mkdir -p $BACKUPDIR/incomplete \
  && cd $BACKUPDIR \
  && rsync -av --numeric-ids --delete \
    --exclude-from=$SCRIPTDIR/excludelist \
    --link-dest=$BACKUPDIR/current/ \
    $SOURCE $BACKUPDIR/incomplete/ \
    || COMPLETE=no
if [ "$COMPLETE" = "yes" ]
then
    # Keep a name passed with -d, otherwise fall back to a timestamp
    date=${date:-$(date "+%Y%m%d.%H%M%S")}
    echo "completing - moving current link"
    mv $BACKUPDIR/incomplete $BACKUPDIR/$date \
      && rm -f $BACKUPDIR/current \
      && ln -s $date $BACKUPDIR/current
else
    echo "not renaming or linking to \"current\""
fi

What this will do is generate a backup named for the current date and time in the specified location.  Each run looks at the most recent backup (the "current" symlink) and builds the new one with every unchanged file hardlinked to the previous backup, saving a LOT of disk space.  Each backup looks like a complete copy, but only takes up extra space for the files that changed since the last one.
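
The script reads its exclusions from $SCRIPTDIR/excludelist, which isn't shown above. Because SOURCE is / and the backups live under /data/backups, that file needs at the very least to exclude the backup directory itself, or each run would try to copy the earlier backups into the new one. Here is a minimal sketch of creating such a file. Everything apart from the /data/backups/ entry is an assumption about a typical Linux system rather than the list actually used here, and the scratch file location is only for illustration.

```shell
#!/bin/sh
# Sketch of an rsync exclude file for the backup script.
# Written to a scratch file here; copy it to $SCRIPTDIR/excludelist to use it.
EXCLUDE_FILE=$(mktemp)

cat > "$EXCLUDE_FILE" <<'EOF'
# Needed: never back up the backups into themselves
/data/backups/
# Assumed: virtual and volatile filesystems on a typical Linux box
/proc/*
/sys/*
/dev/*
/tmp/*
/run/*
# Assumed path (varies by distribution): the database that updatedb rewrites
/var/lib/mlocate/mlocate.db
EOF

cat "$EXCLUDE_FILE"
```

Patterns with a leading slash are anchored at the top of the transfer, which is / for a normal run. Lines starting with # are ignored by rsync.
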
After running this for a while (about a month), I've now got a 4.6GB backups folder, with a 1.7GB base backup, so a month's worth of daily backups has only taken up a bit under three times the space of a single backup.  Note however that there have been some inefficiencies around updatedb that have used up more disk space than should otherwise be the case.
In order to check the size of each backup, just do a "du -sch *" in the folder your backups are in.
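
One thing to keep in mind when reading that output: du counts a hardlinked file only once per run, so the first backup listed shows close to its full size and each later one shows roughly what changed. Measure a single backup on its own and you get its full size again. Here is a small sketch of that effect in a scratch directory. The directory names and the 1 MB test file are made up for illustration, and ln stands in for what rsync's --link-dest does with unchanged files.

```shell
#!/bin/sh
# Demonstrate how du reports two "backups" that share a hardlinked file.
tmp=$(mktemp -d)
mkdir "$tmp/20111019.000000" "$tmp/20111020.000000"

# One 1 MB file in the first backup, hardlinked into the second
dd if=/dev/urandom of="$tmp/20111019.000000/bigfile" bs=1024 count=1024 2>/dev/null
ln "$tmp/20111019.000000/bigfile" "$tmp/20111020.000000/bigfile"
sync

# Both together: the second directory is only charged for what is new
(cd "$tmp" && du -sck *)
together=$(cd "$tmp" && du -sk * | awk 'NR==2 {print $1}')

# On its own, the second directory is charged for the whole file
alone=$(du -sk "$tmp/20111020.000000" | awk '{print $1}')

echo "second backup: ${together}K next to the first, ${alone}K on its own"
rm -rf "$tmp"
```
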
A very important safety tip.  Since unchanged files are shared between backups through hardlinks, do not under any circumstances try to edit files in the backups.  Let's say you haven't changed /etc/passwd in a long time.  While it looks like you have 30 copies of it, you actually only have one (i.e., a single file with 30 hardlinks pointing at it).
If you go and edit /etc/passwd inside one backup, you will effectively change it in every backup at once.  So don't do that.  It's safe to just flat-out delete a backup, you won't trash anything.  Just don't edit things.
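
To see this for yourself without risking real backups, here is a small sketch that fakes two backups in a scratch directory. The directory names and file contents are invented for the example, and ln again stands in for rsync's --link-dest. The "edit" uses >> because that writes to the existing file in place.

```shell
#!/bin/sh
# Show why editing a file inside one backup changes it in the others.
tmp=$(mktemp -d)
mkdir "$tmp/backup1" "$tmp/backup2"

echo "root:x:0:0:root:/root:/bin/sh" > "$tmp/backup1/passwd"
ln "$tmp/backup1/passwd" "$tmp/backup2/passwd"   # unchanged file, so hardlinked

# "Edit" the copy in backup2 in place...
echo "oops" >> "$tmp/backup2/passwd"
# ...and the change shows up in backup1 as well
after_edit=$(tail -n 1 "$tmp/backup1/passwd")
echo "last line in backup1 after editing backup2: $after_edit"

# Deleting a whole backup is fine: the data survives while any link remains
rm -rf "$tmp/backup2"
after_delete=$(wc -l < "$tmp/backup1/passwd" | tr -d ' ')
echo "lines left in backup1 after deleting backup2: $after_delete"

# To change a backed-up file safely, copy it out of the backup first
cp "$tmp/backup1/passwd" "$tmp/passwd.restored"
echo "edited copy" >> "$tmp/passwd.restored"
copy_safe=$(wc -l < "$tmp/backup1/passwd" | tr -d ' ')
echo "lines in backup1 after editing the copy: $copy_safe"

rm -rf "$tmp"
```
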
