Jan 18 2014
 

This post has it all:

  • backups
  • deduplication
  • snapshots
  • ZFS
  • Bacula
  • ezjail

Backups are essential for proper sanity, or at least, a reasonable facsimile. I strongly believe that doing backups right is the only way to back up. Go big or go home. I’ve been converting all my servers to ZFS. I like ZFS for many reasons, and I’m going to list two:

  1. data integrity
  2. snapshots

In this case, instead of backing up the entire jail, I will be backing up only that part of the jail which is unique to that jail. In effect, I’ll be doing deduplication on the fly.

Creating the snapshots

I have taken to using sysutils/zfsnap when creating snapshots. After installing it, I added these entries to /etc/crontab:

$ tail /etc/crontab


# for snapshots

# Hourly recursive snapshots of an entire pool kept for 5 days, taken at 29 past the hour
# Minute   Hour   Day of month   Month   Day of week   Who    Command
29          *      *              *       *            root   /usr/local/sbin/zfSnap -a 5d -r system

# delete expired snapshots
0          1      *              *       *             root   /usr/local/sbin/zfSnap -d

The first entry, which runs every hour at 29 minutes past the hour, creates a snapshot for every filesystem within the pool named system. I chose 29 because I added this entry to the file at about 28 past the hour… I wanted the cron job to run soon. Pick your own time.

The second entry, which runs daily at 01:00, deletes snapshots which have expired. In this case, that’s any snapshot older than 5 days.

For the purposes of this post, I could have been more specific with the snapshot recursion. In my script below, I am only interested in jails. Therefore, I could have specified system/usr/local/jails instead of system and thereby created snapshots only for jails.
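As a sketch of that narrower setup, the hourly entry would change only in its last argument. The dataset name system/usr/local/jails comes from the snapshot listing later in this post; adjust it to match your own pool layout.

```
# Hourly recursive snapshots of the jails only, kept for 5 days, taken at 29 past the hour
# Minute   Hour   Day of month   Month   Day of week   Who    Command
29          *      *              *       *            root   /usr/local/sbin/zfSnap -a 5d -r system/usr/local/jails
```

The delete entry needs no change, because it does not name a dataset.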

Selecting the jail snapshots for backup

Given the above, I created this little script, which identifies the latest snapshot¹ to be backed up. This can be used as the basis for a Bacula File Set (I will write about that in a later post).

$ cat jails.sh
#!/bin/sh

JLS=/usr/sbin/jls

JAILS=`${JLS} -h path`

SNAPSHOTPREFIX='/.zfs/snapshot/'
SNAPSHOTSUFFIX='--5d'

# get the time one hour ago, because we know we've already created that snapshot
# the one from this hour may not be present yet, because it's not yet 29 past the hour.
SNAPSHOTDIR=`date -v-1H "+%Y-%m-%d_%H.29.00"`

for jail in ${JAILS}
do
  # skip the value 'path', usually the first, because that's the header line printed by jls -h.
  if [ "${jail}" = 'path' ]
  then
     continue
  fi
  echo "${jail}${SNAPSHOTPREFIX}${SNAPSHOTDIR}${SNAPSHOTSUFFIX}"
done

Running that script, I get this output:

$ ./jails.sh
/usr/local/jails/webserver/.zfs/snapshot/2014-01-18_20.29.00--5d
/usr/local/jails/svn.unixathome.org/.zfs/snapshot/2014-01-18_20.29.00--5d
/usr/local/jails/mydev.unixathome.org/.zfs/snapshot/2014-01-18_20.29.00--5d
/usr/local/jails/minion.unixathome.org/.zfs/snapshot/2014-01-18_20.29.00--5d
/usr/local/jails/mailjailcopy/.zfs/snapshot/2014-01-18_20.29.00--5d
/usr/local/jails/fileserver.unixathome.org/.zfs/snapshot/2014-01-18_20.29.00--5d
/usr/local/jails/bacula.unixathome.org/.zfs/snapshot/2014-01-18_20.29.00--5d
/usr/local/jails/bacula171/.zfs/snapshot/2014-01-18_20.29.00--5d
/usr/local/jails/bacula170/.zfs/snapshot/2014-01-18_20.29.00--5d
/usr/local/jails/bacula169/.zfs/snapshot/2014-01-18_20.29.00--5d
/usr/local/jails/bacula168/.zfs/snapshot/2014-01-18_20.29.00--5d
/usr/local/jails/bacula167/.zfs/snapshot/2014-01-18_20.29.00--5d
/usr/local/jails/bacula166/.zfs/snapshot/2014-01-18_20.29.00--5d
/usr/local/jails/bacula165/.zfs/snapshot/2014-01-18_20.29.00--5d
/usr/local/jails/bacula164/.zfs/snapshot/2014-01-18_20.29.00--5d
/usr/local/jails/bacula163/.zfs/snapshot/2014-01-18_20.29.00--5d
/usr/local/jails/bacula162/.zfs/snapshot/2014-01-18_20.29.00--5d
/usr/local/jails/bacula161/.zfs/snapshot/2014-01-18_20.29.00--5d
/usr/local/jails/bacula160/.zfs/snapshot/2014-01-18_20.29.00--5d
/usr/local/jails/bacula159/.zfs/snapshot/2014-01-18_20.29.00--5d
/usr/local/jails/bacula158/.zfs/snapshot/2014-01-18_20.29.00--5d
/usr/local/jails/bacula157/.zfs/snapshot/2014-01-18_20.29.00--5d
/usr/local/jails/bacula156/.zfs/snapshot/2014-01-18_20.29.00--5d
/usr/local/jails/bacula155/.zfs/snapshot/2014-01-18_20.29.00--5d
/usr/local/jails/bacula154/.zfs/snapshot/2014-01-18_20.29.00--5d
/usr/local/jails/bacula153/.zfs/snapshot/2014-01-18_20.29.00--5d
/usr/local/jails/bacula152/.zfs/snapshot/2014-01-18_20.29.00--5d
/usr/local/jails/bacula151/.zfs/snapshot/2014-01-18_20.29.00--5d
/usr/local/jails/bacula150/.zfs/snapshot/2014-01-18_20.29.00--5d
/usr/local/jails/jester.unixathome.org/.zfs/snapshot/2014-01-18_20.29.00--5d
/usr/local/jails/testing/.zfs/snapshot/2014-01-18_20.29.00--5d
/usr/local/jails/fedex/.zfs/snapshot/2014-01-18_20.29.00--5d
/usr/local/jails/serpico/.zfs/snapshot/2014-01-18_20.29.00--5d

OK, let’s compare that output with this simple check:

$ zfs list -t snapshot | grep serpico
system/usr/local/jails/serpico@2014-01-18_20.29.00--5d                                         527K      -   520M  -
system/usr/local/jails/serpico@2014-01-18_21.21.38--2h                                         384K      -   520M  -
system/usr/local/jails/serpico@2014-01-18_21.29.00--5d                                         272K      -   520M  -

In this case, I’m looking at all snapshots involving serpico, a jail on this system.

Yes, that snapshot is listed, the one created at 20:29:00.
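
The same check can be scripted for every path the script prints, not just one jail. This is a minimal sketch, not part of zfsnap or Bacula; missing_snapshots is a hypothetical helper name, and it assumes the snapshot paths arrive on standard input, one per line.

```shell
#!/bin/sh
# Hypothetical helper: read snapshot paths on stdin, one per line,
# and print each path that does not exist as a directory.
missing_snapshots() {
  while IFS= read -r snapdir
  do
    if [ ! -d "${snapdir}" ]
    then
      echo "${snapdir}"
    fi
  done
}
```

With the function pasted into a script or sourced into the shell, ./jails.sh | missing_snapshots should print nothing when every expected snapshot is present, and otherwise list the paths that are missing.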

¹ You will also see another, more recent snapshot in that listing. We could have used that one, but that would require more scripting. I mentioned above that the script would list the latest snapshot. That’s not quite true. If we run the script before 29 past the hour, it is true. If we run it at 29 past the hour or later, it’s not true, because the snapshot for the current hour already exists and the script still selects the one from the previous hour.
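
One possible way to do that extra scripting is to ask the filesystem instead of the clock. The sketch below is an assumption about how this could be done, not what jails.sh does: latest_snapshot is a hypothetical helper that lists a jail’s .zfs/snapshot directory and keeps the newest name ending in the given suffix. It relies on the snapshot names beginning with a date and time (YYYY-MM-DD_HH.MM.SS, as in the listing above), so that sorting them alphabetically also sorts them chronologically.

```shell
#!/bin/sh
# Hypothetical helper: print the name of the newest snapshot under
# <jailpath>/.zfs/snapshot whose name ends with <suffix> (e.g. --5d).
# Prints nothing if the directory or a matching snapshot is absent.
latest_snapshot() {
  jailpath=$1
  suffix=$2
  # Names start with YYYY-MM-DD_HH.MM.SS, so a plain sort is chronological.
  ls -1 "${jailpath}/.zfs/snapshot" 2>/dev/null \
    | grep -- "${suffix}\$" \
    | sort \
    | tail -n 1
}
```

For example, latest_snapshot /usr/local/jails/serpico --5d would print the name of the most recent 5-day snapshot for that jail, regardless of whether it is run before or after 29 past the hour.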

What about Bacula

I wanted to get this shell script out for review before I start using it with Bacula. I am confident it will work, and will write about how to use it with Bacula later. If you’re in a hurry, read about the Bacula FileSet Resource. Look for the passage that begins “file-list is a list of directory”.
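
For the impatient, here is an untested sketch of what such a FileSet might look like. It relies on the documented Bacula feature that a File entry beginning with a vertical bar is treated as a program whose output becomes the file list, and that a backslash in front of the bar runs the program on the client instead of the Director. The FileSet name and the script location /usr/local/sbin/jails.sh are assumptions made for illustration.

```
FileSet {
  # hypothetical name
  Name = "jail-snapshots"
  Include {
    Options {
      onefs = yes
    }
    # run on the client; the script path is an assumption
    File = "\\|/usr/local/sbin/jails.sh"
  }
}
```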

Where’s the deduplication

ezjail makes use of read-only mount points for the basic jail infrastructure. If you use the Bacula directive onefs=yes (this directive defaults to yes), Bacula will not cross into those separately mounted filesystems, so only those parts of the jail which are unique to that jail will be backed up. More precisely, the common parts of each jail will not be backed up (e.g. /bin).

Granted, this is not true deduplication, but it is a fantastic start.

What’s next

I will let zfsnap run for a while and monitor it. Then I’ll start incorporating these snapshots into Bacula.
