I’ve used Bacula since at least January 2004 (so nearly 20 years). I liked it so much I dropped my deployment-in-motion of another tool (if you search lightly, you can find out which one). I liked it so much, I wrote a PostgreSQL backend for it.
This post is not for Bacula novices; it is for those who have already deployed a Bacula instance and are rather familiar with the process. For a basic introduction, I recommend this older post, which still applies today: Bacula on FreeBSD with ZFS.
This post will combine several of my favorite tools:
- FreeBSD (13.2-RELEASE)
- Bacula (9.6.7 – I am far behind on my upgrades here)
- ZFS (2.1.9)
- ZFS quota
- ZFS reservation (perhaps)
- ZFS snapshots (via syncoid)
- jails
The primary goal of this post is setting Maximum Volume Bytes and Maximum Volumes on the new Bacula Pools and describing the methods used to come up with these values.
Background – moving and downsizing
I am moving an existing Bacula Storage Daemon (bacula-sd). I am also downsizing the disk space available. I’ll be moving some backups via Bacula copy jobs, but that will not be covered by this post.
In configuring the new bacula-sd, I will be creating new Pools. To conserve disk space, I configure Bacula to recycle Volumes. Bacula does everything it can to avoid overwriting existing backups, so I impose a restriction upon it by limiting the maximum number of Volumes in a pool. How do I determine those numbers? Estimation and historical data. In this post, I’ll provide some sample queries I used in setting these values.
In this particular case, I’m moving an existing bacula-sd from one host (knew) to another (r730-03). I run my bacula instances in jails. This simplifies many administration tasks, including moving to another host.
The storage
I’ve created three ZFS filesystems to receive the backups:
[16:50 r730-03 dvl ~] % zfs list -r data01/bacula-volumes
NAME                             USED  AVAIL  REFER  MOUNTPOINT
data01/bacula-volumes            440K  8.84T    96K  /jails/bacula-sd-04/usr/local/bacula/volumes
data01/bacula-volumes/DiffFile   152K  8.84T    96K  /jails/bacula-sd-04/usr/local/bacula/volumes/DiffFile
data01/bacula-volumes/FullFile    96K  8.84T    96K  /jails/bacula-sd-04/usr/local/bacula/volumes/FullFile
data01/bacula-volumes/IncrFile    96K  8.84T    96K  /jails/bacula-sd-04/usr/local/bacula/volumes/IncrFile
I have set these non-default values for these datasets (commands sketched just after this list):
- recordsize 1M
- compression lz4
- atime off
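For reference, here is roughly how those properties can be applied; this is a minimal sketch assuming the dataset layout shown above. Setting them on the parent dataset lets the three per-pool children inherit them.

# set on the parent so DiffFile/FullFile/IncrFile inherit the values
sudo zfs set recordsize=1M data01/bacula-volumes
sudo zfs set compression=lz4 data01/bacula-volumes
sudo zfs set atime=off data01/bacula-volumes

The larger recordsize suits big sequential Bacula Volume files, lz4 is cheap even when the data does not compress well, and atime=off avoids a metadata write every time a Volume is read.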
I plan to set these so that the above-mentioned filesystems do not fill up (see the ZFS quotas section near the end of this post):
- quota
- reservation
Pools
These are the Pools I created for this new bacula-sd:
Pool {
  Name = FullFile-04
  Pool Type = Backup
  Recycle = yes
  AutoPrune = yes
  Volume Retention = 1 years
  Storage = bacula-sd-04-FullFile
  Next Pool = Fulls
  Maximum Volume Bytes = 20G
  Maximum Volume Jobs = 1
  Maximum Volumes = 2000
  LabelFormat = "FullAuto-04-"
}

Pool {
  Name = DiffFile-04
  Pool Type = Backup
  Recycle = yes
  AutoPrune = yes
  Volume Retention = 6 weeks
  Storage = bacula-sd-04-DiffFile
  Next Pool = Differentials
  Maximum Volume Bytes = 20G
  Maximum Volume Jobs = 1
  Maximum Volumes = 240
  LabelFormat = "DiffAuto-04-"
}

Pool {
  Name = IncrFile-04
  Pool Type = Backup
  Recycle = yes
  AutoPrune = yes
  Volume Retention = 3 weeks
  Storage = bacula-sd-04-IncrFile
  Next Pool = Incrementals
  Maximum Volume Bytes = 20G
  Maximum Volume Jobs = 1
  Maximum Volumes = 450
  LabelFormat = "IncrAuto-04-"
}
The Name and LabelFormat include a -04 suffix to make them unique across all Volumes. This choice is personal preference and not logically linked to anything. The 04 corresponds to bacula-sd-04.
How did I get to those numbers?
I started with my data, in the Catalog, and looked at my backups for the past month. These queries are listed here mainly for my future use and I hope they are also useful to you.
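I run these queries directly against the Catalog with psql; bconsole's sqlquery command will also take them. The host, user, and database names below are assumptions about a typical PostgreSQL catalog install, so adjust them to match yours.

# connect to the Director's PostgreSQL Catalog (names here are assumptions)
psql -h localhost -U bacula bacula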
In all cases, you’ll need to modify the queries if you want to use a different date range. I run full backups on the first Sunday of the month, diffs on all other Sundays, and incrementals on the remaining days. I limit Volume usage to a single Job.
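For illustration, that rotation corresponds to a Schedule resource along these lines. This is a sketch based on the stock Bacula example, not my actual configuration; the name and run times are placeholders.

Schedule {
  Name = "WeeklyCycle"
  # first Sunday: Full; other Sundays: Differential; every other day: Incremental
  Run = Level=Full 1st sun at 23:05
  Run = Level=Differential 2nd-5th sun at 23:05
  Run = Level=Incremental mon-sat at 23:05
}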
Full jobs, run at the start of the month
Note that I am excluding Copy jobs in this query. They go to another host.
select * from job where starttime between '2023-08-06 00:00:00' and '2023-08-06 23:59:59' and level = 'F' and type != 'C' order by name;
How much disk space did the full backups use?
select sum(jobbytes) from job where starttime between '2023-08-06 00:00:00' and '2023-08-06 23:59:59'
and level = 'F' and type != 'C';
What jobmedia entries are involved?
These queries were used to create queries which follow later in this post.
select jobmedia.* from job, jobmedia where starttime between '2023-08-06 00:00:00' and '2023-08-06 23:59:59' and level = 'F' and type != 'C' and job.jobid = jobmedia.jobid;
That’s a list of ALL the jobmedia records. The following is just a list of the Volumes used.
select distinct jobmedia.mediaid from job, jobmedia where starttime between '2023-08-06 00:00:00' and '2023-08-06 23:59:59' and level = 'F' and type != 'C' and job.jobid = jobmedia.jobid;
Getting the space occupied on disk
The list of Volumes for the Full backups on the first Sunday of the month.
select media.* from job, jobmedia, media where starttime between '2023-08-06 00:00:00' and '2023-08-06 23:59:59' and level = 'F' and type != 'C' and job.jobid = jobmedia.jobid and jobmedia.mediaid = media.mediaid;
Using that to get the space used:
select sum(media.volbytes) from media where mediaid in ( select distinct jobmedia.mediaid from job, jobmedia where starttime between '2023-08-06 00:00:00' and '2023-08-06 23:59:59' and level = 'F' and type != 'C' and job.jobid = jobmedia.jobid);
About 460 GB – that’s all my full backups for a month.
What about incremental jobs?
The time periods vary with the type of Job, which relates to the Retention period involved.
Disk space used:
select sum(jobbytes) from job where starttime between '2023-07-01 00:00:00' and '2023-07-21 00:00:00' and level = 'I' and type != 'C';
About 2532814783461 bytes or 2.5 TB – I get a lot of data churn for incrementals.
Number of Jobs:
select count(*) from job where starttime between '2023-07-01 00:00:00' and '2023-07-21 00:00:00' and level = 'I' and type != 'C';
About 303
How many Volumes?
select count(distinct jobmedia.mediaid) from job, jobmedia where starttime between '2023-07-01 00:00:00' and '2023-07-21 00:00:00' and level = 'I' and type != 'C' and job.jobid = jobmedia.jobid;
About 432.
Differentials
select sum(jobbytes) from job where starttime between '2023-07-01 00:00:00' and '2023-08-14 00:00:00' and level = 'D' and type != 'C';
996623479547 bytes or about 1 TB.
select count(*) from job where starttime between '2023-07-01 00:00:00' and '2023-08-14 00:00:00' and level = 'D' and type != 'C';
110 jobs
select count(distinct jobmedia.mediaid) from job, jobmedia where starttime between '2023-07-01 00:00:00' and '2023-08-14 00:00:00' and level = 'D' and type != 'C' and job.jobid = jobmedia.jobid;
202 Volumes.
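For future use, the per-level figures above can also be pulled with one grouped query. This is a sketch using the same filters as the queries above; pg_size_pretty() (any reasonably recent PostgreSQL) is there only to make the byte counts readable.

-- jobs and total bytes per backup level for an arbitrary date range
select level,
       count(*)                      as jobs,
       pg_size_pretty(sum(jobbytes)) as total_bytes
  from job
 where starttime between '2023-07-01 00:00:00' and '2023-08-01 00:00:00'
   and type != 'C'
 group by level
 order by level;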
Other full backups during the month
There are other full backups I run daily throughout the month: git and subversion repos, pfSense configuration, etc.
In this case, I’m restricting this to the pool used for Full backups.
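The poolid in the next query is 22, which is specific to my Catalog. If you need to find yours, list the pools first:

-- map pool names to their poolid values
select poolid, name from pool order by name;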
select sum(job.jobbytes) from job where starttime between '2023-08-07 00:00:00' and '2023-08-31 23:59:59' and level = 'F' and type != 'C' and poolid=22;
Those 36 jobs use about 140 GB / month, or about 1.7 TB / year.
How many volumes does that use?
select count(distinct jobmedia.mediaid) from job, jobmedia where starttime between '2023-08-07 00:00:00' and '2023-08-31 23:59:59' and level = 'F' and type != 'C' and job.jobid = jobmedia.jobid;
About 70 Volumes, or 70 * 12 = 840 more Volumes in the Full Pool over the year.
The figuring
Based on these values, I came up with these totals:
- 25 full backups a month – 430 GB
- 42 Volumes used
- 1 year of full backups is 430 * 12 = about 6 TB
- 1 year of full backups uses 25 * 42 = 1050 Volumes
These are rough figures. Taking the configuration presented near the start of this post, the summary is:
| Pool Name | Maximum Volume Bytes | Maximum Volume Jobs | Maximum Volumes | Max size used |
|---|---|---|---|---|
| IncrFile-04 | 20G | 1 | 500 | 10TB |
| DiffFile-04 | 20G | 1 | 240 | 4.8TB |
| FullFile-04 | 20G | 1 | 2000 | 40TB |
You will notice that this requires 59TB of space. I have only 8.96TB free. How will this work?
What we have done is calculate the number of Volumes used and set a 20 GB maximum size on each. Full backups will use only about 6 TB, not 40 TB, because not every Volume is filled to maximum capacity. Large jobs will fill some of them; most will not come close.
Similarly for Diff and Incr jobs.
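One way to check that assumption against an existing Catalog is to ask for the average Volume size per pool. This is a sketch; the averages will obviously differ on your system.

-- how full do Volumes actually get, per pool?
select pool.name,
       count(*)                                    as volumes,
       pg_size_pretty(avg(media.volbytes)::bigint) as avg_size,
       pg_size_pretty(max(media.volbytes))         as max_size
  from media
  join pool on pool.poolid = media.poolid
 group by pool.name
 order by pool.name;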
However, this could still fill up the system. Let’s avoid that with ZFS.
ZFS quotas
With this statement, this ZFS file system cannot consume more than 6 TB for Full backups:
[18:58 r730-03 dvl ~] % sudo zfs set quota=6TB data01/bacula-volumes/FullFile
[18:58 r730-03 dvl ~] %
We impose similar restrictions on the other file systems:
[18:58 r730-03 dvl ~] % sudo zfs set quota=1TB data01/bacula-volumes/DiffFile
[19:17 r730-03 dvl ~] % sudo zfs set quota=2.5TB data01/bacula-volumes/IncrFile
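A quick way to confirm the quotas took effect (and to show reservations, should I decide to add those later):

# show quota and reservation for the parent dataset and all three children
zfs get -r quota,reservation data01/bacula-volumes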
You will notice that I’ve imposed 9.5 TB of quota on a pool with only about 9 TB free. I’m in for trouble.
[19:23 r730-03 dvl ~] % zpool list data01
NAME     SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
data01  10.9T  1.94T  8.96T        -         -     0%    17%  1.00x    ONLINE  -
I just ordered another pair of 12 TB drives.