Bacula – ran out of space, moved some volumes to another zpool

I’m using Bacula 9.0.3 for this post, on FreeBSD 10.3 and 11.1.

I did not document this as I went along; however, the details should be enough to get you started.

NOTE: when I refer to the bacula-sd configuration, I mean the configuration of the bacula-sd in question, which is not necessarily on the same server as bacula-dir. This will hopefully make you think carefully about which file you are modifying.

NOTE: I am moving a whole Bacula Pool. You could do this with all Pools or only some Volumes from a Pool (but that would involve much more magic than I am prepared to document).

This blog post contains unsupported modifications to the Bacula database. In short, you can really bugger stuff up following this. Proceed carefully.

I was running out of space on one zpool. This server had other zpools, though:

$ zpool list
NAME        SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
system     45.2T  18.6T  26.7T         -     8%    41%  1.00x  ONLINE  -
tank_data  45.2T  35.6T  9.63T         -    18%    78%  1.00x  ONLINE  -
tank_fast   408G   348K   408G         -     0%     0%  1.00x  ONLINE  -
$ 

In the above, I have already created a new zfs filesystem under system and copied some of the Bacula Volumes from tank_data.

For what it’s worth, I created the new filesystem with these commands:

sudo zfs create system/data/bacula-volumes
sudo zfs set recordsize=1M   system/data/bacula-volumes
sudo zfs set compression=lz4 system/data/bacula-volumes
sudo zfs set exec=off        system/data/bacula-volumes
sudo zfs set setuid=off      system/data/bacula-volumes

I saved more detail in a gist on GitHub.
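
I did not record the exact copy commands, so what follows is only a sketch of one way to copy a Pool's Volumes. It assumes the Volumes can be picked out by their label prefix (FullAutoNoNextPool- for the Pool in this post) and that bacula-sd is not writing to them during the copy. The function name copy_pool_volumes is made up for illustration.

```shell
#!/bin/sh
# Copy every Volume whose file name starts with a label prefix from one
# directory to another, then verify each copy byte-for-byte.
# copy_pool_volumes is a hypothetical helper, not part of Bacula.
copy_pool_volumes() {
    src=$1 dst=$2 prefix=$3
    mkdir -p "$dst" || return 1
    for vol in "$src"/"$prefix"*; do
        [ -f "$vol" ] || continue          # glob matched nothing
        cp -p "$vol" "$dst"/ || return 1   # -p keeps mode and times (owner too, as root)
        cmp -s "$vol" "$dst/$(basename "$vol")" || {
            echo "verify failed: $vol" >&2
            return 1
        }
    done
}
```

On the jail host this could be called as copy_pool_volumes /usr/jails/bacula-sd-01/usr/local/bacula/volumes /usr/jails/bacula-sd-01/usr/local/bacula-2/volumes FullAutoNoNextPool-. Leave the originals in place until a test restore from the new location has worked.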

WHY?

This section was added on 2017.09.09.

This was asked on Google+ and on the Bacula Users Mailing list. I will summarize here. The order of these points is not significant.

  1. If the Pool is no longer required, it is easy to delete. Dividing data up this way is always a good idea, because of the flexibility for future manipulation it provides.
  2. If these ZFS datasets were each on separate devices, you’d get better concurrent throughput.
  3. Full backups are usually bigger, incremental backups are usually smaller, so you could adjust recordsize accordingly.
  4. If you assigned each client to a different pool, deleting their backups when the client leaves is now a simple matter of deleting the appropriate ZFS datasets.
  5. If you want to move a pool to a different bacula-sd, you move that dataset.
  6. snapshot retention – If you want a different snapshot retention policy for different types of backups, having each pool on a different ZFS dataset allows this.
  7. zfs replication – if you want to replicate some, but not all of your backups, having only that data on a separate ZFS dataset allows for this.

I hope that gives you some ideas for problems that have been mulling around in your head.

Mounting the new filesystem near the old one

The bacula-sd in question runs in a jail called bacula-sd-01. For historical and hysterical reasons, the existing filesystem looks like this:

[dan@knew:~] $ zfs list tank_data/crey/usr/local/bacula
NAME                              USED  AVAIL  REFER  MOUNTPOINT
tank_data/crey/usr/local/bacula  22.4T  5.47T  22.4T  /usr/jails/bacula-sd-01/usr/local/bacula

The newly created filesystem is mounted in a very similar location:

[dan@knew:~] $ zfs list system/data/bacula-volumes
NAME                         USED  AVAIL  REFER  MOUNTPOINT
system/data/bacula-volumes   681G  19.2T   681G  /usr/jails/bacula-sd-01/usr/local/bacula-2

bacula-sd.conf

On the bacula-sd server in question, I decided to split up my configuration file into three parts. bacula-sd.conf looks like this:

Storage {                             # definition of myself
  Name = bacula-sd-01-sd
  WorkingDirectory = "/usr/local/bacula/working"
  Pid Directory = "/var/run"
  Maximum Concurrent Jobs = 20

  TLS Require     = no
  TLS Enable      = yes
  TLS Verify Peer = no

  TLS CA Certificate File = /usr/local/etc/ssl/MyCA.crt

  TLS Certificate = /usr/local/etc/ssl/bacula-sd-01.int.example.org.crt
  TLS Key         = /usr/local/etc/ssl/bacula-sd-01.int.example.org.nopassword.key

}

#
# List Directors who are permitted to contact Storage daemon
#
Director {
  Name = bacula-dir
  Password = "[REDACTED]"
}


Messages {
  Name = Standard
  director = bacula-dir = all
}

@/usr/local/etc/bacula/bacula-sd-01.conf
@/usr/local/etc/bacula/bacula-sd-02.conf

bacula-sd-01.conf contains a copy of the original configuration (less the Messages, Director, and Storage clauses already present in bacula-sd.conf).

I copied bacula-sd-01.conf to bacula-sd-02.conf and started amending things. Here is what it looks like; the lines I changed are described below, by their line number within this listing.

#
# Devices supported by this Storage daemon
# To connect, the Director's bacula-dir.conf must have the
#  same Name and MediaType. 
#

Device {
  Name           = Restore-Drive-2
  Media Type     = File2
  Archive Device = /usr/local/bacula-2/volumes
  LabelMedia     = yes
  Random Access  = yes
  AutomaticMount = yes
  RemovableMedia = no
  AlwaysOpen     = no

  Maximum Concurrent Jobs = 2
  Volume Poll Interval    = 15
}

Autochanger {
  Name = VirtualDisk-2

  Changer Device  = /dev/null
  Changer Command = /dev/null

  Device          = vDrive-10, vDrive-11, vDrive-12, vDrive-13
}

Device {
  Name           = vDrive-10
  Media Type     = File2
  Archive Device = /usr/local/bacula-2/volumes
  LabelMedia     = yes
  Random Access  = yes
  AutomaticMount = yes
  RemovableMedia = no
  AlwaysOpen     = no

  Autochanger    = yes
  Drive Index    = 0

  Maximum Concurrent Jobs = 1
  Volume Poll Interval    = 15
}

Device {
  Name           = vDrive-11
  Media Type     = File2
  Archive Device = /usr/local/bacula-2/volumes
  LabelMedia     = yes
  Random Access  = yes
  AutomaticMount = yes
  RemovableMedia = no
  AlwaysOpen     = no

  Autochanger    = yes
  Drive Index    = 1

  Maximum Concurrent Jobs = 1
  Volume Poll Interval    = 15
}

Device {
  Name           = vDrive-12
  Media Type     = File2
  Archive Device = /usr/local/bacula-2/volumes
  LabelMedia     = yes
  Random Access  = yes
  AutomaticMount = yes
  RemovableMedia = no
  AlwaysOpen     = no

  Autochanger    = yes
  Drive Index    = 2

  Maximum Concurrent Jobs = 1
  Volume Poll Interval    = 15
}

Device {
  Name           = vDrive-13
  Media Type     = File2
  Archive Device = /usr/local/bacula-2/volumes
  LabelMedia     = yes
  Random Access  = yes
  AutomaticMount = yes
  RemovableMedia = no
  AlwaysOpen     = no

  Autochanger    = yes
  Drive Index    = 3

  Maximum Concurrent Jobs = 1
  Volume Poll Interval    = 15
}

The changes in brief, by line number within the listing above, are:

  • 8 (Name) – I set aside one virtual drive for restores. I’ve not actually used or tested it, but it’s there. The original was Restore-Drive, so I called this one Restore-Drive-2. This theme continues throughout this file.
  • 10 (Archive Device) – /usr/local/bacula-2/volumes is the location, within the jail, to which the Volumes have been moved. This value is also used in the other devices. See below.
  • 22 (Name) – The new AutoChanger for my soon-to-be-declared virtual drives is VirtualDisk-2.
  • 27 (Device) – The device names for the virtual drives associated with this AutoChanger.
  • 31 (Name) – The declaration of a virtual drive.
  • 32 (Media Type) – The MediaType. This declaration is critical to success. If you are moving all Volumes of this MediaType, please move along. If you are moving only some Volumes of this MediaType, this line must contain a new MediaType not previously used. I went for this unimaginative solution. Whatever value you choose, it absolutely must match the MediaType specified in the bacula-dir configuration (see below).
  • 33 (Archive Device) – The new directory, previously mentioned.
  • 48-50, 65-67, 82-84 – see previous items; the remaining virtual drives are all similar.
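
The four vDrive Device resources differ only in Name and Drive Index, so they could be generated rather than copied and pasted. This is a sketch under that assumption, not how I produced the file; emit_vdrives is a made-up helper name.

```shell
#!/bin/sh
# Print one Device resource per virtual drive. All drives share the same
# Media Type and Archive Device; only Name and Drive Index differ.
# emit_vdrives is a hypothetical helper, not a Bacula tool.
emit_vdrives() {
    media_type=$1 archive_dir=$2 first=$3 count=$4
    index=0
    while [ "$index" -lt "$count" ]; do
        cat <<EOF
Device {
  Name           = vDrive-$((first + index))
  Media Type     = $media_type
  Archive Device = $archive_dir
  LabelMedia     = yes
  Random Access  = yes
  AutomaticMount = yes
  RemovableMedia = no
  AlwaysOpen     = no

  Autochanger    = yes
  Drive Index    = $index

  Maximum Concurrent Jobs = 1
  Volume Poll Interval    = 15
}

EOF
        index=$((index + 1))
    done
}
```

Calling emit_vdrives File2 /usr/local/bacula-2/volumes 10 4 prints resources for vDrive-10 through vDrive-13 with Drive Index 0 through 3, matching the listing above. The Autochanger resource still has to list the same device names by hand.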

bacula-dir.conf

This is the declaration of the bacula-sd in question. This is what I had before I started this exercise:

# Definition of file storage device
Storage {
  Name       = bacula-sd-01-file
  Address    = bacula-sd-01.int.example.org                # N.B. Use a fully qualified name here
  SDPort     = 9103
  Password   = "[redacted]"

  AutoChanger = Yes

  Device     = VirtualDisk
  Media Type = File

  Maximum Concurrent Jobs = 27
}


# for restoring
Storage {
  Name       = bacula-sd-01-file-restore
  Address    = bacula-sd-01.int.example.org                # N.B. Use a fully qualified name here
  SDPort     = 9103
  Password   = "[same redacted value]"

  Device     = Restore-Drive
  Media Type = File

  Maximum Concurrent Jobs = 2
}

Note the Device value on line 10 of this listing (VirtualDisk) and compare it to line 11 of the new configuration below (VirtualDisk-1).

This is the new solution:

# existing SD, renamed
# Definition of file storage device
Storage {
  Name       = bacula-sd-01-file
  Address    = bacula-sd-01.int.example.org                # N.B. Use a fully qualified name here
  SDPort     = 9103
  Password   = "[redacted]"

  AutoChanger = Yes

  Device     = VirtualDisk-1
  Media Type = File

  Maximum Concurrent Jobs = 27
}


# for restoring
Storage {
  Name       = bacula-sd-01-file-restore
  Address    = bacula-sd-01.int.example.org                # N.B. Use a fully qualified name here
  SDPort     = 9103
  Password   = "[same redacted value]"

  Device     = Restore-Drive-1
  Media Type = File

  Maximum Concurrent Jobs = 2
}


# new SD, similarly named
# Definition of file storage device
Storage {
  Name       = bacula-sd-02-file
  Address    = bacula-sd-01.int.example.org                # N.B. Use a fully qualified name here
  SDPort     = 9103
  Password   = "[same redacted value]"

  AutoChanger = Yes

  Device     = VirtualDisk-2
  Media Type = File2

  Maximum Concurrent Jobs = 27
}


# for restoring
Storage {
  Name       = bacula-sd-02-file-restore
  Address    = bacula-sd-01.int.example.org                # N.B. Use a fully qualified name here
  SDPort     = 9103
  Password   = "[same redacted value]"

  Device     = Restore-Drive-2
  Media Type = File2

  Maximum Concurrent Jobs = 2
}

Of note:

  1. The Address is the same in both the old and the new Storage declarations. This is because we are using one running bacula-sd at that location.
  2. The same Password is used for both. That is because there is only one bacula-sd.
  3. Yes, by changing the Name, you can have multiple Storage declarations each serviced by the same single instance of bacula-sd.
  4. Line 11 – I have renamed the existing Device from VirtualDisk to VirtualDisk-1. This wasn’t really necessary, but I wanted it to be clear.
  5. Lines 42-43 – A different Device name and a different MediaType. The MediaType absolutely must match the MediaType specified in the bacula-sd configuration.
  6. Line 57 (Media Type in bacula-sd-02-file-restore) – same MediaType as above, and as specified in the bacula-sd configuration.
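
Because a MediaType mismatch between the two configurations only shows up when a job fails, a quick comparison before restarting anything may help. The sketch below is an assumption-laden helper, not a Bacula feature: it only understands lines written as "Media Type = X" or "MediaType = X", does not follow @ includes, and does not strip trailing comments. The function names are invented.

```shell
#!/bin/sh
# List the distinct Media Type values declared in one Bacula config file.
# Hypothetical helper: no @include handling, no comment stripping.
media_types() {
    sed -n 's/^[[:space:]]*Media *Type[[:space:]]*=[[:space:]]*//p' "$1" |
        sed 's/[[:space:]]*$//' | sort -u
}

# Print every Media Type used in the Director config ($1) that is not
# declared by any Device in the Storage daemon config ($2).
# Empty output means every MediaType has a match.
missing_media_types() {
    media_types "$1" | while IFS= read -r mt; do
        media_types "$2" | grep -qxF -- "$mt" || echo "$mt"
    done
}
```

Since bacula-sd.conf pulls in the device files with @ lines, the second argument would need to be a file that actually contains the Device resources, for example a concatenation of bacula-sd-01.conf and bacula-sd-02.conf.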

Pool changes

This is before:

Pool {
  Name             = FullFileNoNextPool
  Pool Type        = Backup
  Recycle          = yes
  AutoPrune        = yes
  Volume Retention = 3 years
  Storage          = bacula-sd-01-file

  Maximum Volume Bytes = 5G
  Maximum Volume Jobs  = 1
  Maximum Volumes      = 250

  LabelFormat = "FullAutoNoNextPool-"
}

This is after:

Pool {
  Name             = FullFileNoNextPool
  Pool Type        = Backup
  Recycle          = yes
  AutoPrune        = yes
  Volume Retention = 3 years
  Storage          = bacula-sd-02-file

  Maximum Volume Bytes = 5G
  Maximum Volume Jobs  = 1
  Maximum Volumes      = 250

  LabelFormat = "FullAutoNoNextPool-"
}

The key change: the Storage device upon which the Pool is located.

Invoking your changes

After changing the bacula-sd configuration, restart bacula-sd. I tested the configuration first, by doing this:

/usr/local/sbin/bacula-sd -t /usr/local/etc/bacula/bacula-sd.conf

Errors popped up; I fixed them.

After changing bacula-dir.conf, issue the reload command within bconsole. Fix errors. Repeat.
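
The test-then-restart step for bacula-sd can be wrapped so that a broken configuration never reaches a restart. This is a sketch, not what I ran: the default paths are the ones used in this post, and the restart command assumes the stock FreeBSD rc script, run inside the jail. The variable and function names are my own invention.

```shell
#!/bin/sh
# Test the Storage daemon configuration and restart only when it parses.
# SD_BIN and SD_CONF default to the paths used in this post.
# RESTART_CMD is an assumption: the stock FreeBSD rc script.
SD_BIN=${SD_BIN:-/usr/local/sbin/bacula-sd}
SD_CONF=${SD_CONF:-/usr/local/etc/bacula/bacula-sd.conf}
RESTART_CMD=${RESTART_CMD:-service bacula-sd restart}

restart_sd_if_valid() {
    if "$SD_BIN" -t "$SD_CONF"; then
        $RESTART_CMD   # unquoted on purpose: command plus its arguments
    else
        echo "bacula-sd config test failed; not restarting" >&2
        return 1
    fi
}
```

Run restart_sd_if_valid after each edit to the bacula-sd files. The Director side is different: it stays running and picks up bacula-dir.conf changes through the reload command in bconsole.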

Update Pools

I made changes to my pools, so I wanted to update the Catalog with that information. The pool in question is FullFileNoNextPool.

*update pool
Using Catalog "MyCatalog"
The defined Pool resources are:
     1: FullFile
     2: DiffFile
     3: IncrFile
     4: IncrFileNoNextPool
     5: FullFileNoNextPool
     6: Fulls
     7: FullsLTO4
     8: Differentials
     9: Incrementals
    10: Scratch
    11: TwoHourlyBackups
    12: DailyBackups
    13: WeeklyBackups
    14: MonthlyBackups
    15: KeepThreeMonths
Select Pool resource (1-15): 5
+--------+--------------------+---------+---------+---------+------------+-----------------+--------------+----------------+------------+-------------+---------------+-----------+---------+----------+---------------------+---------+---------------+---------------+-----------+------------+--------------------+-------------------+---------------+---------------+
| poolid | name               | numvols | maxvols | useonce | usecatalog | acceptanyvolume | volretention | voluseduration | maxvoljobs | maxvolfiles | maxvolbytes   | autoprune | recycle | pooltype | labelformat         | enabled | scratchpoolid | recyclepoolid | labeltype | nextpoolid | migrationhighbytes | migrationlowbytes | migrationtime | actiononpurge |
+--------+--------------------+---------+---------+---------+------------+-----------------+--------------+----------------+------------+-------------+---------------+-----------+---------+----------+---------------------+---------+---------------+---------------+-----------+------------+--------------------+-------------------+---------------+---------------+
|     27 | FullFileNoNextPool |     248 |     250 |       0 |          1 |               0 |   94,608,000 |              0 |          1 |           0 | 5,368,709,120 |         1 |       1 | Backup   | FullAutoNoNextPool- |       1 |             0 |             0 |         0 |            |                    |                   |               |             0 |
+--------+--------------------+---------+---------+---------+------------+-----------------+--------------+----------------+------------+-------------+---------------+-----------+---------+----------+---------------------+---------+---------------+---------------+-----------+------------+--------------------+-------------------+---------------+---------------+
Pool DB record updated from resource.

The database changes

I’m pretty handy with SQL. I had a backup. Keep that in mind as you proceed.

The goal of the database changes is to get Bacula to know that the Volumes in question are now in a different location. Keep in mind two things:

  1. New MediaType
  2. New Storage

Let me find out what storageid the new Storage has:

bacula=# select * from storage;
 storageid |             name              | autochanger 
-----------+-------------------------------+-------------
         7 | TapeLibrary                   |           0
         5 | FileRemoteTLS                 |           0
         6 | DLTRemoteTLS                  |           0
        10 | MegaFile-catalog              |           0
        11 | MegaFile-bast                 |           0
        12 | MegaFile-dbclone              |           0
        13 | MegaFile-kraken               |           0
        14 | MegaFile-laptop-freebsd       |           0
        15 | MegaFile-laptop-vista         |           0
        16 | MegaFile-latens               |           0
        17 | MegaFile-ngaio                |           0
        18 | MegaFile-nyi                  |           0
        20 | MegaFile-polo                 |           0
        21 | MegaFile-supernews            |           0
        23 | MegaFile-wocker               |           0
        19 | MegaFile-nz                   |           0
        22 | MegaFile-w2k                  |           0
         1 | File                          |           1
         3 | DLT                           |           0
        34 | tape01                        |           1
        24 | OverlandTapeLibrary           |           1
        29 | CreyFileExt                   |           0
        27 | CompaqStorageWorks            |           1
         2 | FileRemote                    |           0
         4 | DLTRemote                     |           0
        33 | tape01-sd                     |           1
         9 | MegaFile                      |           0
         8 | DigitalTapeLibrary            |           1
        32 | tape-TLO4-01                  |           1
        30 | Restore-Drive                 |           0
        26 | CreyFileExternalClients       |           0
        25 | CreyFile                      |           1
        31 | CreyFileRestore               |           0
        35 | bacula-sd-01-file             |           1
        36 | bacula-sd-01-file-restore     |           0
        39 | bacula-sd-02-file             |           1
        40 | bacula-sd-02-file-restore     |           0
        28 | CompaqStorageWorksTapeLibrary |           1
        38 | r610                          |           1
        37 | tape02                        |           1
(40 rows)

There. The storageid is 39.

What MediaType are we using now? Oh, that’s just a text field.

Here is the magic SQL. You are using transactions? You are using PostgreSQL?

bacula=# UPDATE media
   SET mediatype = 'File2',
       storageid = 39
 WHERE poolid = (SELECT poolid FROM pool WHERE name = 'FullFileNoNextPool');
UPDATE 248

Done. Great.

Confession: I did this UPDATE in two parts, but have combined them above for your SQL-ing pleasure. I’m sure nothing will go wrong.
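
For anyone who wants the safety net of a transaction, here is a sketch that prints the same UPDATE wrapped in BEGIN and ROLLBACK, looking up the storageid by Storage name instead of hard-coding 39. It is not what I ran. The helper name move_pool_sql is invented, the arguments are pasted into the SQL without escaping, and the database name in the usage line is an assumption based on the psql prompt above.

```shell
#!/bin/sh
# Print a transaction that repoints every Volume of one Pool at a new
# MediaType and Storage. It ends in ROLLBACK, so nothing changes until
# you have checked the reported "UPDATE n" count and swapped in COMMIT.
# move_pool_sql is a hypothetical helper; arguments are not SQL-escaped.
move_pool_sql() {
    pool=$1 mediatype=$2 storage=$3
    cat <<EOF
BEGIN;
UPDATE media
   SET mediatype = '$mediatype',
       storageid = (SELECT storageid FROM storage WHERE name = '$storage')
 WHERE poolid = (SELECT poolid FROM pool WHERE name = '$pool');
ROLLBACK;
EOF
}
```

A dry run would be move_pool_sql FullFileNoNextPool File2 bacula-sd-02-file | psql bacula. If the count matches the number of Volumes in the Pool (248 here), change ROLLBACK to COMMIT and run it again.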

Problems I hit

07-Sep 00:36 bacula-dir JobId 264514: Start Restore Job RestoreFiles.2017-09-07_00.36.16_10
07-Sep 00:36 bacula-dir JobId 264514: Using Device "vDrive-0" to read.
07-Sep 00:36 bacula-sd-01-sd JobId 264514: acquire.c:115 Changing read device. Want Media Type="File2" have="File"
 file device="vDrive-0" (/usr/local/bacula/volumes)
07-Sep 00:36 bacula-sd-01-sd JobId 264514: Fatal error: acquire.c:178 No suitable device found to read Volume "FullAutoNoNextPool-3387"
07-Sep 00:36 knew-fd JobId 264514: Fatal error: job.c:2484 Bad response from SD to Read Data command. Wanted 3000 OK data
, got len=11 msg="3000 error "

This was fixed with the mediatype = ‘File2’ clause in the SQL of the previous section.

07-Sep 00:16 bacula-dir JobId 264512: Start Backup JobId 264512, Job=dent.2017-09-06_23.30.00_05
07-Sep 00:18 bacula-sd-01-sd JobId 264512: Fatal error: Device reservation failed for JobId=264512: 
07-Sep 00:16 bacula-dir JobId 264512: Warning: bsock.c:107 Could not connect to Storage daemon on bacula-sd-01.int.unixathome.org:9103. ERR=Connection refused
Retrying ...
07-Sep 00:18 bacula-dir JobId 264512: Fatal error: 
     Storage daemon didn't accept Device "VirtualDisk" because:
     3924 Device "VirtualDisk" not in SD Device resources or no matching Media Type.

The Retrying occurred because bacula-sd was not running. After I started it, it encountered the problem shown. This situation arose because that job was queued before I made the configuration changes. The job in question retries if it cannot connect to the client.

Later on, as I was moving yet another pool into its own zfs dataset, I saw these warnings:

07-Sep 15:21 bacula-dir JobId 264551: Start Restore Job RestoreFiles.2017-09-07_15.21.48_58
07-Sep 15:21 bacula-dir JobId 264551: Using Device "vDrive-WeeklyBackups-0" to read.
07-Sep 15:21 bacula-dir JobId 264551: Warning: Could not get storage resource 'bacula-sd-01-Weekly'.
07-Sep 15:21 bacula-dir JobId 264551: Warning: Could not get storage resource 'bacula-sd-01-Weekly'.
07-Sep 15:21 bacula-dir JobId 264551: Warning: Could not get storage resource 'bacula-sd-01-Weekly'.
07-Sep 15:21 bacula-dir JobId 264551: Warning: Could not get storage resource 'bacula-sd-01-Weekly'.
07-Sep 15:21 bacula-dir JobId 264551: Warning: Could not get storage resource 'bacula-sd-01-Weekly'.
07-Sep 15:21 bacula-dir JobId 264551: Warning: Could not get storage resource 'bacula-sd-01-Weekly'.
07-Sep 15:21 bacula-dir JobId 264551: Warning: Could not get storage resource 'bacula-sd-01-Weekly'.
07-Sep 15:21 bacula-dir JobId 264551: Warning: Could not get storage resource 'bacula-sd-01-Weekly'.
07-Sep 15:21 bacula-dir JobId 264551: Warning: Could not get storage resource 'bacula-sd-01-Weekly'.
07-Sep 15:21 bacula-dir JobId 264551: Warning: Could not get storage resource 'bacula-sd-01-Weekly'.
07-Sep 15:21 bacula-dir JobId 264551: Warning: Could not get storage resource 'bacula-sd-01-Weekly'.
07-Sep 15:21 bacula-sd-01-sd JobId 264551: Ready to read from volume "Weekly-4923" on file device "vDrive-WeeklyBackups-0" (/usr/local/bacula-2/volumes/WeeklyBackups).
07-Sep 15:21 bacula-sd-01-sd JobId 264551: Forward spacing Volume "Weekly-4923" to file:block 0:198.
07-Sep 15:23 bacula-sd-01-sd JobId 264551: End of Volume at file 2 on device "vDrive-WeeklyBackups-0" (/usr/local/bacula-2/volumes/WeeklyBackups), Volume "Weekly-4923"
07-Sep 15:23 bacula-sd-01-sd JobId 264551: End of all volumes.
07-Sep 15:23 bacula-sd-01-sd JobId 264551: Elapsed time=00:01:11, Transfer rate=148.9 M Bytes/second
07-Sep 15:23 bacula-dir JobId 264551: Bacula bacula-dir 7.4.7 (16Mar17):
 Build OS:               amd64-portbld-freebsd11.0 freebsd 11.0-RELEASE-p8
 JobId:                  264551
 Job:                    RestoreFiles.2017-09-07_15.21.48_58
 Restore Client:         knew-fd
 Start time:             07-Sep-2017 15:21:50
 End time:               07-Sep-2017 15:23:01
 Files Expected:         517,968
 Files Restored:         517,968
 Bytes Restored:         10,429,947,191
 Rate:                   146900.7 KB/s
 FD Errors:              0
 FD termination status:  OK
 SD termination status:  OK
 Termination:            Restore OK

07-Sep 15:23 bacula-dir JobId 264551: Begin pruning Jobs older than 3 years .
07-Sep 15:23 bacula-dir JobId 264551: No Jobs found to prune.
07-Sep 15:23 bacula-dir JobId 264551: Begin pruning Files.
07-Sep 15:23 bacula-dir JobId 264551: No Files found to prune.
07-Sep 15:23 bacula-dir JobId 264551: End auto prune.

Those warnings confused me. Then I realized I had renamed the Storage device specified in the Pool definition, but the records in the media table still referred to the old storageid. Rerunning my SQL with the correct values solved that issue.
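
A query along these lines might have shown me the stale storageid sooner. It is a sketch that uses only the table and column names visible in this post (media, pool, storage); media_audit_sql is an invented helper name, and I have not checked the query against other catalog versions.

```shell
#!/bin/sh
# Print a query that shows, per Pool, which MediaType and Storage record
# its Volumes point at. A Pool spread over several rows, or a row naming
# an old Storage, hints that the media table still needs updating.
# media_audit_sql is a hypothetical helper, not part of Bacula.
media_audit_sql() {
    cat <<'EOF'
SELECT p.name AS pool, m.mediatype, m.storageid, s.name AS storage,
       count(*) AS volumes
  FROM media m
  JOIN pool p ON p.poolid = m.poolid
  LEFT JOIN storage s ON s.storageid = m.storageid
 GROUP BY p.name, m.mediatype, m.storageid, s.name
 ORDER BY p.name, m.mediatype, m.storageid;
EOF
}
```

The output can be piped to psql in the same way as any other SQL. The LEFT JOIN keeps Volumes whose storageid no longer matches any row in the storage table, which show up with an empty storage column.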

Wow, I’m impressed

I was impressed that I was able to get this running in such a short time. It took less than two hours. I hope you can beat that time.
