Backups are pretty amazing. The things you can do with them can be both highly entertaining and surprisingly easy.
Today, we will venture into the easy part.
Yesterday was the first Sunday of the month, which, according to my Bacula installation’s schedule, is the day for full backups. Now that those are done, today is copy-to-tape day. I do this step manually because I don’t want the tape library, and the dedicated server that writes to it, running 24×7. Why not run it all day, every day? Cost and heat.
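Kicking the copy off by hand is just a matter of starting the job from bconsole; something like this, where CopyToTape-Full-LTO4 is the job defined below and "yes" merely skips the confirmation prompt:

    *run job=CopyToTape-Full-LTO4 yes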
It is a good idea to have backups. It is even better if your backups are on two different media: that way, you should still be able to restore after a failure of one. In my case, I back up to disk, then copy to tape, and move the tape to another location.
The job for doing the copy looks like this:
Job {
  Name = "CopyToTape-Full-LTO4"
  Type = Copy
  Level = Full
  Pool = FullFile
  NextPool = FullsLTO4
  FileSet = "EmptyCopyToTape"
  Client = crey-fd
  Schedule = "Never"
  Storage = r610
  Messages = Standard

  # Lets us use normal priority, just so other backups can occur while we copy
  # to tape (e.g. mailjail)
  # Priority = 430

  # RunAfterJob = "/usr/local/bacula/dlt-stats-r610"
  # run this on the host system, not the jail. trouble with xpt0 access from within jail
  # the tape drive is now on a remote server

  # Spool Data = yes
  Spool Attributes = yes
  Maximum Concurrent Jobs = 40

  Selection Type = SQL Query
  Selection Pattern = "
    SELECT DISTINCT J.JobId, J.StartTime
      FROM Job J, Pool P
     WHERE P.Name IN ('FullFile', 'MonthlyBackups')
       AND P.PoolId = J.PoolId
       AND J.Type = 'B'
       AND J.JobStatus IN ('T','W')
       AND J.jobBytes > 0
       AND J.JobId NOT IN
           (SELECT PriorJobId
              FROM Job
             WHERE Type IN ('B','C')
               AND Job.JobStatus IN ('T','W')
               AND PriorJobId != 0
               AND Job.PoolId IN (SELECT pool.poolid
                                    FROM pool
                                   WHERE name = 'FullsLTO4'))
       AND J.endtime > current_timestamp - interval '110 day'
     ORDER BY J.StartTime
  "
}
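The copies land in the FullsLTO4 pool named by NextPool above. My actual pool definition is not shown in this post, but as a rough sketch, a destination pool resource looks something like this (the Name and Storage come from the job above; the retention and recycling values here are made-up placeholders, not my real settings):

    Pool {
      Name = FullsLTO4
      Pool Type = Backup
      Storage = r610        # the SD with the tape library, named in the job above
      # placeholder values; my real settings are not shown in this post
      Volume Retention = 1 year
      AutoPrune = yes
      Recycle = yes
    }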
Let’s look at the output of that SQL query to see what we have. Let’s also add in JobBytes to see how large these backups are.
The query
I use SQL to determine which jobs to copy; it gives me more flexibility. Here is what would be copied, and in what order, should I run that job right now:
SELECT DISTINCT J.JobId, J.StartTime, J.jobbytes, pg_size_pretty(J.JobBytes)
  FROM Job J, Pool P
 WHERE P.Name IN ('FullFile', 'MonthlyBackups')
   AND P.PoolId = J.PoolId
   AND J.Type = 'B'
   AND J.JobStatus IN ('T','W')
   AND J.jobBytes > 0
   AND J.JobId NOT IN
       (SELECT PriorJobId
          FROM Job
         WHERE Type IN ('B','C')
           AND Job.JobStatus IN ('T','W')
           AND PriorJobId != 0
           AND Job.PoolId IN (SELECT pool.poolid
                                FROM pool
                               WHERE name = 'FullsLTO4'))
   AND J.endtime > current_timestamp - interval '110 day'
 ORDER BY J.StartTime;

 jobid  |      starttime      |   jobbytes   | pg_size_pretty
--------+---------------------+--------------+----------------
 264318 | 2017-09-03 03:04:05 |  12931379115 | 12 GB
 264317 | 2017-09-03 03:04:06 | 303397514752 | 283 GB
 264316 | 2017-09-03 03:04:07 | 612572982936 | 571 GB
 264328 | 2017-09-03 03:05:01 |     37254946 | 36 MB
 264321 | 2017-09-03 03:05:02 |    509798405 | 486 MB
 264327 | 2017-09-03 03:05:02 |       215605 | 211 kB
 264329 | 2017-09-03 03:05:02 |    375857649 | 358 MB
 264333 | 2017-09-03 03:05:02 |     62354023 | 59 MB
 264319 | 2017-09-03 03:05:03 |    537822887 | 513 MB
 264331 | 2017-09-03 03:05:03 |     22460534 | 21 MB
 264332 | 2017-09-03 03:06:23 |   1723428153 | 1644 MB
 264334 | 2017-09-03 03:06:23 |   9671149409 | 9223 MB
 264330 | 2017-09-03 03:10:43 |   2183339517 | 2082 MB
 264322 | 2017-09-03 03:14:49 |     10758898 | 10 MB
 264323 | 2017-09-03 03:16:09 |   6304145819 | 6012 MB
 264320 | 2017-09-03 03:17:16 | 104445006754 | 97 GB
 264326 | 2017-09-03 05:00:59 |   5497202777 | 5243 MB
 264335 | 2017-09-03 15:14:50 |   1421915823 | 1356 MB
 264336 | 2017-09-03 15:44:58 |  12155344299 | 11 GB
 264324 | 2017-09-03 21:41:27 |    787287519 | 751 MB
 264325 | 2017-09-03 21:43:32 |   6297826581 | 6006 MB
 264348 | 2017-09-03 23:31:00 | 183112981978 | 171 GB
 264338 | 2017-09-04 01:13:35 |  10501648063 | 10015 MB
 264337 | 2017-09-04 06:14:48 |  63896895900 | 60 GB
(24 rows)
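As an aside, the same candidate list can be wrapped in a sum to see how much data one run would push to tape in total. A quick sketch against the catalog, reusing the WHERE clause from above (output not shown):

    -- total size of all jobs the copy job would select
    SELECT count(*)                              AS jobs,
           pg_size_pretty(sum(JobBytes)::bigint) AS total
      FROM (SELECT DISTINCT J.JobId, J.JobBytes
              FROM Job J, Pool P
             WHERE P.Name IN ('FullFile', 'MonthlyBackups')
               AND P.PoolId = J.PoolId
               AND J.Type = 'B'
               AND J.JobStatus IN ('T','W')
               AND J.jobBytes > 0
               AND J.JobId NOT IN
                   (SELECT PriorJobId
                      FROM Job
                     WHERE Type IN ('B','C')
                       AND Job.JobStatus IN ('T','W')
                       AND PriorJobId != 0
                       AND Job.PoolId IN (SELECT pool.poolid
                                            FROM pool
                                           WHERE name = 'FullsLTO4'))
               AND J.endtime > current_timestamp - interval '110 day') AS candidates;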
Typically, one last straggler job is left running for some time after all the other jobs have finished. I like to keep as many concurrent jobs running as I can, in an attempt to saturate the tape drive and avoid the stop/start pulses that occur when it runs out of data.
Let’s try another order instead.
Sort by size
SELECT DISTINCT J.JobId, J.StartTime, J.jobbytes, pg_size_pretty(J.JobBytes)
  FROM Job J, Pool P
 WHERE P.Name IN ('FullFile', 'MonthlyBackups')
   AND P.PoolId = J.PoolId
   AND J.Type = 'B'
   AND J.JobStatus IN ('T','W')
   AND J.jobBytes > 0
   AND J.JobId NOT IN
       (SELECT PriorJobId
          FROM Job
         WHERE Type IN ('B','C')
           AND Job.JobStatus IN ('T','W')
           AND PriorJobId != 0
           AND Job.PoolId IN (SELECT pool.poolid
                                FROM pool
                               WHERE name = 'FullsLTO4'))
   AND J.endtime > current_timestamp - interval '110 day'
 ORDER BY J.JobBytes DESC;

 jobid  |      starttime      |   jobbytes   | pg_size_pretty
--------+---------------------+--------------+----------------
 264316 | 2017-09-03 03:04:07 | 612572982936 | 571 GB
 264317 | 2017-09-03 03:04:06 | 303397514752 | 283 GB
 264348 | 2017-09-03 23:31:00 | 183112981978 | 171 GB
 264320 | 2017-09-03 03:17:16 | 104445006754 | 97 GB
 264337 | 2017-09-04 06:14:48 |  63896895900 | 60 GB
 264318 | 2017-09-03 03:04:05 |  12931379115 | 12 GB
 264336 | 2017-09-03 15:44:58 |  12155344299 | 11 GB
 264338 | 2017-09-04 01:13:35 |  10501648063 | 10015 MB
 264334 | 2017-09-03 03:06:23 |   9671149409 | 9223 MB
 264323 | 2017-09-03 03:16:09 |   6304145819 | 6012 MB
 264325 | 2017-09-03 21:43:32 |   6297826581 | 6006 MB
 264326 | 2017-09-03 05:00:59 |   5497202777 | 5243 MB
 264330 | 2017-09-03 03:10:43 |   2183339517 | 2082 MB
 264332 | 2017-09-03 03:06:23 |   1723428153 | 1644 MB
 264335 | 2017-09-03 15:14:50 |   1421915823 | 1356 MB
 264324 | 2017-09-03 21:41:27 |    787287519 | 751 MB
 264319 | 2017-09-03 03:05:03 |    537822887 | 513 MB
 264321 | 2017-09-03 03:05:02 |    509798405 | 486 MB
 264329 | 2017-09-03 03:05:02 |    375857649 | 358 MB
 264333 | 2017-09-03 03:05:02 |     62354023 | 59 MB
 264328 | 2017-09-03 03:05:01 |     37254946 | 36 MB
 264331 | 2017-09-03 03:05:03 |     22460534 | 21 MB
 264322 | 2017-09-03 03:14:49 |     10758898 | 10 MB
 264327 | 2017-09-03 03:05:02 |       215605 | 211 kB
(24 rows)
I am going to try this order, mostly because I hope to saturate the 10G fiber connection between the bacula-sd where the backups sit on disk (the FullFile pool) and the bacula-sd on r610, where the tape library is located.
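Switching to this order is a one-line change to the Selection Pattern in the job definition shown earlier; the final clause changes from

    ORDER BY J.StartTime
    -- to
    ORDER BY J.JobBytes DESC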
I just started this job. I will monitor progress and report back with any issues.
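While it runs, progress can be watched from bconsole by asking the storage daemon what it is up to; something like this, with r610 being the Storage resource named in the job, produces output of the kind shown in the update below:

    *status storage=r610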
Update 1
The jobs now writing to tape are:
Running Jobs:
Writing: Incremental Backup job slocum_jail_snapshots JobId=264380 Volume="000031L4"
    pool="FullsLTO4" device="LTO_0" (/dev/nsa0)
    spooling=0 despooling=0 despool_wait=0
    Files=407 Bytes=43,766,102 AveBytes/sec=7,294,350 LastBytes/sec=7,294,350
    FDReadSeqNo=2,982 in_msg=2206 out_msg=5 fd=23
Writing: Incremental Backup job knew_jail_snapshots JobId=264382 Volume="000031L4"
    pool="FullsLTO4" device="LTO_0" (/dev/nsa0)
    spooling=0 despooling=0 despool_wait=0
    Files=1,409 Bytes=43,161,404 AveBytes/sec=7,193,567 LastBytes/sec=7,193,567
    FDReadSeqNo=7,937 in_msg=5459 out_msg=5 fd=28
Writing: Incremental Backup job dent JobId=264384 Volume="000031L4"
    pool="FullsLTO4" device="LTO_0" (/dev/nsa0)
    spooling=0 despooling=0 despool_wait=0
    Files=672 Bytes=40,650,772 AveBytes/sec=6,775,128 LastBytes/sec=6,775,128
    FDReadSeqNo=5,782 in_msg=4047 out_msg=5 fd=30
Writing: Incremental Backup job BackupCatalog JobId=264388 Volume="000031L4"
    pool="FullsLTO4" device="LTO_0" (/dev/nsa0)
    spooling=0 despooling=0 despool_wait=0
    Files=1 Bytes=42,860,675 AveBytes/sec=7,143,445 LastBytes/sec=7,143,445
    FDReadSeqNo=664 in_msg=663 out_msg=5 fd=29
Writing: Incremental Backup job zuul_jail_snapshots JobId=264386 Volume="000031L4"
    pool="FullsLTO4" device="LTO_0" (/dev/nsa0)
    spooling=0 despooling=0 despool_wait=0
    Files=119 Bytes=41,492,440 AveBytes/sec=6,915,406 LastBytes/sec=6,915,406
    FDReadSeqNo=1,223 in_msg=1026 out_msg=5 fd=33
Writing: Incremental Backup job supernews_FP_msgs JobId=264390 Volume="000031L4"
    pool="FullsLTO4" device="LTO_0" (/dev/nsa0)
    spooling=0 despooling=0 despool_wait=0
    Files=2,728 Bytes=41,563,952 AveBytes/sec=6,927,325 LastBytes/sec=6,927,325
    FDReadSeqNo=24,710 in_msg=16540 out_msg=5 fd=32
Writing: Incremental Backup job supernews JobId=264392 Volume="000031L4"
    pool="FullsLTO4" device="LTO_0" (/dev/nsa0)
    spooling=0 despooling=0 despool_wait=0
    Files=2,608 Bytes=41,647,917 AveBytes/sec=6,941,319 LastBytes/sec=6,941,319
    FDReadSeqNo=15,841 in_msg=10613 out_msg=5 fd=31
Writing: Incremental Backup job svn_everything JobId=264396 Volume="000031L4"
    pool="FullsLTO4" device="LTO_0" (/dev/nsa0)
    spooling=0 despooling=0 despool_wait=0
    Files=874 Bytes=43,871,136 AveBytes/sec=7,311,856 LastBytes/sec=7,311,856
    FDReadSeqNo=7,700 in_msg=5343 out_msg=5 fd=22
Writing: Incremental Backup job mailjail_snapshot JobId=264394 Volume="000031L4"
    pool="FullsLTO4" device="LTO_0" (/dev/nsa0)
    spooling=0 despooling=0 despool_wait=0
    Files=472 Bytes=39,419,836 AveBytes/sec=6,569,972 LastBytes/sec=6,569,972
    FDReadSeqNo=4,508 in_msg=3188 out_msg=5 fd=24
Writing: Incremental Backup job slocum_home JobId=264400 Volume="000031L4"
    pool="FullsLTO4" device="LTO_0" (/dev/nsa0)
    spooling=0 despooling=0 despool_wait=0
    Files=54 Bytes=40,756,369 AveBytes/sec=6,792,728 LastBytes/sec=6,792,728
    FDReadSeqNo=954 in_msg=843 out_msg=5 fd=25
====