I recently wrote about copying jobs with a very good backup tool. My goal was to keep copies of the backups on both disk and tape. Since then, I started playing around with a little webpage to keep tabs on what’s in my tape libary. Today I noticed a little problem. I have backups from 75 days ago that are waiting to be copied to tape.
I’ll just let you think about why this happened. Yesterday, this queue held jobs that were about 16 hours old. How did this old job get into the queue?
To answer this question, we need to start with the query used to display this information:
Pool | Size | Count | age |
---|---|---|---|
DiffFile | 636 MB | 1 | 11:15:56 |
FullFile | 2627 MB | 2 | 75 days 16:49:56 |
IncrFile | 16 GB | 7 | 17:30:57 |
SELECT P.Name as name, pg_size_pretty( sum( coalesce( J.jobbytes, 0 ))::bigint) as size, count(J.jobid) as count, date_trunc( 'second', current_timestamp - min(J.endtime) ) as max_wait FROM pool P LEFT OUTER JOIN job J ON P.PoolId = J.PoolId AND J.Type = 'B' AND J.JobStatus IN ('T','W') AND J.jobBytes > 0 AND J.JobId NOT IN (SELECT PriorJobId FROM Job WHERE Type IN ('B','C') AND Job.JobStatus IN ('T','W') AND PriorJobId != 0) WHERE P.name IN GROUP BY 1 ORDER BY P.name
Jobs that are to be copied to disk are contained within the three specified pools (‘IncrFile’ , ‘DiffFile’, ‘FullFile’). The query looks for any job in those pools that do not appear as a PriorJobId in another job.
Thus, the problem must be with the copy jobs; that is the jobid of the jobs that need to be copied no longer appear in another job’s PriorJobId field.
Thus, the copy jobs must have been removed from the database. This can and does happen as part of Bacula’s pruning processes. But I wanted to confirm my suspicions.
I restored a backup of my database from two days ago and ran some queries.
The two jobs waiting to be copied are:
bacula=# select jobid, name, schedtime, endtime, PriorJobId from job where jobid in (40533, 47048); jobid | name | schedtime | endtime | priorjobid -------+---------------------+---------------------+---------------------+------------ 40533 | nyi maildir tarball | 2010-12-03 05:55:02 | 2010-12-03 07:30:04 | 0 47048 | nyi maildir tarball | 2011-01-18 05:55:02 | 2011-01-18 07:36:33 | 0 (2 rows) bacula=#
Do those jobs appear in a priorjobid field?
bacula=# select jobid, name, schedtime, endtime, PriorJobId from job where priorjobid in (40533, 47048); jobid | name | schedtime | endtime | priorjobid -------+---------------------+---------------------+---------------------+------------ 49612 | nyi maildir tarball | 2011-01-31 23:42:53 | 2010-12-03 07:30:04 | 40533 49973 | nyi maildir tarball | 2011-02-01 01:23:46 | 2010-12-03 07:30:04 | 40533 (2 rows) bacula=#
Yes, one does. And twice. It seems this situation has occurred before, but without my recently created webpage tool, I missed it.
Now let’s run that last query on the database from a few days ago:
bacula=# select jobid, name, schedtime, endtime, PriorJobId from job where priorjobid in (40533, 47048); jobid | name | schedtime | endtime | priorjobid -------+---------------------+---------------------+---------------------+------------ 49612 | nyi maildir tarball | 2011-01-31 23:42:53 | 2010-12-03 07:30:04 | 40533 47142 | nyi maildir tarball | 2011-01-18 16:47:54 | 2011-01-18 07:36:33 | 47048 49973 | nyi maildir tarball | 2011-02-01 01:23:46 | 2010-12-03 07:30:04 | 40533 50137 | nyi maildir tarball | 2011-02-01 01:59:32 | 2010-12-03 07:30:04 | 40533 (4 rows) bacula=#
OK, there’s four jobs. What tapes did those jobs go onto?
bacula=# select mediaid, volumename, lastwritten, voljobs, firstwritten from media where mediaid in bacula-# (select mediaid from jobmedia where jobid in bacula(# (select jobid from job where priorjobid in (40533, 47048))); mediaid | volumename | lastwritten | voljobs | firstwritten ---------+------------+---------------------+---------+--------------------- 1292 | ETU017 | 2011-02-01 02:07:00 | 2 | 2011-01-06 08:19:57 1297 | ETU012 | 2011-01-26 09:37:54 | 65 | 2011-01-17 13:15:03 (2 rows) bacula=#
Oh. I recognize ETU017. I put that tape into the tape library this morning. It had been recycled. All previous data on that tape has been overwritten. This is why the 75-day old job is again in the queue.
OK, but why… why was that tape recycled. What is the retention period on that volume?
bacula=# select volretention / 24 / 60 /60 from media where volumename = 'ETU017'; ?column? ---------- 28 (1 row)
That’s 28 days. OK. That explains the recycling of that volume. The jobs in question were all more than 28 days old:
bacula=# select jobid, endtime from job where priorjobid in (40533, 47048); jobid | endtime -------+--------------------- 49612 | 2010-12-03 07:30:04 47142 | 2011-01-18 07:36:33 49973 | 2010-12-03 07:30:04 50137 | 2010-12-03 07:30:04
OK, so now I’m feeling better.
The problem is the retention of the backups on disk.
bacula=# select mediaid, volumename, volretention / 24 / 60 / 60 from media where mediaid in bacula-# (select mediaid from jobmedia where jobid in bacula(# (select jobid from job where jobid in (40533, 47048))); mediaid | volumename | ?column? ---------+---------------+---------- 1361 | FullAuto-1361 | 1095 1514 | FullAuto-1514 | 1095 1518 | FullAuto-1518 | 1095 (3 rows) bacula=#
That’s 3 years. They are retained much longer than the backups on tape. I think the solution is to match the retention periods do the copy-to-tape is not repeated. How I do this will require a bit of thought. I won’t be doing it tonight.
[later that day]
Yes, I am looking into it. What a mess.
bacula=# SELECT bacula-# job.jobid, bacula-# pool."name", bacula-# job."level", bacula-# job."name", bacula-# job.endtime, bacula-# media.volumename, extract ( days from age(Job.endtime )) as max_wait bacula-# FROM bacula-# public.pool, bacula-# public.job, bacula-# public.jobmedia, bacula-# public.media bacula-# WHERE bacula-# job.jobid = jobmedia.jobid AND bacula-# media.mediaid = jobmedia.mediaid AND bacula-# media.poolid = pool.poolid AND bacula-# pool."name" IN ('Incrementals', 'Differentials', 'Fulls') bacula-# AND bacula-# ((pool."name" = 'Fulls' and job.level != 'F') OR bacula(# (pool.name = 'Incrementals' and job.level != 'I') OR bacula(# (pool.name = 'Differentials' and job.level != 'D')) bacula-# ORDER by jobid; jobid | name | level | name | endtime | volumename | max_wait -------+--------------+-------+---------------------+---------------------+------------+---------- 40152 | Fulls | I | kraken tape | 2010-11-24 06:14:42 | ETU006 | 22 40166 | Fulls | I | ngaio | 2010-11-24 16:47:51 | ETU006 | 22 40168 | Fulls | I | ngaio | 2010-11-24 16:47:51 | ETU006 | 22 40170 | Fulls | I | ngaio | 2010-11-24 16:47:51 | ETU006 | 22 40172 | Fulls | I | ngaio | 2010-11-24 16:47:51 | ETU006 | 22 40175 | Fulls | I | ngaio | 2010-11-24 16:47:51 | ETU006 | 22 40177 | Fulls | I | ngaio | 2010-11-24 16:47:51 | ETU006 | 22 40179 | Fulls | I | ngaio | 2010-11-24 16:47:51 | ETU006 | 22 40181 | Fulls | I | ngaio | 2010-11-24 16:47:51 | ETU006 | 22 40183 | Fulls | I | ngaio | 2010-11-24 16:47:51 | ETU006 | 22 40185 | Fulls | I | ngaio | 2010-11-24 16:47:51 | ETU006 | 22 40201 | Fulls | I | kraken tape | 2010-11-25 05:59:22 | ETU006 | 21 40215 | Fulls | I | kraken tape | 2010-11-26 00:55:43 | ETU006 | 20 40217 | Fulls | I | ngaio | 2010-11-24 16:47:51 | ETU006 | 22 40228 | Fulls | I | ngaio | 2010-11-24 16:47:51 | ETU004 | 22 40230 | Fulls | I | ngaio | 2010-11-24 16:47:51 | ETU004 | 22 40244 | Fulls | I | kraken tape | 2010-11-26 05:59:18 | ETU004 | 20 40256 | Fulls | I | ngaio | 2010-11-24 16:47:51 | ETU004 | 22 40258 | Fulls | I | ngaio | 2010-11-24 16:47:51 | ETU004 | 22 40260 | Fulls | I | ngaio | 2010-11-24 16:47:51 | ETU004 | 22 40262 | Fulls | I | ngaio | 2010-11-24 16:47:51 | ETU004 | 22 40264 | Fulls | I | ngaio | 2010-11-24 16:47:51 | ETU004 | 22 40266 | Fulls | I | ngaio | 2010-11-24 16:47:51 | ETU004 | 22 40268 | Fulls | I | ngaio | 2010-11-24 16:47:51 | ETU004 | 22 40270 | Fulls | I | ngaio | 2010-11-24 16:47:51 | ETU004 | 22 40272 | Fulls | I | ngaio | 2010-11-24 16:47:51 | ETU004 | 22 40274 | Fulls | I | ngaio | 2010-11-24 16:47:51 | ETU004 | 22 40276 | Fulls | I | ngaio | 2010-11-24 16:47:51 | ETU004 | 22 40278 | Fulls | I | ngaio | 2010-11-24 16:47:51 | ETU004 | 22 40280 | Fulls | I | ngaio | 2010-11-24 16:47:51 | ETU004 | 22 40282 | Fulls | I | ngaio | 2010-11-24 16:47:51 | ETU004 | 22 40284 | Fulls | I | ngaio | 2010-11-24 16:47:51 | ETU004 | 22 40286 | Fulls | I | ngaio | 2010-11-24 16:47:51 | ETU004 | 22 40288 | Fulls | I | ngaio | 2010-11-24 16:47:51 | ETU004 | 22 40290 | Fulls | I | ngaio | 2010-11-24 16:47:51 | ETU004 | 22 40292 | Fulls | I | ngaio | 2010-11-24 16:47:51 | ETU004 | 22 40294 | Fulls | I | ngaio | 2010-11-24 16:47:51 | ETU004 | 22 40296 | Fulls | I | ngaio | 2010-11-24 16:47:51 | ETU004 | 22 40298 | Fulls | I | ngaio | 2010-11-24 16:47:51 | ETU004 | 22 40300 | Fulls | I | ngaio | 2010-11-24 16:47:51 | ETU004 | 22 40302 | Fulls | I | ngaio | 2010-11-24 16:47:51 | ETU004 | 22 40304 | Fulls | I | ngaio | 2010-11-24 16:47:51 | ETU004 | 22 40306 | Fulls | I | ngaio | 2010-11-24 16:47:51 | ETU004 | 22 40308 | Fulls | I | ngaio | 2010-11-24 16:47:51 | ETU004 | 22 40310 | Fulls | I | ngaio | 2010-11-24 16:47:51 | ETU004 | 22 40312 | Fulls | I | ngaio | 2010-11-24 16:47:51 | ETU004 | 22 40314 | Fulls | I | ngaio | 2010-11-24 16:47:51 | ETU004 | 22 40316 | Fulls | I | ngaio | 2010-11-24 16:47:51 | ETU004 | 22 40318 | Fulls | I | ngaio | 2010-11-24 16:47:51 | ETU004 | 22 40320 | Fulls | I | ngaio | 2010-11-24 16:47:51 | ETU004 | 22 40322 | Fulls | I | ngaio | 2010-11-24 16:47:51 | ETU004 | 22 40324 | Fulls | I | ngaio | 2010-11-24 16:47:51 | ETU004 | 22 40339 | Fulls | I | kraken tape | 2010-11-27 05:59:16 | ETU004 | 19 40366 | Fulls | D | kraken tape | 2010-11-28 06:00:24 | ETU004 | 18 40393 | Fulls | I | kraken tape | 2010-11-29 05:59:18 | ETU004 | 17 40420 | Fulls | I | kraken tape | 2010-11-30 05:59:17 | ETU004 | 16 40447 | Fulls | I | kraken tape | 2010-12-01 05:59:30 | ETU004 | 15 40476 | Fulls | I | kraken tape | 2010-12-02 06:00:51 | ETU004 | 14 47324 | Incrementals | F | bast home | 2011-01-20 06:02:51 | ETU011 | 27 47326 | Incrementals | F | nyi maildir tarball | 2011-01-20 07:42:35 | ETU011 | 27 47326 | Incrementals | F | nyi maildir tarball | 2011-01-20 07:42:35 | ETU011 | 27 47430 | Incrementals | F | nyi maildir tarball | 2011-01-21 07:36:35 | ETU011 | 26 47430 | Incrementals | F | nyi maildir tarball | 2011-01-21 07:36:35 | ETU011 | 26 47577 | Incrementals | F | nyi maildir tarball | 2011-01-22 07:36:13 | ETU000 | 25 47577 | Incrementals | F | nyi maildir tarball | 2011-01-22 07:36:13 | ETU000 | 25 47625 | Incrementals | D | wocker | 2011-01-23 05:59:52 | ETU000 | 24 47627 | Incrementals | D | bast basic | 2011-01-23 06:00:37 | ETU000 | 24 47629 | Incrementals | D | bast home | 2011-01-23 06:00:59 | ETU000 | 24 47631 | Incrementals | D | dbclone basic | 2011-01-23 06:01:14 | ETU000 | 24 47633 | Incrementals | D | dbclone home | 2011-01-23 06:05:40 | ETU000 | 24 47633 | Incrementals | D | dbclone home | 2011-01-23 06:05:40 | ETU000 | 24 47635 | Incrementals | D | polo basic | 2011-01-23 06:05:54 | ETU000 | 24 47637 | Incrementals | D | polo | 2011-01-23 06:14:40 | ETU000 | 24 47639 | Incrementals | D | kraken basic | 2011-01-23 06:14:50 | ETU000 | 24 47645 | Incrementals | D | wocker | 2011-01-23 05:59:52 | ETU000 | 24 47647 | Incrementals | D | bast basic | 2011-01-23 06:00:37 | ETU000 | 24 47649 | Incrementals | D | bast home | 2011-01-23 06:00:59 | ETU000 | 24 47651 | Incrementals | D | dbclone basic | 2011-01-23 06:01:14 | ETU000 | 24 47653 | Incrementals | D | dbclone home | 2011-01-23 06:05:40 | ETU000 | 24 47653 | Incrementals | D | dbclone home | 2011-01-23 06:05:40 | ETU000 | 24 47655 | Incrementals | D | polo basic | 2011-01-23 06:05:54 | ETU000 | 24 47657 | Incrementals | D | polo | 2011-01-23 06:14:40 | ETU000 | 24 47659 | Incrementals | D | kraken basic | 2011-01-23 06:14:50 | ETU000 | 24 47661 | Incrementals | D | kraken | 2011-01-23 06:14:54 | ETU000 | 24 47663 | Incrementals | D | ngaio basic | 2011-01-23 06:15:57 | ETU000 | 24 47665 | Incrementals | D | ngaio databases | 2011-01-23 06:48:27 | ETU000 | 24 47665 | Incrementals | D | ngaio databases | 2011-01-23 06:48:27 | ETU000 | 24 47665 | Incrementals | D | ngaio databases | 2011-01-23 06:48:27 | ETU000 | 24 47665 | Incrementals | D | ngaio databases | 2011-01-23 06:48:27 | ETU000 | 24 47665 | Incrementals | D | ngaio databases | 2011-01-23 06:48:27 | ETU000 | 24 47665 | Incrementals | D | ngaio databases | 2011-01-23 06:48:27 | ETU000 | 24 47665 | Incrementals | D | ngaio databases | 2011-01-23 06:48:27 | ETU000 | 24 47665 | Incrementals | D | ngaio databases | 2011-01-23 06:48:27 | ETU000 | 24 47665 | Incrementals | D | ngaio databases | 2011-01-23 06:48:27 | ETU000 | 24 47665 | Incrementals | D | ngaio databases | 2011-01-23 06:48:27 | ETU000 | 24 47665 | Incrementals | D | ngaio databases | 2011-01-23 06:48:27 | ETU000 | 24 47665 | Incrementals | D | ngaio databases | 2011-01-23 06:48:27 | ETU000 | 24 47665 | Incrementals | D | ngaio databases | 2011-01-23 06:48:27 | ETU000 | 24 47665 | Incrementals | D | ngaio databases | 2011-01-23 06:48:27 | ETU000 | 24 47665 | Incrementals | D | ngaio databases | 2011-01-23 06:48:27 | ETU000 | 24 47665 | Incrementals | D | ngaio databases | 2011-01-23 06:48:27 | ETU000 | 24 47665 | Incrementals | D | ngaio databases | 2011-01-23 06:48:27 | ETU000 | 24 47665 | Incrementals | D | ngaio databases | 2011-01-23 06:48:27 | ETU000 | 24 47667 | Incrementals | D | ngaio | 2011-01-23 06:50:03 | ETU000 | 24 47669 | Incrementals | D | nyi basic | 2011-01-23 07:44:10 | ETU000 | 24 47671 | Incrementals | D | nyi | 2011-01-23 07:49:55 | ETU000 | 24 47673 | Incrementals | D | nyi FP msgs | 2011-01-23 07:56:41 | ETU000 | 24 47675 | Incrementals | D | supernews FP msgs | 2011-01-23 07:59:42 | ETU000 | 24 47677 | Incrementals | D | supernews basic | 2011-01-23 08:06:02 | ETU000 | 24 47679 | Incrementals | D | supernews | 2011-01-23 08:10:54 | ETU000 | 24 47681 | Incrementals | D | latens basic | 2011-01-23 08:13:40 | ETU000 | 24 47695 | Incrementals | F | nyi maildir tarball | 2011-01-23 07:41:50 | ETU000 | 24 47695 | Incrementals | F | nyi maildir tarball | 2011-01-23 07:41:50 | ETU000 | 24 47848 | Incrementals | F | nyi maildir tarball | 2011-01-24 07:55:29 | ETU032 | 23 47848 | Incrementals | F | nyi maildir tarball | 2011-01-24 07:55:29 | ETU032 | 23 48397 | Incrementals | F | nyi maildir tarball | 2011-01-26 08:13:27 | ETU033 | 21 48397 | Incrementals | F | nyi maildir tarball | 2011-01-26 08:13:27 | ETU033 | 21 48668 | Incrementals | F | nyi maildir tarball | 2011-01-27 08:14:17 | ETU031 | 20 48668 | Incrementals | F | nyi maildir tarball | 2011-01-27 08:14:17 | ETU031 | 20 48838 | Incrementals | F | nyi maildir tarball | 2011-01-28 08:20:01 | ETU031 | 19 48838 | Incrementals | F | nyi maildir tarball | 2011-01-28 08:20:01 | ETU031 | 19 48913 | Incrementals | F | BackupCatalog | 2011-01-28 08:56:36 | ETU031 | 19 48913 | Incrementals | F | BackupCatalog | 2011-01-28 08:56:36 | ETU031 | 19 48915 | Incrementals | F | nyi maildir tarball | 2011-01-29 08:11:21 | ETU031 | 18 48915 | Incrementals | F | nyi maildir tarball | 2011-01-29 08:11:21 | ETU031 | 18 48917 | Incrementals | F | BackupCatalog | 2011-01-29 08:45:41 | ETU031 | 18 48917 | Incrementals | F | BackupCatalog | 2011-01-29 08:45:41 | ETU031 | 18 48919 | Incrementals | F | nyi maildir tarball | 2011-01-30 08:10:37 | ETU031 | 17 48919 | Incrementals | F | nyi maildir tarball | 2011-01-30 08:10:37 | ETU031 | 17 48924 | Incrementals | F | BackupCatalog | 2011-01-28 08:56:36 | ETU031 | 19 48924 | Incrementals | F | BackupCatalog | 2011-01-28 08:56:36 | ETU031 | 19 48926 | Incrementals | F | nyi maildir tarball | 2011-01-29 08:11:21 | ETU031 | 18 48926 | Incrementals | F | nyi maildir tarball | 2011-01-29 08:11:21 | ETU031 | 18 50344 | Fulls | I | ngaio | 2011-01-11 06:56:29 | ETU003 | 5 50346 | Fulls | I | nyi basic | 2011-01-11 07:37:56 | ETU003 | 5 51341 | Fulls | D | nyi maildir | 2011-01-30 12:34:32 | ETU003 | 17 51413 | Fulls | I | nyi maildir | 2011-02-03 16:16:53 | ETU003 | 13 51417 | Fulls | D | nyi maildir | 2011-02-03 14:52:01 | ETU003 | 13 51630 | Fulls | D | nyi maildir | 2011-02-04 12:41:41 | ETU009 | 12 51682 | Fulls | I | nyi maildir | 2011-02-04 14:05:39 | ETU009 | 12 51685 | Fulls | I | nyi maildir | 2011-02-04 16:05:29 | ETU009 | 12 51688 | Fulls | I | nyi maildir | 2011-02-04 18:05:23 | ETU009 | 12 51697 | Fulls | I | nyi maildir | 2011-02-05 00:08:57 | ETU009 | 11 51766 | Fulls | D | nyi maildir | 2011-02-06 15:34:02 | ETU009 | 10 51840 | Fulls | I | nyi maildir | 2011-02-06 18:06:22 | ETU009 | 10 51843 | Fulls | I | nyi maildir | 2011-02-06 20:21:51 | ETU009 | 10 51897 | Fulls | I | nyi maildir | 2011-02-07 01:08:57 | ETU005 | 9 51900 | Fulls | I | nyi maildir | 2011-02-07 02:05:44 | ETU005 | 9 51972 | Fulls | I | nyi maildir | 2011-02-07 16:00:15 | ETU005 | 9 51974 | Fulls | D | nyi maildir | 2011-02-07 12:28:08 | ETU005 | 9 51990 | Fulls | I | nyi maildir | 2011-02-08 00:00:19 | ETU005 | 8 51997 | Fulls | I | nyi maildir | 2011-02-08 02:00:11 | ETU005 | 8 52063 | Fulls | I | nyi maildir | 2011-02-08 08:01:29 | ETU005 | 8 52073 | Fulls | D | nyi maildir | 2011-02-08 12:19:59 | ETU005 | 8 52077 | Fulls | I | nyi maildir | 2011-02-08 14:00:32 | ETU005 | 8 52161 | Fulls | I | nyi maildir | 2011-02-09 08:03:22 | ETU005 | 7 52249 | Fulls | I | supernews basic | 2011-02-10 08:29:15 | ETU005 | 6 52251 | Fulls | D | nyi maildir | 2011-02-09 12:18:46 | ETU005 | 7 52350 | Fulls | I | nyi maildir | 2011-02-11 08:01:35 | ETU005 | 5 52352 | Fulls | D | nyi maildir | 2011-02-10 12:19:54 | ETU005 | 6 52453 | Fulls | I | nyi maildir | 2011-02-12 08:09:39 | ETU005 | 4 52455 | Fulls | D | nyi maildir | 2011-02-11 12:18:52 | ETU005 | 5 52514 | Fulls | I | nyi maildir | 2011-02-13 08:01:20 | ETU005 | 3 52548 | Fulls | D | latens basic | 2011-02-13 07:50:53 | ETU005 | 3 52640 | Fulls | I | nyi maildir | 2011-02-14 08:30:11 | ETU005 | 2 52642 | Fulls | D | nyi maildir | 2011-02-13 12:25:07 | ETU005 | 3 52735 | Fulls | I | nyi maildir | 2011-02-15 08:08:33 | ETU005 | 1 52737 | Fulls | D | nyi maildir | 2011-02-14 12:24:33 | ETU005 | 2 52834 | Fulls | I | nyi maildir | 2011-02-16 08:08:35 | ETU005 | 0 52836 | Fulls | D | nyi maildir | 2011-02-15 12:21:20 | ETU005 | 1 (170 rows) bacula=#
I’ve got incremental and differentials in my full pool, and fulls and differentials in my incrementals. Some of these are quite recent (today). I think a close examination of my bacula-dir.conf file is in order.
After looking a bit closers at the pools and retention periods, I discovered that that the backups in question were in the wrong pool. The backups were FULL backups. The tape they were copied to is in the Incrementals pool. I suspect this happened because of the nature of my previous Copy-To-Tape strategy and perhaps during some of my experiments. You will recall these backups are 75 days old.
I’ve added this query to my services website so I can track it. I think I’ve fixed the wrong-pool problem.
https://services.unixathome.org/bacula/wrong-pool.php