Bacula helps me solve a mergemaster problem which broke Bacula

Earlier today, I saw a bunch of emails like this:

for <root@localhos>; Wed, 21 Jan 2015 20:32:36 +0000 (UTC)
From: root@localhost.example.org (Bacula)
Subject: Bacula: Copy Fatal Error of crey-fd Full
Sender: bacula@bacula.example.org
To: root@localhost.example.org
Date: Wed, 21 Jan 2015 20:32:36 +0000 (UTC)
Message-Id: <20150121203236.E993D2B28@bacula.example.org>
X-Gm-Original-To: dan@example.org

21-Jan 20:32 bacula-dir JobId 197313: Warning: FileSet MD5 digest not found.
21-Jan 20:32 bacula-dir JobId 197313: The following 1 JobId was chosen to be copied: 197238
21-Jan 20:32 bacula-dir JobId 197313: Copying using JobId=197238 Job=fileserver_basic.2015-01-21_05.55.00_15
21-Jan 20:32 bacula-dir JobId 197313: Bootstrap records written to /usr/local/bacula/working/bacula-dir.restore.152.bsr
21-Jan 20:32 bacula-dir JobId 197313: Start Copying JobId 197313, Job=CopyToTape-Inc-Just-One.2015-01-21_20.32.33_31
21-Jan 20:32 bacula-dir JobId 197313: Using Device "CreyFile" to read.
21-Jan 20:32 crey-sd JobId 197313: Ready to read from volume "IncrAuto-4354" on file device "CreyFile" (/usr/local/bacula/volumes).
21-Jan 20:32 crey-sd JobId 197313: Forward spacing Volume "IncrAuto-4354" to file:block 0:3419975122.
21-Jan 20:32 crey-sd JobId 197313: Error: bsock.c:432 Wrote 65540 bytes to client:Hostname not found:9103, but only 16384 accepted.
21-Jan 20:32 crey-sd JobId 197313: Fatal error: read.c:267 Error sending to File daemon. ERR=Broken pipe
21-Jan 20:32 crey-sd JobId 197313: Elapsed time=00:00:01, Transfer rate=1.362 M Bytes/second
21-Jan 20:32 crey-sd JobId 197313: Error: bsock.c:370 Socket has errors=1 on call to client:Hostname not found:9103
21-Jan 20:32 crey-sd JobId 197313: Fatal error: fd_cmds.c:138 Read data not accepted
21-Jan 20:32 crey-sd JobId 197313: Error: bsock.c:370 Socket has errors=1 on call to client:Hostname not found:9103
21-Jan 20:32 bacula-dir JobId 197313: Error: Bacula bacula-dir 7.0.5 (28Jul14):
  Build OS:               amd64-portbld-freebsd9.3 freebsd 9.3-RELEASE-p7
  Prev Backup JobId:      197238
  Prev Backup Job:        fileserver_basic.2015-01-21_05.55.00_15
  New Backup JobId:       197314
  Current JobId:          197313
  Current Job:            CopyToTape-Inc-Just-One.2015-01-21_20.32.33_31
  Backup Level:           Full
  Client:                 crey-fd
  FileSet:                "EmptyCopyToTape" 2011-02-20 20:53:31
  Read Pool:              "IncrFile" (From Job resource)
  Read Storage:           "CreyFile" (From Pool resource)
  Write Pool:             "Incrementals" (From Pool's NextPool resource)
  Write Storage:          "CompaqStorageWorksTapeLibrary" (From Pool's NextPool resource)
  Catalog:                "MyCatalog" (From Client resource)
  Start time:             21-Jan-2015 20:32:36
  End time:               21-Jan-2015 20:32:36
  Elapsed time:           0 secs
  Priority:               410
  SD Files Written:       21
  SD Bytes Written:       1,362,218 (1.362 MB)
  Rate:                   0.0 KB/s
  Volume name(s):         
  Volume Session Id:      2
  Volume Session Time:    1421869857
  Last Volume Bytes:      0 (0 B)
  SD Errors:              3
  SD termination status:  Error
  Termination:            *** Copying Error ***

This meant all my copy jobs were failing.

I started looking at permissions because I had recently run mergemaster on the jail in which bacula-sd ran.

No luck. That all looked good.

Eventually, I look at groups.

$ id bacula
uid=910(bacula) gid=910(bacula) groups=910(bacula)

That’s it; bacula is not part of the operator group.

To confirm this, I restored /etc/group from a few days ago. Then I did a diff:

[root@knew:/tmp/bacula-restores/usr/jails/crey/.zfs/snapshot/snapshot-for-backup/etc] # diff group /usr/jails/crey/etc/group
1c1
< # $FreeBSD: release/9.1.0/etc/group 218046 2011-01-28 22:28:12Z pjd $
---
> # $FreeBSD: releng/9.3/etc/group 256102 2013-10-07 08:19:41Z des $
8c8
< operator:*:5:root,bacula
---
> operator:*:5:root

BINGO!

That is confirmation. I amended the file, restarted the jail, and everything was back to normal.

Website Pin Facebook Twitter Myspace Friendfeed Technorati del.icio.us Digg Google StumbleUpon Premium Responsive

Leave a Comment

Scroll to Top