Earlier today, I noticed the following output from a Bacula job:
24-Sep 14:14 bacula-dir JobId 38548: Start Backup JobId 38548, Job=latens_home.2010-09-24_14.12.38_31
24-Sep 14:14 bacula-dir JobId 38548: Using Device "MegaFile-latens"
24-Sep 14:09 latens-fd JobId 38548: DIR and FD clocks differ by -307 seconds, FD automatically compensating.
That’s 5 minutes. It shouldn’t be varying by that much.
So I started ntpd. That’s when I noticed it was not enabled in /etc/rc.conf, so it was never started at boot. I thought little more of it at the time. Later, I thought: let’s add that to my Nagios configuration. First stop: check nrpe2.cfg and see if there’s an entry for check_ntp. There was. Oh… well, let’s check Nagios. Oh, ntpd is already monitored. OK, what’s going on here?
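As an aside, the rc.conf part of this can be checked from a script. What follows is a minimal sketch, assuming a FreeBSD-style /etc/rc.conf in which the daemon is enabled with a line such as ntpd_enable="YES". The function name rc_enabled is made up for illustration, and the sketch only reads the one file it is given; it does not consult rc.conf.local or the system defaults.

```shell
#!/bin/sh
# Hypothetical helper: report whether an rc.conf-style file enables a variable.
# Usage: rc_enabled FILE VARIABLE   (e.g. rc_enabled /etc/rc.conf ntpd_enable)
# Returns 0 if the last uncommented assignment is a "yes" value, 1 otherwise.
rc_enabled() {
    file=$1
    var=$2
    # Extract the value of each uncommented assignment, with or without quotes,
    # and keep the last one because later lines override earlier ones.
    value=$(sed -n "s/^[[:space:]]*${var}=[\"']\{0,1\}\([^\"'#[:space:]]*\).*/\1/p" "$file" 2>/dev/null | tail -n 1)
    case $value in
        [Yy][Ee][Ss]|[Tt][Rr][Uu][Ee]|[Oo][Nn]|1) return 0 ;;
        *) return 1 ;;
    esac
}
```

Something like `rc_enabled /etc/rc.conf ntpd_enable || echo "ntpd is not enabled at boot"` would have flagged the missing entry.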
Let’s see what the check returns when ntpd is stopped:
$ /usr/local/libexec/nagios/check_nrpe2 -H latens-vpn -c check_ntp
/var/run/nrpe2.pid OK - 974 ?? Ss 3:20.45 /usr/local/sbin/nrpe2 -d -c /usr/local/etc/nrpe.cfg
Hmm, it says it’s running. Let’s see what’s on the client:
$ grep ntp /usr/local/etc/nrpe.cfg
command[check_ntp]=/usr/local/libexec/nagios/check_pid_sudo /var/run/nrpe2.pid
Oh, that’s the wrong pid file. This is checking nrpe, not ntpd. The correct command is:
command[check_ntp]=/usr/local/libexec/nagios/check_pid_sudo /var/run/ntpd.pid
After amending the sudoers file to account for the corrected command, things run correctly on the client:
$ /usr/local/libexec/nagios/check_pid_sudo /var/run/ntpd.pid
/var/run/ntpd.pid OK - 81358 ?? Ss 0:00.02 /usr/sbin/ntpd -c /etc/ntp.conf -p /var/run/ntpd.pid -f /var/db/ntpd.drift
And on the Nagios server:
$ /usr/local/libexec/nagios/check_nrpe2 -H latens-vpn -c check_ntp
/var/run/ntpd.pid OK - 81358 ?? Ss 0:00.02 /usr/sbin/ntpd -c /etc/ntp.conf -p /var/run/ntpd.pid -f /var/db/ntpd.drift
I checked the output of all other ntpd checks within Nagios. I found no other references to nrpe….
So is that a problem with the check_ntp script as distributed, or a user-induced mistake?
The problem was with me. I added the incorrect line to nrpe2.cfg.
I think what I needed to do was:
command[check_ntp]=/usr/local/libexec/nagios/check_pid_sudo /var/run/ntpd.pid
At least, that’s what’s running now.
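The wrong pid file went unnoticed because the check only confirms that some process is alive at the pid it finds, and nrpe2 was always running. One way to guard against that is to also compare the process name with the daemon you expect. The following is a rough sketch of that idea, not the check_pid_sudo script used above: the function name check_pid_name is hypothetical, and the exit codes assume the usual Nagios plugin convention (0 OK, 2 CRITICAL, 3 UNKNOWN).

```shell
#!/bin/sh
# Hypothetical helper, not the check_pid_sudo script from this post.
# Usage: check_pid_name PIDFILE EXPECTED_NAME
# Assumed Nagios plugin exit codes: 0 = OK, 2 = CRITICAL, 3 = UNKNOWN.
check_pid_name() {
    pidfile=$1
    expected=$2
    if [ ! -r "$pidfile" ]; then
        echo "UNKNOWN - cannot read $pidfile"
        return 3
    fi
    # Use the first line of the pid file, with any whitespace removed.
    pid=$(head -n 1 "$pidfile" | tr -d '[:space:]')
    case $pid in
        ''|*[!0-9]*)
            echo "UNKNOWN - $pidfile does not contain a pid"
            return 3 ;;
    esac
    # "comm=" prints the command name without a header line;
    # empty output means there is no such process.
    name=$(ps -p "$pid" -o comm= 2>/dev/null)
    name=${name##*/}   # drop a leading path, which some ps implementations print
    if [ -z "$name" ]; then
        echo "CRITICAL - no process with pid $pid"
        return 2
    fi
    if [ "$name" != "$expected" ]; then
        echo "CRITICAL - pid $pid is $name, expected $expected"
        return 2
    fi
    echo "OK - $expected running as pid $pid"
    return 0
}
```

Called as `check_pid_name /var/run/nrpe2.pid ntpd`, this would report CRITICAL because the process behind that pid file is nrpe2, which is exactly the mistake I made. Reading the pid file may still need sudo, as it does with the existing check.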