Last week, while at EuroBSDCon in Malta, I noticed that one of my servers had the wrong time. It was Bacula who told me, through this message in one of the backup jobs:
28-Sep 21:59 nyi-fd JobId 144899: DIR and FD clocks differ by -5 seconds, FD automatically compensating
Fixing the time
I connected to all my systems and ran date(1). One system was off by 2 seconds, and another was off by 5 seconds. I mistakenly thought the problem might be the upstream NTP servers I was using, so I altered /etc/ntp.conf to stop using the freebsd.pool.ntp.org servers and use the us.pool.ntp.org servers instead.
Then I restarted ntpd and looked at the log file:
Sep 29 06:29:59 slocum ntpd[1353]: ntpd exiting on signal 15
Sep 29 06:29:59 slocum ntpd[1779]: ntpd 4.2.4p5-a (1)
Sep 29 06:30:00 slocum ntpd[1780]: sendto(18.85.44.118) (fd=22): Operation not permitted
Sep 29 06:30:01 slocum ntpd[1780]: sendto(72.8.140.240) (fd=22): Operation not permitted
Sep 29 06:30:02 slocum ntpd[1780]: sendto(199.102.46.72) (fd=22): Operation not permitted
Oh. Umm. That is clearly the firewall. I added the following rules to /etc/pf.conf:
# get ntp working
pass out quick on $EXT_IF inet proto tcp from any to any port ntp flags S/SA keep state
pass out quick on $EXT_IF inet proto udp from any to any port ntp keep state
Here, EXT_IF is a macro set to the NIC you are using (for me, em0).
That fixed it on that server, which was running FreeBSD 9.1-RELEASE.
The other server didn’t show the same errors, and was running FreeBSD 9.1-RELEASE-p6. As I’m typing this, I do not recall how I corrected the time on that server.
Checking the time
In case this problem occurred again, I wanted to find out about it as soon as possible. I started searching for Nagios plugins.
I found http://exchange.nagios.org/directory/Plugins/Network-Protocols/NTP-and-Time/check_daytime/details but I went with check_ntp_time from nagios_plugins in the FreeBSD ports tree. This plugin compares the time on the Nagios server to that of the server you are checking. I’m looking for any offset greater than 0.5s.
I already had an NTP host group in Nagios, so it was relatively simple to add a check_time service to the host group. With that approach, every host in the host group picked up the check as part of its Nagios regime.
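For anyone wanting to do the same, here is a rough sketch of what such a setup might look like in the Nagios object configuration. This is not my exact configuration: the host group name ntp-servers, the generic-service template, the service description, and the critical threshold of 1 second are all assumptions for illustration. Only the 0.5s warning threshold comes from the text above. $USER1$ and $HOSTADDRESS$ are the standard Nagios macros for the plugin directory and the address of the host being checked.

```
# Command wrapping check_ntp_time: warn above 0.5s offset (from the text),
# go critical above 1s (assumed value).
define command {
    command_name    check_time
    command_line    $USER1$/check_ntp_time -H $HOSTADDRESS$ -w 0.5 -c 1
}

# Attach the check to every member of the host group.
# "ntp-servers" and "generic-service" are hypothetical names.
define service {
    use                     generic-service
    hostgroup_name          ntp-servers
    service_description     NTP time offset
    check_command           check_time
}
```

Because the service is attached to the host group rather than to individual hosts, adding a new host to the group is enough to have its clock checked. Note that check_ntp_time queries the target over NTP, so each checked host needs to answer NTP requests from the Nagios server.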