After moving the poudriere jail (pkg01) to the new host (r730-01), I noticed this message from Nagios:
That “email found in . /var/mail/dan” message is significant. In general mail on my hosts/jails is not delivered locally. It all goes off-host. That’s why I have this Nagios check. When mail like that is found, it’s either a configuration error or something local has gone wrong.
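The post does not show the Nagios check itself. As a rough sketch of what such a check could look like, assuming a spool directory like /var/mail and the usual Nagios exit codes (0 for OK, 2 for CRITICAL), the hypothetical function below (the name check_local_mail is made up for illustration) reports any non-empty mailbox:

```shell
#!/bin/sh
# Hypothetical sketch, not the check used in this post.
# check_local_mail DIR: exit 0 (OK) when DIR holds no non-empty mailbox,
# exit 2 (CRITICAL) when at least one mailbox has content.
check_local_mail() {
    spool_dir="${1:-/var/mail}"
    found=""
    for mbox in "$spool_dir"/*; do
        # -s is true only for files that exist and are larger than zero bytes
        if [ -f "$mbox" ] && [ -s "$mbox" ]; then
            found="$found $mbox"
        fi
    done
    if [ -n "$found" ]; then
        echo "CRITICAL: email found in$found"
        return 2
    fi
    echo "OK: no local mail in $spool_dir"
    return 0
}
```

Called as check_local_mail /var/mail, it would have flagged /var/mail/dan in this case.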
In this post:
- FreeBSD 13.1
- poudriere-3.3.7_1
NOTE: This is the spoiler: it was space, and I missed that. This also highlights the lack of monitoring in place for this newly created host. If I’d seen that alert first… For updates, search this post for NOTE: to find, in hindsight, where I went wrong in multiple spots.
What’s in the mail?
The email was rather interesting to me.
[pkg01 dan ~] % cat /var/mail/dan
From root@pkg01.int.unixathome.org Tue Feb 21 04:18:05 2023
Received: from root (uid 0) (envelope-from root@pkg01.int.unixathome.org) id 2224f by pkg01.int.unixathome.org (DragonFly Mail Agent v0.11+ on pkg01.int.unixathome.org); Tue, 21 Feb 2023 04:18:03 +0000
From: Cron Daemon <root@pkg01.int.unixathome.org>
To: dan
Subject: Cron <root@pkg01> /usr/bin/lockf -t 0 /tmp/.poudriere.build /root/bin/poudriere-builds.sh
X-Cron-Env: <PATH=/sbin:/bin:/usr/sbin:/usr/bin:/usr/games:/usr/local/sbin:/usr/local/bin:/usr/home/dan/bin>
X-Cron-Env: <SHELL=/bin/sh>
X-Cron-Env: <MAILTO=dan>
X-Cron-Env: <LOGNAME=root>
X-Cron-Env: <USER=root>
Date: Tue, 21 Feb 2023 04:18:05 +0000
Message-Id: <63f445fd.2224f.ad6e128@pkg01.int.unixathome.org>

printf: write error on stdout
error: cannot open '.git/FETCH_HEAD': No space left on device
[00:00:00] Error: fail
printf: write error on stdout
error: cannot open '.git/FETCH_HEAD': No space left on device
[00:00:01] Error: fail
chmod: /usr/local/poudriere/data/.m: No space left on device
chmod: /usr/local/poudriere/data/.m: No space left on device
chmod: /usr/local/poudriere/data/.m: No space left on device
chmod: /usr/local/poudriere/data/.m: No space left on device
chmod: /usr/local/poudriere/data/.m: No space left on device
chmod: /usr/local/poudriere/data/.m: No space left on device
chmod: /usr/local/poudriere/data/.m: No space left on device
chmod: /usr/local/poudriere/data/.m: No space left on device
chmod: /usr/local/poudriere/data/.m: No space left on device
chmod: /usr/local/poudriere/data/.m: No space left on device
chmod: /usr/local/poudriere/data/.m: No space left on device
chmod: /usr/local/poudriere/data/.m: No space left on device
chmod: /usr/local/poudriere/data/.m: No space left on device
chmod: /usr/local/poudriere/data/.m: No space left on device
This is the output from this crontab:
[pkg01 dan ~] % sudo crontab -l -u root
PATH=/sbin:/bin:/usr/sbin:/usr/bin:/usr/games:/usr/local/sbin:/usr/local/bin:/usr/home/dan/bin
# use /bin/sh to run commands, overriding the default set by cron
SHELL=/bin/sh
# mail any output to `dan', no matter whose crontab this is
MAILTO=dan
18 4 * * * /usr/bin/lockf -t 0 /tmp/.poudriere.build /root/bin/poudriere-builds.sh
There’s my configuration error. I changed that MAILTO to MAILTO=dan@langille.org
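Since the same mistake could be hiding in other crontabs, a small filter can list MAILTO values that have no off-host address. This is only a sketch: the function name local_mailto is invented here, it treats any value without an @ as local, and it does not split comma-separated recipient lists.

```shell
#!/bin/sh
# Hypothetical helper: read crontab text on stdin and print every
# MAILTO value that contains no "@" (and so would be delivered locally).
# An empty MAILTO (which disables mail) is ignored.
local_mailto() {
    awk -F= '/^[ \t]*MAILTO[ \t]*=/ {
        val = $2
        # strip surrounding blanks and double quotes
        gsub(/^[ \t"]+|[ \t"]+$/, "", val)
        if (val != "" && val !~ /@/) print val
    }'
}
```

For example, sudo crontab -l -u root | local_mailto would have printed dan for the crontab shown above, and nothing after the change.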
Why did this work before? I had Postfix running in most jails. It is my favorite MTA and has been installed by default (by me) on most things. Lately, I have moved to using DMA instead. It is part of base-FreeBSD, is easily configured, and (so far) reliable.
Back to the new problem at hand
The problem above is easily reproduced on the command line:
[pkg01 dan ~] % sudo poudriere ports -u -p default
[00:00:00] Updating portstree "default" with git...error: cannot open '.git/FETCH_HEAD': No space left on device
[00:00:00] Error: fail

So what’s going on? My first idea: something is mounted read-only. No, that’s not the case.
[pkg01 dan ~] % zfs get -r -t filesystem readonly
NAME                                PROPERTY  VALUE  SOURCE
data01                              readonly  off    default
data01/poudriere                    readonly  off    default
data01/poudriere/ccache.13amd64     readonly  off    default
data01/poudriere/ccache.13i386      readonly  off    default
data01/poudriere/ccache.amd64       readonly  off    default
data01/poudriere/ccache.i386        readonly  off    default
data01/poudriere/data               readonly  off    default
data01/poudriere/data/cache         readonly  off    default
data01/poudriere/data/cronjob-logs  readonly  off    default
data01/poudriere/data/packages      readonly  off    default
data01/poudriere/distfiles          readonly  off    default
data01/poudriere/jails              readonly  off    default
data01/poudriere/jails/114R         readonly  off    default
data01/poudriere/jails/121amd64     readonly  off    default
data01/poudriere/jails/121i386      readonly  off    default
data01/poudriere/jails/131amd64     readonly  off    default
data01/poudriere/jails/131i386      readonly  off    default
data01/poudriere/jails/13amd64      readonly  off    default
data01/poudriere/jails/13i386       readonly  off    default
data01/poudriere/ports              readonly  off    default
data01/poudriere/ports/default      readonly  off    default
data01/poudriere/ports/main         readonly  off    default
data01/poudriere/ports/testing      readonly  off    default
data01/poudriere/test               readonly  off    default
[pkg01 dan ~] %
There is plenty of space:
[pkg01 dan ~] % zfs list
NAME                                USED   AVAIL  REFER  MOUNTPOINT
data01                              4.01T  0B     205K   none
data01/poudriere                    258G   0B     239K   /usr/local/poudriere
data01/poudriere/ccache.13amd64     205K   0B     205K   /usr/local/poudriere/ccache.13amd64
data01/poudriere/ccache.13i386      205K   0B     205K   /usr/local/poudriere/ccache.13i386
data01/poudriere/ccache.amd64       52.8G  0B     47.3G  /usr/local/poudriere/ccache.amd64
data01/poudriere/ccache.i386        9.27G  0B     6.07G  /usr/local/poudriere/ccache.i386
data01/poudriere/data               93.7G  0B     29.0G  /usr/local/poudriere/data
data01/poudriere/data/cache         199M   0B     69.4M  /usr/local/poudriere/data/cache
data01/poudriere/data/cronjob-logs  10.6M  0B     2.96M  /usr/local/poudriere/data/cronjob-logs
data01/poudriere/data/packages      62.9G  0B     54.0G  /usr/local/poudriere/data/packages
data01/poudriere/distfiles          67.6G  0B     67.5G  /usr/ports/distfiles
data01/poudriere/jails              13.0G  0B     239K   /usr/local/poudriere/jails
data01/poudriere/jails/114R         1.76G  0B     1.76G  /usr/local/poudriere/jails/114R
data01/poudriere/jails/121amd64     2.00G  0B     2.00G  /usr/local/poudriere/jails/121amd64
data01/poudriere/jails/121i386      1.73G  0B     1.73G  /usr/local/poudriere/jails/121i386
data01/poudriere/jails/131amd64     2.04G  0B     2.04G  /usr/local/poudriere/jails/131amd64
data01/poudriere/jails/131i386      1.72G  0B     1.72G  /usr/local/poudriere/jails/131i386
data01/poudriere/jails/13amd64      2.04G  0B     2.04G  /usr/local/poudriere/jails/13amd64
data01/poudriere/jails/13i386       1.72G  0B     1.72G  /usr/local/poudriere/jails/13i386
data01/poudriere/ports              21.9G  0B     188K   /usr/local/poudriere/ports
data01/poudriere/ports/default      5.15G  0B     3.09G  /usr/local/poudriere/ports/default
data01/poudriere/ports/main         2.55G  0B     2.40G  /usr/local/poudriere/ports/main
data01/poudriere/ports/testing      14.2G  0B     3.06G  /usr/local/poudriere/ports/testing
data01/poudriere/test               478K   0B     205K   /usr/local/poudriere/test
[pkg01 dan ~] % zpool list                                                  13:36:01
NAME    SIZE   ALLOC  FREE  CKPOINT  EXPANDSZ  FRAG  CAP  DEDUP  HEALTH  ALTROOT
data01  5.81T  5.64T  180G  -        -         2%    96%  1.00x  ONLINE  -
[pkg01 dan ~] %
NOTE: If I had slowed down and looked, the AVAIL column right there is 0B for every dataset. Nothing. Nada.
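A 0B in the AVAIL column is easy to overlook in a long listing, so it may be worth letting a script look for it. The sketch below assumes input in the tab-separated form produced by zfs list -H -p -o name,avail (-H drops the header, -p prints exact byte counts); the function name zero_avail is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: read "name<TAB>avail" lines on stdin and print
# the name of every dataset whose available space is exactly 0 bytes.
zero_avail() {
    awk -F'\t' '$2 == 0 { print $1 }'
}
```

Something like zfs list -H -p -o name,avail -t filesystem | zero_avail would have listed every dataset shown above, which is a hard signal to dismiss.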
I also checked quota and reservation, but nothing.
[pkg01 dan ~] % zfs get -r -t filesystem quota,refquota,reservation,refreservation,usedbyrefreservation
NAME              PROPERTY              VALUE  SOURCE
data01            quota                 none   default
data01            refquota              none   default
data01            reservation           none   default
data01            refreservation        none   default
data01            usedbyrefreservation  0B     -
data01/poudriere  quota                 none   default
data01/poudriere  refquota              none   default
data01/poudriere  reservation           none   default
data01/poudriere  refreservation        none   default
...
What next?
Try the source
Let’s go to the source of the problem: the ports tree I am trying to work with.
[pkg01 dan ~] % cd /usr/local/poudriere/ports/default
[pkg01 dan /usr/local/poudriere/ports/default] % sudo touch testing
touch: testing: No space left on device
It’s not poudriere. It’s definitely the filesystem. It is not a space issue.
Let’s try this from the host, instead of the jail:
[r730-01 dvl ~] % cd /jails/pkg01/usr/local/etc/poudriere.d/ports/default
[r730-01 dvl /jails/pkg01/usr/local/etc/poudriere.d/ports/default] % sudo touch things
[r730-01 dvl /jails/pkg01/usr/local/etc/poudriere.d/ports/default] %
It’s a jail thing.
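NOTE: No, it’s not. That’s not the same place. Look at the directories: the test in the jail was under /usr/local/poudriere/ports/default, while the test on the host was under /jails/pkg01/usr/local/etc/poudriere.d/ports/default.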
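One way to avoid comparing two different places is to ask df which filesystem backs each path before trusting the result. This is a minimal sketch using df -P (POSIX output format); the function names fs_of and same_fs are invented for illustration, and both paths must be visible from wherever the script runs.

```shell
#!/bin/sh
# Hypothetical helpers for checking that two paths live on the same filesystem.

# fs_of PATH: print the filesystem (first column of df -P) that backs PATH.
fs_of() {
    df -P "$1" | awk 'NR == 2 { print $1 }'
}

# same_fs A B: succeed only if both paths resolve to the same, non-empty filesystem name.
same_fs() {
    a=$(fs_of "$1")
    b=$(fs_of "$2")
    [ -n "$a" ] && [ -n "$b" ] && [ "$a" = "$b" ]
}
```

Running fs_of on each of the two directories used in the touch tests, and comparing the names printed, would show whether the two tests were exercising the same dataset.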
Checking the jailed property:
[r730-01 dvl ~] % zfs get -t filesystem -r jailed data01/poudriere
NAME                                PROPERTY  VALUE  SOURCE
data01/poudriere                    jailed    on     local
data01/poudriere/ccache.13amd64     jailed    on     inherited from data01/poudriere
data01/poudriere/ccache.13i386      jailed    on     inherited from data01/poudriere
data01/poudriere/ccache.amd64       jailed    on     inherited from data01/poudriere
data01/poudriere/ccache.i386        jailed    on     inherited from data01/poudriere
data01/poudriere/data               jailed    on     inherited from data01/poudriere
data01/poudriere/data/cache         jailed    on     inherited from data01/poudriere
data01/poudriere/data/cronjob-logs  jailed    on     inherited from data01/poudriere
data01/poudriere/data/packages      jailed    on     inherited from data01/poudriere
data01/poudriere/distfiles          jailed    on     inherited from data01/poudriere
data01/poudriere/jails              jailed    on     inherited from data01/poudriere
data01/poudriere/jails/114R         jailed    on     inherited from data01/poudriere
data01/poudriere/jails/121amd64     jailed    on     inherited from data01/poudriere
data01/poudriere/jails/121i386      jailed    on     inherited from data01/poudriere
data01/poudriere/jails/131amd64     jailed    on     inherited from data01/poudriere
data01/poudriere/jails/131i386      jailed    on     inherited from data01/poudriere
data01/poudriere/jails/13amd64      jailed    on     inherited from data01/poudriere
data01/poudriere/jails/13i386       jailed    on     inherited from data01/poudriere
data01/poudriere/ports              jailed    on     inherited from data01/poudriere
data01/poudriere/ports/default      jailed    on     inherited from data01/poudriere
data01/poudriere/ports/main         jailed    on     inherited from data01/poudriere
data01/poudriere/ports/testing      jailed    on     inherited from data01/poudriere
data01/poudriere/test               jailed    on     inherited from data01/poudriere
[r730-01 dvl ~] %
Can the jail create a new filesystem?
[pkg01 dan ~] % sudo zfs create data01/poudriere/testing
cannot create 'data01/poudriere/testing': out of space
No, same symptoms.
The jail “owns” the filesystem
This is the jail configuration:
pkg01 {
  #
  # start of standard settings for each jail
  #
  exec.start = "/bin/sh /etc/rc";
  exec.stop = "/bin/sh /etc/rc.shutdown";
  exec.clean;
  exec.consolelog="/var/tmp/jail-console-$name.log";
  mount.devfs;
  path = /jails/$name;
  allow.raw_sockets;
  #securelevel = 2;
  exec.prestart = "logger trying to start jail $name...";
  exec.poststart = "logger jail $name has started";
  exec.prestop = "logger shutting down jail $name";
  exec.poststop = "logger jail $name has shut down";
  host.hostname = "$name.int.unixathome.org";
  persist;
  #
  # end of standard settings for each jail
  #
  allow.chflags;
  allow.mount.devfs;
  allow.mount.fdescfs;
  allow.mount.linprocfs;
  allow.mount.nullfs;
  allow.mount.procfs;
  allow.mount.tmpfs;
  allow.mount.zfs=true;
  allow.mount=true;
  allow.raw_sockets;
  allow.socket_af;
  children.max=200;
  enforce_statfs=1;
  exec.created+="zfs jail $name data01/poudriere";
  exec.created+="zfs set jailed=on data01/poudriere";
  exec.poststart += "jail -m allow.mount.linprocfs=1 name=$name";
  exec.poststop += "/usr/local/sbin/jib destroy $name";
  exec.prestart += "/usr/local/sbin/jib addm $name igb0";
  host.domainname=none;
  sysvmsg=new;
  sysvsem=new;
  sysvshm=new;
  vnet.interface = "e0b_$name";
  vnet;
}
So far, I’m stuck. I’ll come back to this later.
It took only a few minutes for someone else to spot the problem
Within minutes of posting, the AVAIL issue was pointed out. I was in denial. “But I could create a file from the host OS…” Looking back, that test was invalid. It wasn’t in the same filesystem.
It always helps to show someone else your work when you’re stuck. Someone else can easily spot what you’ve missed because you’ve got blinkers on and are too close to the work.