Today I was updating a FreeBSD server from 12.2 to 13.0 – I was using a new approach for my upgrades. This was my second host to upgrade like this. The first went smoothly. This one, not so much.
NOTE: this turned out to be insufficient because /usr was mounted:
[dan@slocum:~] $ zfs get canmount zroot/usr
NAME       PROPERTY  VALUE  SOURCE
zroot/usr  canmount  on     received
This system was manually converted, poorly, to use boot environments (BE). /usr should look like this:
[dan@r720-01:~] $ zfs get canmount zroot/usr
NAME       PROPERTY  VALUE  SOURCE
zroot/usr  canmount  off    local
Comparing an upgraded host to another, I found that the same situation exists with zroot/var:
[dan@slocum:~] $ zfs get canmount zroot/var
NAME       PROPERTY  VALUE  SOURCE
zroot/var  canmount  on     received
That should be off, not on.
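The same check can be scripted rather than eyeballed. This is a minimal sketch, not something from the original upgrade: the function name report_mountable is made up for illustration, and it assumes the tab-separated output of zfs get -H -o name,value.

```shell
#!/bin/sh
# Hypothetical helper: read "name<TAB>value" lines, as printed by
#   zfs get -H -o name,value canmount zroot/usr zroot/var
# and report every dataset whose canmount is anything other than "off".
report_mountable() {
    awk -F '\t' '$2 != "off" { print $1 " has canmount=" $2 }'
}
```

It would be used as: zfs get -H -o name,value canmount zroot/usr zroot/var | report_mountable. On slocum, given the values shown above, that should flag both datasets.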
Fixing the slocum host will require booting off a live thumbdrive. That exercise is saved for another day.
The rest of this remains to demonstrate the concept.
The problem arises
[dan@slocum:~] $ sudo bectl create -r 13
[dan@slocum:~] $ bectl list
BE      Active Mountpoint Space Created
13      -      -          8K    2021-05-21 18:17
default NR     /          282M  2020-06-20 13:47
[dan@slocum:~] $ sudo bectl mount 13 /var/tmp/BE-13
Successfully mounted 13 at /var/tmp/BE-13
[dan@slocum:~] $ sudo chroot /var/tmp/BE-13
root@slocum:/ # bash
bash: Command not found.
Wait? What?
OK, what do we do here?
Let’s keep going:
root@slocum:/ # mount -t devfs devfs /dev
root@slocum:/ # rm -rf /var/db/freebsd-update
root@slocum:/ # mkdir /var/db/freebsd-update
root@slocum:/ # freebsd-update upgrade -r 13.0-RELEASE
Looking up update.FreeBSD.org mirrors... ld-elf.so.1: Shared object "libcrypto.so.7" not found, required by "host"
none found.
Fetching public key from update.FreeBSD.org... failed.
No mirrors remaining, giving up.
root@slocum:/ # exit
Why? What’s up?
[dan@slocum:~] $ zfs list -r zroot
NAME                        USED  AVAIL  REFER  MOUNTPOINT
zroot                      12.8G   193G    88K  /zroot
zroot/ROOT                  282M   193G    88K  none
zroot/ROOT/13               188K   193G   281M  /
zroot/ROOT/default          282M   193G   281M  /
zroot/mkjail                728M   193G    96K  /zroot/mkjail
zroot/mkjail/12.2-RELEASE   728M   193G   728M  /zroot/mkjail/12.2-RELEASE
zroot/tmp                  6.68M   193G   152K  /tmp
zroot/usr                  2.26G   193G  1.42G  /usr
zroot/usr/local             857M   193G   677M  /usr/local
zroot/var                  9.39G   193G  8.78G  /var
zroot/var/audit              88K   193G    88K  /var/audit
zroot/var/empty              88K   193G    88K  /var/empty
zroot/var/log               154M   193G  25.0M  /var/log
zroot/var/tmp              4.07M   193G   684K  /var/tmp
[dan@slocum:~] $
Oh. There’s my /usr/local on a different file system. Let’s correct that.
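Spotting this in a long listing is easy to miss, so here is a rough sketch of a filter that does it. The function name datasets_at is hypothetical, the pool name zroot and the zroot/ROOT prefix are taken from the listing above, and the input is assumed to be the tab-separated output of zfs list -H -o name,mountpoint.

```shell
#!/bin/sh
# Hypothetical helper: read "name<TAB>mountpoint" lines, as printed by
#   zfs list -H -o name,mountpoint -r zroot
# and print datasets that are NOT part of a boot environment (not under
# zroot/ROOT) yet are mounted exactly at the path given as $1.
datasets_at() {
    awk -F '\t' -v path="$1" '
        $1 == "zroot/ROOT" || index($1, "zroot/ROOT/") == 1 { next }
        $2 == path { print $1 }
    '
}
```

Usage would be: zfs list -H -o name,mountpoint -r zroot | datasets_at /usr/local. Any dataset it prints holds files that a new boot environment will not contain.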
The plan:
- stop all the daemons etc
- zfs rename zroot/usr/local zroot/usr/old_local
- cp -Rp /usr/old_local /usr/local
Stopping stuff
First, jails:
[dan@slocum:~] $ sudo service jail stop
Stopping jails: bacula bacula-sd-03 certs certs-rsync cliff2 dev-ingress01 dev-nginx01 devgit-ingress01 devgit-nginx01 dns-hidden-master fileserver git jail-testing librenms mobile-nginx01 mydev mysql01 mx-ingress01 nsnotify pg01 samdrucker serpico stage-nginx01 stagegit-ingress01 stagegit-nginx01 stage-ingress01 svn test-ingress01 test-nginx01 testgit-ingress01 testgit-nginx01 unifi01 webserver talos besser.
Let’s see what services are enabled:
[dan@slocum:~] $ service -e
/etc/rc.d/hostid
/etc/rc.d/zvol
/etc/rc.d/hostid_save
/etc/rc.d/zfsbe
/etc/rc.d/zfs
/etc/rc.d/cleanvar
/etc/rc.d/kldxref
/etc/rc.d/devmatch
/etc/rc.d/ip6addrctl
/etc/rc.d/netif
/etc/rc.d/devd
/etc/rc.d/pflog
/etc/rc.d/pf
/etc/rc.d/resolv
/etc/rc.d/newsyslog
/etc/rc.d/syslogd
/etc/rc.d/nfsclient
/usr/local/etc/rc.d/microcode_update
/usr/local/etc/rc.d/named
/etc/rc.d/savecore
/etc/rc.d/dmesg
/etc/rc.d/virecover
/usr/local/etc/rc.d/wireguard
/usr/local/etc/rc.d/snmpd
/etc/rc.d/motd
/etc/rc.d/ntpd
/etc/rc.d/rctl
/usr/local/etc/rc.d/isc-dhcpd
/usr/local/etc/rc.d/nut
/usr/local/etc/rc.d/nut_upsmon
/usr/local/etc/rc.d/postfix
/usr/local/etc/rc.d/smartd
/usr/local/etc/rc.d/nrpe3
/usr/local/etc/rc.d/bsdstats
/usr/local/etc/rc.d/bacula-fd
/etc/rc.d/sshd
/etc/rc.d/cron
/etc/rc.d/jail
/etc/rc.d/mixer
/etc/rc.d/gptboot
/etc/rc.d/bgfsck
[dan@slocum:~] $
Let’s stop all the things under /usr/local/etc/rc.d:
[dan@slocum:~] $ sudo service isc-dhcpd stop
Stopping dhcpd.
Waiting for PIDS: 79115.
[dan@slocum:~] $ sudo service nut stop
Stopping nut.
Waiting for PIDS: 2829.
Network UPS Tools - UPS driver controller 2.7.4
Broadcast Message from root@slocum.int.unixathome.org "slocum.int.unixathome" 18:44 21-May-21
(no tty) at 18:44 UTC...
Communications with UPS heartbeat lost
Broadcast Message from root@slocum.int.unixathome.org
(no tty) at 18:44 UTC...
Communications with UPS heartbeat lost
[dan@slocum:~] $ sudo service nut_upsmon stop
Network UPS Tools upsmon 2.7.4
[dan@slocum:~] $ sudo service postfix stop
postfix/postfix-script: stopping the Postfix mail system
[dan@slocum:~] $ sudo service smartd stop
Stopping smartd.
Waiting for PIDS: 2917.
[dan@slocum:~] $ sudo service nrpe3 stop
Stopping nrpe3.
Waiting for PIDS: 2922.
[dan@slocum:~] $ sudo service bsdstats stop
bsdstats not running?
[dan@slocum:~] $ sudo service bacula-fd stop
Stopping bacula_fd.
Waiting for PIDS: 4885.
[dan@slocum:~] $
A few more I see:
[dan@slocum:~] $ sudo service snmpd stop
Stopping snmpd.
Waiting for PIDS: 2746.
[dan@slocum:~] $ sudo service named stop
Stopping named.
Waiting for PIDS: 2653.
[dan@slocum:~] $
That’s everything I see running which isn’t needed. Definitely keep sshd going.
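Stopping these one at a time made it easy to miss a couple. A loop over the service -e output could do the same job; the following is only a sketch, and the function name local_services is invented for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read the output of `service -e` (one rc script path
# per line) and print only the names of scripts under /usr/local/etc/rc.d.
# Scripts under /etc/rc.d, such as sshd, are dropped.
local_services() {
    sed -n 's|^/usr/local/etc/rc\.d/||p'
}
```

Review the list first with service -e | local_services, then feed it to a loop such as: service -e | local_services | while read -r name; do sudo service "$name" stop; done. On this host the list would also include microcode_update and wireguard, which were not stopped by hand above, so trim the list if anything in it is needed to stay connected.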
Become root
My sudo will go away when I do this, so I must become root first.
[dan@slocum:~] $ su
Password:
# bash
[root@slocum:/usr/home/dan] #
Renaming
I try, fail, then force.
[root@slocum:/usr/home/dan] # zfs rename zroot/usr/local zroot/usr/old_local
cannot unmount '/usr/local': Device busy
[root@slocum:/usr/home/dan] # cd
[root@slocum:~] # zfs rename -f zroot/usr/local zroot/usr/old_local
Segmentation fault (core dumped)
[0] 0:bash* $ "slocum.int.unixathome" 18:51 21-May-21 $
Good thing I did that su. That created two .core files:
[dan@slocum:~] $ ls -lt
total 26501286
-rw------- 1 dan dan 9949184 May 21 18:51 bash.core
-rw------- 1 dan dan 9920512 May 21 18:51 tmux.core
Let’s start again, but no tmux this time.
What do we have now?
This is what we are copying to:
# cd /usr/local
# ls -lat
total 11
drwxr-xr-x 17 root wheel 17 May 21 18:51 ..
drwxr-xr-x 2 root wheel 2 Aug 25 2015 jails
drwxr-xr-x 5 root wheel 5 Mar 11 2015 .
drwxr-xr-x 2 root wheel 2 Mar 11 2015 pgsql
drwxr-xr-x 2 root wheel 2 Mar 11 2015 poudriere
Those are all zfs mountpoints.
What do we have to copy?
# cd /usr/old_local
# ls -l
total 313
drwxr-xr-x 3 root wheel 607 May 17 22:00 bin
drwxr-xr-x 44 root wheel 161 May 17 22:00 etc
drwxr-xr-x 89 root wheel 237 May 17 22:00 include
drwxr-xr-x 2 root wheel 3 Dec 12 2018 info
drwxr-xr-x 3 root wheel 3 Apr 29 2014 jailaudit
drwxr-xr-x 32 root wheel 902 May 17 22:00 lib
drwxr-xr-x 5 root wheel 5 Dec 27 2016 libdata
drwxr-xr-x 10 root wheel 20 May 17 22:00 libexec
drwxr-xr-x 40 root wheel 42 May 15 04:19 man
-rw-r--r-- 1 root wheel 943 Mar 13 2015 my-new.cnf
-rw-r--r-- 1 root wheel 976 Feb 21 2017 my.cnf
drwxr-xr-x 2 root wheel 4 Mar 26 11:14 openssl
drwxr-xr-x 3 root wheel 3 Mar 14 2019 poudriere
drwxr-xr-x 2 root wheel 103 May 21 18:15 sbin
drwxr-xr-x 70 root wheel 70 Apr 19 15:41 share
drwxr-xr-x 2 root wheel 2 Mar 13 2014 tests
drwxr-xr-x 4 root wheel 4 Jun 17 2018 var
drwxr-xr-x 3 root wheel 3 Aug 14 2019 www
I see some overlaps here: poudriere
What’s in there?
# ls poudriere/
jails
# ls -l poudriere/jails/
total 4
drwxr-xr-x 2 root wheel 2 Jun 10 2014 92amd64
drwxr-xr-x 2 root wheel 2 May 29 2014 92i386
drwxr-xr-x 2 root wheel 2 Nov 21 2013 bacula164
drwxr-xr-x 2 root wheel 2 Nov 21 2013 bacula165
drwxr-xr-x 2 root wheel 2 Nov 21 2013 bacula167
drwxr-xr-x 2 root wheel 2 Nov 21 2013 bacula168
drwxr-xr-x 2 root wheel 2 Nov 21 2013 bacula169
drwxr-xr-x 2 root wheel 2 Dec 24 2013 bsdcan
#
All very old stuff, which was hidden by the mountpoint. And all directories:
# cd poudriere/jails/
# rmdir *
# cd ..
# rmdir jails
#
This is not a lot of data to copy:
# du -ch -d 1 .
1.5K ./www
331M ./lib
40M ./sbin
7.9M ./etc
176M ./share
1.1M ./libdata
474K ./openssl
20M ./libexec
55M ./bin
13K ./poudriere
2.5K ./var
26M ./include
43K ./jailaudit
27M ./man
5.0K ./info
512B ./tests
684M .
684M total
#
Copy everything over
root@slocum:/usr/old_local # cp -Rp * /usr/local/
root@slocum:/usr/old_local #
After the copy
After the copy:
root@slocum:/usr/old_local # bash
ld-elf.so.1: Shared object "libdl.so.1" not found, required by "bash"
Looking on a good host, I see:
[dan@knew:~] $ ldd /usr/local/bin/bash
/usr/local/bin/bash:
libncursesw.so.8 => /lib/libncursesw.so.8 (0x800369000)
libintl.so.8 => /usr/local/lib/libintl.so.8 (0x8003cb000)
libdl.so.1 => /usr/lib/libdl.so.1 (0x8003da000)
libc.so.7 => /lib/libc.so.7 (0x8003de000)
Let’s look in /usr/lib:
root@slocum:/usr/lib # ls -lt | head
total 5578
drwxr-xr-x 2 root wheel 14 Aug 25 2015 private
lrwxr-xr-x 1 root wheel 18 Aug 12 2015 libbsnmptools.so -> libbsnmptools.so.0
-r--r--r-- 1 root wheel 71376 Aug 12 2015 libbsnmptools.so.0
lrwxr-xr-x 1 root wheel 13 Aug 12 2015 snmp_atm.so -> snmp_atm.so.6
-r--r--r-- 1 root wheel 29376 Aug 12 2015 snmp_atm.so.6
lrwxr-xr-x 1 root wheel 16 Aug 12 2015 snmp_bridge.so -> snmp_bridge.so.6
-r--r--r-- 1 root wheel 137072 Aug 12 2015 snmp_bridge.so.6
lrwxr-xr-x 1 root wheel 14 Aug 12 2015 snmp_hast.so -> snmp_hast.so.6
-r--r--r-- 1 root wheel 115064 Aug 12 2015 snmp_hast.so.6
root@slocum:/usr/lib #
Everything from here is from 2015. That’s not right. Another case of a mountpoint being hidden?
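When a binary refuses to start like this, ldd can list every library it cannot resolve in one go. The filter below is a minimal sketch; the function name missing_libs is hypothetical, and it assumes ldd marks unresolved libraries with "=> not found".

```shell
#!/bin/sh
# Hypothetical helper: read ldd output on stdin and print the name of each
# shared library that ldd reports as "not found".
missing_libs() {
    awk '/=> not found/ { print $1 }'
}
```

For example: ldd /usr/local/bin/bash | missing_libs. An empty result means every library resolved in the environment where ldd was run, which is why it matters whether you are inside the chroot or on the host.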
Still in the chroot
I was still in the chroot. I should have been doing this on the host system. I opened another ssh session. Everything looked fine there.
Fix up the old filesystem
Let’s not have this mounted automatically:
[dan@slocum:~] $ sudo zfs set canmount=off zroot/usr/old_local
[dan@slocum:~] $ zfs get mounted zroot/usr/old_local
NAME                 PROPERTY  VALUE  SOURCE
zroot/usr/old_local  mounted   no     -
Start fresh
[dan@slocum:~] $ sudo bectl umount 13
[dan@slocum:~] $ sudo bectl destroy 13
[dan@slocum:~] $
We now return you to our regularly scheduled upgrade.