May 21, 2021
 

Today I was updating a FreeBSD server from 12.2 to 13.0 – I was using a new approach for my upgrades. This was my second host to upgrade like this. The first went smoothly. This one, not so much.

NOTE: the fix described below turned out to be insufficient because zroot/usr itself was also mounted:

[dan@slocum:~] $ zfs get canmount zroot/usr
NAME       PROPERTY  VALUE     SOURCE
zroot/usr  canmount  on        received

This system was manually, and poorly, converted to use boot environments (BE). /usr should look like this:

[dan@r720-01:~] $ zfs get canmount zroot/usr
NAME       PROPERTY  VALUE     SOURCE
zroot/usr  canmount  off       local

Comparing an upgraded host to another, I found that the same situation exists with zroot/var:

[dan@slocum:~] $ zfs get canmount zroot/var
NAME       PROPERTY  VALUE     SOURCE
zroot/var  canmount  on        received

That should be off, not on.
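
Checking one dataset at a time gets tedious. Here is a minimal sketch, assuming the datasets of interest are zroot/usr and zroot/var as above. The function name flag_mountable is my own invention, not part of ZFS. It reads the tab-separated output of zfs get -H -o name,value canmount and prints every dataset whose canmount is anything other than off.

```sh
#!/bin/sh
# flag_mountable: hypothetical helper, not part of ZFS.
# stdin: "name<TAB>value" lines, as produced by
#   zfs get -H -o name,value canmount zroot/usr zroot/var
# Prints each dataset whose canmount value is not "off".
flag_mountable() {
    awk -F '\t' '$2 != "off" { print $1 }'
}

# Example (run on the host, not executed here):
#   zfs get -H -o name,value canmount zroot/usr zroot/var | flag_mountable
```

With the values shown above for slocum, that pipeline would print both zroot/usr and zroot/var.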

Fixing the slocum host will require booting from a live thumb drive. That exercise is saved for another day.

The rest of this remains to demonstrate the concept.

The problem arises

[dan@slocum:~] $ sudo bectl create -r 13
[dan@slocum:~] $ bectl list
BE      Active Mountpoint Space Created
13      -      -          8K    2021-05-21 18:17
default NR     /          282M  2020-06-20 13:47
[dan@slocum:~] $ sudo bectl mount 13 /var/tmp/BE-13
Successfully mounted 13 at /var/tmp/BE-13
[dan@slocum:~] $ sudo chroot /var/tmp/BE-13
root@slocum:/ # bash
bash: Command not found.

Wait? What?

OK, what do we do here?

Let’s keep going:

root@slocum:/ # mount -t devfs devfs /dev
root@slocum:/ # rm -rf /var/db/freebsd-update
root@slocum:/ # mkdir /var/db/freebsd-update
root@slocum:/ # freebsd-update upgrade -r 13.0-RELEASE
Looking up update.FreeBSD.org mirrors... ld-elf.so.1: Shared object "libcrypto.so.7" not found, required by "host"
none found.
Fetching public key from update.FreeBSD.org... failed.
No mirrors remaining, giving up.
root@slocum:/ # exit

Why? What’s up?

[dan@slocum:~] $ zfs list -r zroot
NAME                        USED  AVAIL  REFER  MOUNTPOINT
zroot                      12.8G   193G    88K  /zroot
zroot/ROOT                  282M   193G    88K  none
zroot/ROOT/13               188K   193G   281M  /
zroot/ROOT/default          282M   193G   281M  /
zroot/mkjail                728M   193G    96K  /zroot/mkjail
zroot/mkjail/12.2-RELEASE   728M   193G   728M  /zroot/mkjail/12.2-RELEASE
zroot/tmp                  6.68M   193G   152K  /tmp
zroot/usr                  2.26G   193G  1.42G  /usr
zroot/usr/local             857M   193G   677M  /usr/local
zroot/var                  9.39G   193G  8.78G  /var
zroot/var/audit              88K   193G    88K  /var/audit
zroot/var/empty              88K   193G    88K  /var/empty
zroot/var/log               154M   193G  25.0M  /var/log
zroot/var/tmp              4.07M   193G   684K  /var/tmp
[dan@slocum:~] $ 

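Oh. There’s my /usr/local on a different file system: it lives in zroot/usr/local, outside zroot/ROOT, so it is not part of the boot environment and bash is missing from the chroot. Let’s correct that.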
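
To see at a glance which datasets will not follow a boot environment into the chroot, something like the following might help. It is only a sketch: outside_be is a hypothetical helper name, and it assumes the boot environments all live under one container dataset (zroot/ROOT here). It reads zfs list output and prints the mountable datasets that sit outside that container.

```sh
#!/bin/sh
# outside_be: hypothetical helper, not part of ZFS or bectl.
# stdin: "name<TAB>mountpoint<TAB>canmount" lines, e.g. from
#   zfs list -H -o name,mountpoint,canmount -r zroot
# $1: dataset that holds the boot environments, e.g. zroot/ROOT
# Prints "mountpoint<TAB>name" for every dataset outside $1 that
# mounts at a real path and has canmount=on.
outside_be() {
    awk -F '\t' -v be="$1" '
        # Skip the BE container itself and everything below it.
        $1 == be || index($1, be "/") == 1 { next }
        # Keep datasets that mount at an absolute path and are allowed to mount.
        $2 ~ /^\// && $3 == "on" { print $2 "\t" $1 }
    '
}

# Example (run on the host, not executed here):
#   zfs list -H -o name,mountpoint,canmount -r zroot | outside_be zroot/ROOT
```

Anything it lists under /usr or /usr/local is a candidate for the "Command not found" surprise above.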

The plan:

  1. stop all the daemons etc
  2. zfs rename zroot/usr/local zroot/usr/old_local
  3. cp -Rp /usr/old_local /usr/local

Stopping stuff

First, jails:

[dan@slocum:~] $ sudo service jail stop
Stopping jails: bacula bacula-sd-03 certs certs-rsync cliff2 dev-ingress01 dev-nginx01 devgit-ingress01 
devgit-nginx01 dns-hidden-master fileserver git jail-testing librenms mobile-nginx01 mydev mysql01 mx-ingress01 
nsnotify pg01 samdrucker serpico stage-nginx01 stagegit-ingress01 stagegit-nginx01 stage-ingress01 svn 
test-ingress01 test-nginx01 testgit-ingress01 testgit-nginx01 unifi01 webserver talos besser.

Let’s see what services are enabled:

[dan@slocum:~] $ service -e
/etc/rc.d/hostid
/etc/rc.d/zvol
/etc/rc.d/hostid_save
/etc/rc.d/zfsbe
/etc/rc.d/zfs
/etc/rc.d/cleanvar
/etc/rc.d/kldxref
/etc/rc.d/devmatch
/etc/rc.d/ip6addrctl
/etc/rc.d/netif
/etc/rc.d/devd
/etc/rc.d/pflog
/etc/rc.d/pf
/etc/rc.d/resolv
/etc/rc.d/newsyslog
/etc/rc.d/syslogd
/etc/rc.d/nfsclient
/usr/local/etc/rc.d/microcode_update
/usr/local/etc/rc.d/named
/etc/rc.d/savecore
/etc/rc.d/dmesg
/etc/rc.d/virecover
/usr/local/etc/rc.d/wireguard
/usr/local/etc/rc.d/snmpd
/etc/rc.d/motd
/etc/rc.d/ntpd
/etc/rc.d/rctl
/usr/local/etc/rc.d/isc-dhcpd
/usr/local/etc/rc.d/nut
/usr/local/etc/rc.d/nut_upsmon
/usr/local/etc/rc.d/postfix
/usr/local/etc/rc.d/smartd
/usr/local/etc/rc.d/nrpe3
/usr/local/etc/rc.d/bsdstats
/usr/local/etc/rc.d/bacula-fd
/etc/rc.d/sshd
/etc/rc.d/cron
/etc/rc.d/jail
/etc/rc.d/mixer
/etc/rc.d/gptboot
/etc/rc.d/bgfsck
[dan@slocum:~] $ 

Let’s stop all the things under /usr/local/etc/rc.d:

[dan@slocum:~] $ sudo service isc-dhcpd stop
Stopping dhcpd.
Waiting for PIDS: 79115.
[dan@slocum:~] $ sudo service nut stop
Stopping nut.
Waiting for PIDS: 2829.
Network UPS Tools - UPS driver controller 2.7.4
                                                                               
Broadcast Message from root@slocum.int.unixathome.org
        (no tty) at 18:44 UTC...                                               
                                                                               
Communications with UPS heartbeat lost                                         
                                                                               
Broadcast Message from root@slocum.int.unixathome.org                          
        (no tty) at 18:44 UTC...                                               
                                                                               
Communications with UPS heartbeat lost                                         
                                                                               
[dan@slocum:~] $ sudo service nut_upsmon stop
Network UPS Tools upsmon 2.7.4
[dan@slocum:~] $ sudo service postfix stop
postfix/postfix-script: stopping the Postfix mail system
[dan@slocum:~] $ sudo service smartd stop
Stopping smartd.
Waiting for PIDS: 2917.
[dan@slocum:~] $ sudo service nrpe3 stop
Stopping nrpe3.
Waiting for PIDS: 2922.
[dan@slocum:~] $ sudo service bsdstats stop
bsdstats not running?
[dan@slocum:~] $ sudo service bacula-fd stop
Stopping bacula_fd.
Waiting for PIDS: 4885.
[dan@slocum:~] $ 

A few more I see:

[dan@slocum:~] $ sudo service snmpd stop
Stopping snmpd.
Waiting for PIDS: 2746.
[dan@slocum:~] $ sudo service named stop
Stopping named.
Waiting for PIDS: 2653.
[dan@slocum:~] $ 

That’s everything I see running which isn’t needed. Definitely keep sshd going.
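
Stopping these one by one works, but the list can also be derived from service -e. The sketch below is an assumption-laden shortcut, not what I ran: local_services is a made-up helper that keeps only the scripts under /usr/local/etc/rc.d and prints their names in reverse of the listed order, on the theory that stopping should roughly mirror starting.

```sh
#!/bin/sh
# local_services: hypothetical helper, not part of FreeBSD.
# stdin: output of `service -e` (one rc script path per line).
# Prints the names of the scripts under /usr/local/etc/rc.d,
# last-listed first.
local_services() {
    awk -F '/' '
        index($0, "/usr/local/etc/rc.d/") == 1 { names[++n] = $NF }
        END { for (i = n; i >= 1; i--) print names[i] }
    '
}

# Example (not executed here; review the list before stopping anything):
#   service -e | local_services | while read -r name; do
#       service "$name" stop
#   done
```

Be careful with the loop: on this host it would also touch microcode_update and wireguard, which I left alone above. Stopping wireguard could cut off the very ssh session being used, so filter the list first.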

Become root

sudo lives in /usr/local, so it will go away when I rename that dataset. I must become root first.

[dan@slocum:~] $ su
Password:
# bash
[root@slocum:/usr/home/dan] # 

Renaming

I try, fail, then force.

[root@slocum:/usr/home/dan] # zfs rename zroot/usr/local zroot/usr/old_local
cannot unmount '/usr/local': Device busy

[root@slocum:/usr/home/dan] # cd 

[root@slocum:~] # zfs rename -f zroot/usr/local zroot/usr/old_local
Segmentation fault (core dumped)
[0] 0:bash*                     $                                                                                                            "slocum.int.unixathome" 18:51 21-May-21
$ 

Good thing I did that su. The crash took out both bash and tmux and left two .core files:

[dan@slocum:~] $ ls -lt
total 26501286
-rw-------   1 dan   dan      9949184 May 21 18:51 bash.core
-rw-------   1 dan   dan      9920512 May 21 18:51 tmux.core

Let’s start again, but no tmux this time.

What do we have now?

This is what we are copying to:

# cd /usr/local
# ls -lat
total 11
drwxr-xr-x  17 root  wheel  17 May 21 18:51 ..
drwxr-xr-x   2 root  wheel   2 Aug 25  2015 jails
drwxr-xr-x   5 root  wheel   5 Mar 11  2015 .
drwxr-xr-x   2 root  wheel   2 Mar 11  2015 pgsql
drwxr-xr-x   2 root  wheel   2 Mar 11  2015 poudriere

Those are all ZFS mountpoints.

What do we have to copy?

# cd /usr/old_local
# ls -l
total 313
drwxr-xr-x   3 root  wheel  607 May 17 22:00 bin
drwxr-xr-x  44 root  wheel  161 May 17 22:00 etc
drwxr-xr-x  89 root  wheel  237 May 17 22:00 include
drwxr-xr-x   2 root  wheel    3 Dec 12  2018 info
drwxr-xr-x   3 root  wheel    3 Apr 29  2014 jailaudit
drwxr-xr-x  32 root  wheel  902 May 17 22:00 lib
drwxr-xr-x   5 root  wheel    5 Dec 27  2016 libdata
drwxr-xr-x  10 root  wheel   20 May 17 22:00 libexec
drwxr-xr-x  40 root  wheel   42 May 15 04:19 man
-rw-r--r--   1 root  wheel  943 Mar 13  2015 my-new.cnf
-rw-r--r--   1 root  wheel  976 Feb 21  2017 my.cnf
drwxr-xr-x   2 root  wheel    4 Mar 26 11:14 openssl
drwxr-xr-x   3 root  wheel    3 Mar 14  2019 poudriere
drwxr-xr-x   2 root  wheel  103 May 21 18:15 sbin
drwxr-xr-x  70 root  wheel   70 Apr 19 15:41 share
drwxr-xr-x   2 root  wheel    2 Mar 13  2014 tests
drwxr-xr-x   4 root  wheel    4 Jun 17  2018 var
drwxr-xr-x   3 root  wheel    3 Aug 14  2019 www

I see some overlaps here: poudriere

What’s in there?

# ls poudriere/
jails
# ls -l poudriere/jails/
total 4
drwxr-xr-x  2 root  wheel  2 Jun 10  2014 92amd64
drwxr-xr-x  2 root  wheel  2 May 29  2014 92i386
drwxr-xr-x  2 root  wheel  2 Nov 21  2013 bacula164
drwxr-xr-x  2 root  wheel  2 Nov 21  2013 bacula165
drwxr-xr-x  2 root  wheel  2 Nov 21  2013 bacula167
drwxr-xr-x  2 root  wheel  2 Nov 21  2013 bacula168
drwxr-xr-x  2 root  wheel  2 Nov 21  2013 bacula169
drwxr-xr-x  2 root  wheel  2 Dec 24  2013 bsdcan
# 

All very old stuff, which was hidden by the mountpoint. And all directories:

# cd poudriere/jails/
# rmdir *
# cd ..
# rmdir jails
# 
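
Spotting an overlap like poudriere by eye is fine for a handful of entries. For a longer listing, a small helper could do the comparison. This is a sketch only; overlap is a hypothetical function name, and it compares just the top level of the two directories.

```sh
#!/bin/sh
# overlap: hypothetical helper.
# Prints each entry name that exists at the top level of both
# directory $1 and directory $2 (dot-entries included).
overlap() {
    for entry in "$1"/* "$1"/.[!.]*; do
        # An unmatched glob stays literal; skip it.
        [ -e "$entry" ] || [ -L "$entry" ] || continue
        name=${entry##*/}
        if [ -e "$2/$name" ] || [ -L "$2/$name" ]; then
            printf '%s\n' "$name"
        fi
    done
}

# Example (not executed here):
#   overlap /usr/old_local /usr/local
```

Going by the listings above, that example would report poudriere and nothing else.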

This is not a lot of data to copy:

# du -ch -d 1 .
1.5K	./www
331M	./lib
 40M	./sbin
7.9M	./etc
176M	./share
1.1M	./libdata
474K	./openssl
 20M	./libexec
 55M	./bin
 13K	./poudriere
2.5K	./var
 26M	./include
 43K	./jailaudit
 27M	./man
5.0K	./info
512B	./tests
684M	.
684M	total
# 

Copy everything over

root@slocum:/usr/old_local # cp -Rp * /usr/local/
root@slocum:/usr/old_local #  
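
One thing to keep in mind with cp -Rp *: the shell expands the *, and that glob skips top-level names starting with a dot. A quick sanity check after the copy might look like the sketch below. missing_in_copy is a hypothetical helper; it assumes no file names contain newlines, and it only checks that each path exists, not that the contents match.

```sh
#!/bin/sh
# missing_in_copy: hypothetical helper.
# $1: source directory, $2: destination directory.
# Prints every path under $1 that has no counterpart under $2.
# Checks existence only; it does not compare file contents.
missing_in_copy() {
    src=$1
    dst=$2
    ( cd "$src" && find . -print ) | while IFS= read -r path; do
        # Skip the source directory itself.
        [ "$path" = "." ] && continue
        if [ ! -e "$dst/$path" ] && [ ! -L "$dst/$path" ]; then
            printf '%s\n' "${path#./}"
        fi
    done
}

# Example (not executed here):
#   missing_in_copy /usr/old_local /usr/local
```

No output means every path in the source has a counterpart in the destination.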

After the copy

Trying bash again:

root@slocum:/usr/old_local # bash
ld-elf.so.1: Shared object "libdl.so.1" not found, required by "bash"

Looking on a good host, I see:

[dan@knew:~] $ ldd /usr/local/bin/bash
/usr/local/bin/bash:
	libncursesw.so.8 => /lib/libncursesw.so.8 (0x800369000)
	libintl.so.8 => /usr/local/lib/libintl.so.8 (0x8003cb000)
	libdl.so.1 => /usr/lib/libdl.so.1 (0x8003da000)
	libc.so.7 => /lib/libc.so.7 (0x8003de000)
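
When comparing against a good host like this, it can help to pull out just the libraries that fail to resolve. The sketch below assumes ldd marks an unresolved library with "=> not found", which is how I expect it to appear on FreeBSD; missing_libs is a hypothetical helper name.

```sh
#!/bin/sh
# missing_libs: hypothetical helper.
# stdin: output of `ldd <binary>`.
# Prints the name of each shared object reported as "not found".
# Assumption: unresolved entries look like "libfoo.so.1 => not found (0)".
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Example (not executed here):
#   ldd /usr/local/bin/bash | missing_libs
```

On the good host above it would print nothing, since every library resolves to a path.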

Let’s look in /usr/lib:

root@slocum:/usr/lib # ls -lt | head
total 5578
drwxr-xr-x  2 root  wheel      14 Aug 25  2015 private
lrwxr-xr-x  1 root  wheel      18 Aug 12  2015 libbsnmptools.so -> libbsnmptools.so.0
-r--r--r--  1 root  wheel   71376 Aug 12  2015 libbsnmptools.so.0
lrwxr-xr-x  1 root  wheel      13 Aug 12  2015 snmp_atm.so -> snmp_atm.so.6
-r--r--r--  1 root  wheel   29376 Aug 12  2015 snmp_atm.so.6
lrwxr-xr-x  1 root  wheel      16 Aug 12  2015 snmp_bridge.so -> snmp_bridge.so.6
-r--r--r--  1 root  wheel  137072 Aug 12  2015 snmp_bridge.so.6
lrwxr-xr-x  1 root  wheel      14 Aug 12  2015 snmp_hast.so -> snmp_hast.so.6
-r--r--r--  1 root  wheel  115064 Aug 12  2015 snmp_hast.so.6
root@slocum:/usr/lib # 

Everything from here is from 2015. That’s not right. Another case of a mountpoint being hidden?

Still in the chroot

I was still in the chroot. I should have been doing this on the host system.

I opened another ssh session. Everything looked fine there.

Fix up the old filesystem

Let’s not have this mounted automatically:

[dan@slocum:~] $ sudo zfs set canmount=off zroot/usr/old_local
[dan@slocum:~] $ zfs get mounted zroot/usr/old_local
NAME                 PROPERTY  VALUE    SOURCE
zroot/usr/old_local  mounted   no       -

Start fresh

[dan@slocum:~] $ sudo bectl umount 13
[dan@slocum:~] $ sudo bectl destroy 13
[dan@slocum:~] $ 

We now return you to our regularly scheduled upgrade.
