Modifying a ZFS root system to a beadm layout

Today I sat down to install some packages on a server I had just configured for Ansible and iocage. I failed. My poudriere host runs FreeBSD 9.3 and can build packages for older versions of FreeBSD, but not for newer ones. Solution: upgrade that host.

I will be using beadm so I can keep my existing FreeBSD 9.3 environment and create a new 10.1 environment on the same host. I will be able to switch between the two by rebooting. If things go poorly, and they do, I always have 9.3 to go back to.

FYI, you can read more about the host I’m upgrading here.

NOTE: I use the terms filesystem and dataset interchangeably to refer to the same thing: something created via zfs create.

The easy way

The easy way to use beadm is to first configure your system with a beadm-compatible layout. I have been setting up new hosts that way (my install script takes care of it), but my older hosts do not have that layout.

Sadly, the host on which I want to use beadm is one of those older hosts. It does boot from a ZFS root, but it is not laid out for boot environments. Fortunately, the solution is a matter of moving things around, not reinstalling.
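
For orientation, here is roughly where this article ends up (the names are the ones I use below; yours can differ):

system                   the zpool
system/bootenv           parent dataset for all boot environments
system/bootenv/default   the existing FreeBSD 9.3 environment (formerly system/rootfs)
system/data              datasets shared by every boot environment (home directories, ports, PostgreSQL, ...)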

First, what zpool?

Before you do anything, find out which zpool your system boots from. All operations in this article are performed on my zpool, which is named system.

$ zpool list
NAME     SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
system  16.2T  6.72T  9.53T    41%  1.00x  ONLINE  -

Yes, the zpool is called system. Some people call it zroot, some call it sys, others call it rpool.

First step, install beadm

This was easy:

[dan@slocum:~] $ sudo pkg install beadm
Updating local repository catalogue...
Fetching meta.txz: 100%    844 B   0.8kB/s    00:01    
Fetching packagesite.txz: 100%  161 KiB 165.4kB/s    00:01    
Processing entries: 100%
local repository update completed. 644 packages processed
The following 1 packages will be affected (of 0 checked):

New packages to be INSTALLED:
	beadm: 1.1_1

The process will require 27 KiB more space.
8 KiB to be downloaded.

Proceed with this action? [y/N]: y
Fetching beadm-1.1_1.txz: 100%    8 KiB   9.0kB/s    00:01    
Checking integrity... done (0 conflicting)
[1/1] Installing beadm-1.1_1...
[1/1] Extracting beadm-1.1_1: 100%
[dan@slocum:~] $ 

Having beadm installed isn’t enough. You need a suitable layout. What is a suitable layout? I’m not sure; I found nothing explicit on the subject. The FreeBSD installer, bsdinstall, does seem to cater for multiple boot environments, but my brief searches failed to find any practical examples. They all seemed to assume the proper layout already existed.

This is what I encountered:

# beadm list
ERROR: This system does not boot from ZFS pool

I asked for help

When I first got the above message, I started tweeting about it. I asked Savagedlight if she knew the commands and she came up with a list of commands, which I slightly altered for my environment.

What commands? I will show you soon.

Booting from a live CD

These are non-trivial changes. They need to be done while the system is not running. Often, this is done while booting the system from a live CD or USB drive.

Back in the old days, I used to boot from FreeSBIE on a CD. Now I use a USB thumb drive.

To accomplish this magic, I booted from mfsBSD. I downloaded the 10.1-RELEASE-amd64 image and wrote it to a USB thumb drive with this command:

# dd if=mfsbsd-10.1-RELEASE-amd64.img of=/dev/daX bs=64k

You will have to figure out what daX is on your particular computer. When I inserted the USB thumb drive, I saw this on the console (and in /var/log/messages):

Mar  8 22:52:42 knew kernel: daX at umass-sim0 bus 0 scbus8 target 0 lun 0
Mar  8 22:52:42 knew kernel: daX: < USB DISK 2.0 PMAP> Removable Direct Access SCSI-0 device 
Mar  8 22:52:42 knew kernel: daX: Serial Number 07971C00AFAE0212
Mar  8 22:52:42 knew kernel: daX: 40.000MB/s transfers
Mar  8 22:52:42 knew kernel: daX: 1911MB (3913728 512 byte sectors: 255H 63S/T 243C)
Mar  8 22:52:42 knew kernel: daX: quirks=0x3
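
If the kernel messages have already scrolled off the console, these standard FreeBSD commands will also show the device (the exact name will vary):

camcontrol devlist
geom disk list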

I booted the system from that thumb drive. From the console, I checked the IP address. Then I ssh’d in as root.

ssh root@10.3.0.68

WHAAAAT?

Yes. I ssh’d in as root. The password is mfsRoot.

This is a feature of mfsBSD.

Working over ssh was a lot easier than working on the console: I could use my laptop, copy/paste, and so on. Much nicer. Thank you, Martin Matuška.

Getting access to the zpool

After I ssh’d to the system, now booted off the live USB drive, I started issuing the commands from that list.

To understand why I’m issuing these particular commands, you should compare the before and after filesystem layouts.

In order to operate on the filesystems from your server, you will need to import the zpool. This is the command I used:

zpool import -f -o altroot=/mnt system

Some background:

  • -f forces the import. As far as the live environment is concerned, the zpool was last used by a different system (the host itself). Without -f, you’ll see this message: “cannot import ‘system’: pool may be in use from other system”. Forcing an import can be dangerous, but in this case you know the pool is not in use because you booted from the USB drive.
  • -o specifies the option altroot, which mounts the zpool under /mnt, so it does not conflict with any existing mount points.

If you do a zfs list, you should see everything from your zpool mounted under /mnt.
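
For example, something like this gives a quick overview (the column selection is just my preference):

zfs list -o name,mountpoint -r system | head -20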

Reconfiguring the filesystems

The first commands created two new filesystems:

zfs create -o mountpoint=none            system/bootenv
zfs create -o mountpoint=none            system/data

system/bootenv will become the parent dataset for the boot environments. The name is not critical; it can be called whatever you want. Every filesystem tied to a particular boot environment will live under it. This will soon become clear.

system/data will hold filesystems which are not boot environment specific, e.g. PostgreSQL or /usr/ports.

I then issued the rest of the commands:

# stuff for default
zfs rename system/rootfs                 system/bootenv/default
zfs rename system/tmp                    system/bootenv/default/tmp
zfs rename system/usr                    system/bootenv/default/usr
zfs rename system/usr/obj                system/bootenv/default/usr/obj
zfs rename system/usr/src                system/bootenv/default/usr/src
zfs rename system/var                    system/bootenv/default/var

The key rename for my situation was the first one: my existing system was booting from system/rootfs. If in doubt, this command shows you the answer:

# zpool get bootfs
NAME    PROPERTY  VALUE          SOURCE
system  bootfs    system/rootfs  local

The above confirms that the first rename is the correct one.

NOTE: I could have used any name; it did not have to be default. I could have called it FreeBSD93, for example.

Some of these renames turned out to be unnecessary. I cannot recall the specifics now, but if memory serves, I did not have to rename system/usr/obj or system/usr/src because child datasets are renamed along with their parent; the rename of system/usr took care of them.
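
If you want to convince yourself of this, list the renamed system/usr recursively; the children have already moved with it:

zfs list -H -r -o name system/bootenv/default/usr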

NOTE: I am moving /var under the boot environment. This may not suit your needs. Some people prefer to have the logs persist regardless of which boot environment is running; one way to do that is sketched below.
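
A sketch of that alternative, keeping the log dataset under system/data instead (after the rename of system/var above, the log dataset is already named system/bootenv/default/var/log; adjust names to taste):

zfs rename system/bootenv/default/var/log   system/data/log
zfs set mountpoint=/var/log                 system/data/log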

Then I did a bunch more:

zfs rename system/usr/home               system/data/homes
zfs rename system/usr/local/pgsql        system/data/pgsql
zfs rename system/usr/ports              system/data/ports
zfs rename system/usr/ports/distfiles    system/data/ports/distfiles

zfs rename system/var/audit              system/bootenv/default/var/audit
zfs rename system/var/empty              system/bootenv/default/var/empty
zfs rename system/var/log                system/bootenv/default/var/log
zfs rename system/var/tmp                system/bootenv/default/var/tmp

The final set of commands changes the way things boot:

zfs inherit -r mountpoint                 system/bootenv
zfs set mountpoint=/                      system/bootenv

This list of changes is incomplete; I know I did more shuffling than is shown here. If you compare the before and after layouts, you will notice the differences.

zfs inherit -r mountpoint                 system/data
zfs set mountpoint=/usr/home              system/data/homes
zfs set mountpoint=/usr/ports             system/data/ports
zfs set mountpoint=/usr/local/pgsql       system/data/pgsql

Examine the output of zfs list and make adjustments as required. If I detailed what I did here, I think it would confuse you more than it would help you.
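
One way to review the result, with the columns I find useful for this (adjust to taste):

zfs list -o name,mountpoint,canmount,mounted -r system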

cache

NOTE: I’m told this is no longer required on FreeBSD (at least for FreeBSD 11.x), but I’ve not confirmed that. Rumor has it, if you decided not to do this, the worst case is “boot from live environment, import the pool, create zpool.cache, export the pool”.

The cachefile property of a zpool “controls the location of where the pool configuration is cached. Discovering all pools on system startup requires a cached copy of the configuration data that is stored on the root file system”. See zpool(8) for more information on cachefile.
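
To see what your pool currently has set:

zpool get cachefile system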

This is the magic formula I got from Allan Jude to help set my cachefile correctly:

  1. zfs umount -a
  2. zpool import -R /mnt system
  3. chroot /mnt
  4. mount -t devfs devfs /dev
  5. zpool set cachefile=/boot/zfs/zpool.cache system
  6. exit

I will explain those steps:

  1. unmount the zpool’s filesystems, because we are about to mount them differently
  2. import the zpool under /mnt (the -R option also temporarily sets the cachefile to none)
  3. chroot into the mount directory
  4. mount devfs inside the chroot
  5. write the current pool configuration to /boot/zfs/zpool.cache, which inside the chroot is the cache file the pool itself will read at boot
  6. exit the chroot; profit

This step is required because I have completely changed the layout of the filesystems.

Before you reboot!

By default, the dataset selected for booting is identified by the pool’s bootfs property. So before you boot, make sure you do something like this:

zpool set bootfs=system/bootenv/default system

In my case:

  • system/bootenv/default – as mentioned above, this is the dataset the system was booting from before; it was named system/rootfs then. Now that it has been renamed, the bootfs property must be adjusted to match.
  • system – this is the name of my zpool.
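
A quick sanity check, not part of the original list of commands, is to read the property back; it should now report system/bootenv/default:

zpool get bootfs system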

Things to check before rebooting

Look in /boot/loader.conf for references to vfs.root.mountfrom because that will override the bootfs value specified in the previous step.

Look in /etc/fstab for references to the old dataset names. In my case, I found:

system/rootfs        /    zfs  rw,noatime 0 0

If I had looked here, it would have saved me nearly two hours of trying to figure out why the system was still booting from system/rootfs, and failing, instead of from system/bootenv/default.
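
From the live environment, with the pool still imported under /mnt, a pair of greps would have caught both problems (the second pattern assumes your pool is named system, like mine):

grep vfs.root.mountfrom /mnt/boot/loader.conf
grep system/ /mnt/etc/fstab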

Post reboot problems

After the reboot, there were a few problems to fix.

Changes to jails

On this server I originally had my jails installed at /usr/local/jails. After these changes, I moved them back to the standard /usr/jails location.

To do this, I had to alter files in:

  1. /etc/fstab.* – the paths in these files needed modification
  2. /usr/local/etc/ezjail – both the jail_JAILNAME_rootdir and jail_JAILNAME_parentzfs parameters required updates
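
For illustration only, this is the kind of line I mean, for a hypothetical jail named bacula; your paths and dataset names will differ:

# /usr/local/etc/ezjail/bacula (sketch)
export jail_bacula_rootdir="/usr/jails/bacula"
export jail_bacula_parentzfs="system/data/jails"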

Changes to zfs filesystem properties

After the changes, I started getting this error in a jail:

[dan@bacula:~] $ sudo ls
sudo: effective uid is not 0, is /usr/local/bin/sudo on a file system with the 'nosuid' option set or an NFS file system without root privileges?
[dan@bacula:~] $ 

This stumped me. None of the references I found mentioned anything relevant to this situation, but they did give me clues.

Eventually I found this zfs property:

[dan@slocum:~] $ zfs get all system/data/jails/bacula | grep -i id
system/data/jails/bacula  setuid                off                    inherited from system
system/data/jails/bacula  snapdir               hidden                 default

See there? setuid is off.

Let’s look at another filesystem, this one part of the host itself:

[dan@slocum:~] $ zfs get all system/bootenv/default/usr/local | grep -i id
system/bootenv/default/usr/local  setuid                on                     local
system/bootenv/default/usr/local  snapdir               hidden                 default

There, it’s on.

setuid used to be on for everything that now sits under system/data, so here I turn it back on:

# zfs set setuid=on system/data
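
Because setuid is inherited, setting it on system/data flows down to the jail datasets (unless one of them sets it locally). Something like this confirms the change:

zfs get -r -t filesystem -o name,value,source setuid system/data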

By this time, it was 11:30 pm and I went to sleep.

postfix/postdrop permission errors

The next morning I noticed these errors:

postfix/postdrop[4739]: warning: mail_queue_enter: create file maildrop/94555.4739: Permission denied

Everything I found referred to the postfix set-permissions command. That didn’t help; the permissions looked exactly like those on a server which did not have the problem.

Then, I guessed: restart the jail. Problem solved.

Conclusion: the setuid change mentioned in the previous section did not take effect inside the jail until the jail was restarted.
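
With ezjail, which this host uses, restarting the jail is a single command (bacula being the jail from the error messages above):

ezjail-admin restart bacula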

poudriere

I didn’t notice this one for a while. Then I saw extra filesystems:

$ zfs list | grep poudriere
system/data/poudriere                       31.1G  6.18T   336K  /usr/local/poudriere
system/data/poudriere/data                  15.5G  6.18T  2.28G  /usr/local/poudriere/data
system/data/poudriere/data/cache             831M  6.18T   585M  /usr/local/poudriere/data/cache
system/data/poudriere/data/packages         10.1G  6.18T  4.38G  /usr/local/poudriere/data/packages
system/data/poudriere/jails                 7.00G  6.18T   448K  /usr/local/poudriere/jails
system/data/poudriere/jails/92amd64         1.64G  6.18T  1.64G  /usr/local/poudriere/jails/92amd64
system/data/poudriere/jails/92i386          1.49G  6.18T  1.49G  /usr/local/poudriere/jails/92i386
system/data/poudriere/jails/93amd64         1.17G  6.18T  1.13G  /usr/local/poudriere/jails/93amd64
system/data/poudriere/jails/93i386          1.11G  6.18T  1.08G  /usr/local/poudriere/jails/93i386
system/data/poudriere/ports                 8.66G  6.18T   320K  /usr/local/poudriere/ports
system/data/poudriere/ports/default         5.80G  6.18T  1.93G  /usr/local/poudriere/ports/default
system/data/poudriere/ports/testing         2.87G  6.18T  1.89G  /usr/local/poudriere/ports/testing
system/poudriere                            1.57G  6.18T   288K  /poudriere
system/poudriere/data                       1.57G  6.18T   400K  /usr/local/poudriere/data
system/poudriere/data/.m                     591K  6.18T   591K  /usr/local/poudriere/data/.m
system/poudriere/data/cache                 99.3M  6.18T  99.3M  /usr/local/poudriere/data/cache
system/poudriere/data/cronjob-logs           456K  6.18T   456K  /usr/local/poudriere/data/cronjob-logs
system/poudriere/data/logs                  88.3M  6.18T  88.3M  /usr/local/poudriere/data/logs
system/poudriere/data/packages              1.38G  6.18T  1.38G  /usr/local/poudriere/data/packages
system/poudriere/data/wrkdirs                288K  6.18T   288K  /usr/local/poudriere/data/wrkdirs

See those at the end? See how much smaller they are than the similarly named datasets at the top of the list?

Poudriere automagically created those datasets during its nightly runs from my crontab, because its configuration still pointed at the old dataset location.

You’ll also see that some of them are mounted at the same location as the correctly named datasets.

Here is the change I made to use the new layout:

$ diff /usr/local/etc/poudriere.conf~  /usr/local/etc/poudriere.conf
19c19
< # ZROOTFS=/poudriere
---
> ZROOTFS=/data/poudriere

Then, just in case I needed them later, I renamed the system/poudriere* filesystems to system/zzz-DELETE-ME.poudriere* and set mountpoint=legacy on them all, as sketched below.
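
A sketch of that cleanup, using the dataset names above (the xargs loop is just one way to reach every child dataset):

zfs rename system/poudriere system/zzz-DELETE-ME.poudriere
zfs list -H -r -o name system/zzz-DELETE-ME.poudriere | xargs -n 1 zfs set mountpoint=legacy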

Next step

The next step: upgrade this system to FreeBSD 10.1. I’m really looking forward to it.

Thanks to Marie Helene for helping with this filesystem modification so I can use beadm.
