Yesterday I scavenged parts from some servers I'm about to dispose of. I managed to put together 4 x 1TB storage devices: 2 x NVMe sticks and 2 x SSDs. I also pulled a riser card from an R730 and relocated it to another host.
The NVMe sticks are mounted on these PCIe cards, which I do not regret buying. They came with both full-height and low-profile brackets. More to the point: I saved those brackets, and I need them to fit the cards into the riser card mentioned above.
I first installed those devices into r730-03, and then thought better of it and moved them to r730-01.
Along the way, I also moved a fiber card into the same server.
Partitions
Since starting with FreeBSD, I've always used partitioned devices with ZFS; that is the standard practice on FreeBSD. However, as someone recently asked in a PR: why? It is what we know, and I suspect the reasons are mostly lore. FreeBSD recommends partitioned devices, while OpenZFS recommends whole devices. I do wonder whether this is GEOM-related.
Still, I'm using partitions, for the reason mentioned in the first link in the previous paragraph: devices of the same nominal capacity can vary slightly in actual size from one model to another, and a fixed-size partition avoids surprises when a replacement drive turns out to be a few sectors smaller.
Case in point, the devices are:
nvd0: <Samsung SSD 980 PRO with Heatsink 1TB> NVMe namespace
nvd0: 953869MB (1953525168 512 byte sectors)
nvd1: <Samsung SSD 980 PRO with Heatsink 1TB> NVMe namespace
nvd1: 953869MB (1953525168 512 byte sectors)
da12 at mrsas0 bus 1 scbus1 target 12 lun 0
da12: <ATA Samsung SSD 860 2B6Q> Fixed Direct Access SPC-4 SCSI device
da12: Serial Number [redacted]
da12: 150.000MB/s transfers
da12: 953869MB (1953525168 512 byte sectors)
da13 at mrsas0 bus 1 scbus1 target 13 lun 0
da13: <ATA Samsung SSD 860 2B6Q> Fixed Direct Access SPC-4 SCSI device
da13: Serial Number [redacted]
da13: 150.000MB/s transfers
da13: 953869MB (1953525168 512 byte sectors)
All four devices report the same size: 1953525168 sectors, or 953869MB, which gpart shows as 932G.
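If you want to confirm the raw sizes directly before pairing devices up, diskinfo(8) prints the sector size, media size in bytes, and sector count for each device. A minimal check, using the device names from this host:

# Confirm the raw device sizes match before mirroring.
# diskinfo prints sector size, media size in bytes, and media size in sectors.
% diskinfo /dev/nvd0 /dev/nvd1 /dev/da12 /dev/da13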
These devices were previously used in zpools in other hosts.
Label clearing
Clearing the labels left behind by the old zpools avoids confusion later, so let's do that now.
I had already done this for two of the devices, but I'm now going to run it against all four.
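If you're not sure whether a partition still carries old labels, zdb -l dumps whatever ZFS labels it finds on a device. A quick check before clearing, using the same partition names (a sketch only; I did not capture this output):

# Inspect leftover ZFS labels before clearing them.
# zdb -l prints the pool name and GUIDs from each label it finds,
# or reports a failure to unpack the label when none is present.
% zdb -l /dev/nvd0p1
% zdb -l /dev/da12p1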
[17:30 r730-01 dvl ~] % gpart show nvd0 nvd1 da12 da13
=>        40  1953525088  nvd0  GPT  (932G)
          40  1953484800     1  freebsd-zfs  (931G)
  1953484840       40288        - free -  (20M)

=>        40  1953525088  nvd1  GPT  (932G)
          40  1953484800     1  freebsd-zfs  (931G)
  1953484840       40288        - free -  (20M)

=>        40  1953525088  da12  GPT  (932G)
          40  1953525088     1  freebsd-zfs  (932G)

=>        40  1953525088  da13  GPT  (932G)
          40  1953525088     1  freebsd-zfs  (932G)

[17:29 r730-01 dvl ~] % sudo zpool labelclear /dev/nvd0p1
use '-f' to override the following error:
/dev/nvd0p1 is a member of potentially active pool "nvd"
[17:29 r730-01 dvl ~] % sudo zpool labelclear -f /dev/nvd0p1
[17:29 r730-01 dvl ~] % sudo zpool labelclear /dev/nvd1p1
use '-f' to override the following error:
/dev/nvd1p1 is a member of potentially active pool "nvd"
[17:29 r730-01 dvl ~] % sudo zpool labelclear -f /dev/nvd1p1
[17:30 r730-01 dvl ~] % sudo zpool labelclear /dev/da12p1
failed to clear label for /dev/da12p1
[17:30 r730-01 dvl ~] % sudo zpool labelclear /dev/da13p1
failed to clear label for /dev/da13p1
In the above, you can see that the partitions on da12 and da13 span the whole disk. No space has been left at the end, which goes against my own recommendation.
Let’s fix that.
[17:30 r730-01 dvl ~] % sudo gpart delete -i 1 da12
da12p1 deleted
[17:33 r730-01 dvl ~] % sudo gpart delete -i 1 da13
da13p1 deleted
[17:33 r730-01 dvl ~] % sudo gpart add -t freebsd-zfs -s 1953484800 da12
da12p1 added
[17:33 r730-01 dvl ~] % sudo gpart add -t freebsd-zfs -s 1953484800 da13
da13p1 added
[17:33 r730-01 dvl ~] % gpart show nvd0 nvd1 da12 da13
=>        40  1953525088  nvd0  GPT  (932G)
          40  1953484800     1  freebsd-zfs  (931G)
  1953484840       40288        - free -  (20M)

=>        40  1953525088  nvd1  GPT  (932G)
          40  1953484800     1  freebsd-zfs  (931G)
  1953484840       40288        - free -  (20M)

=>        40  1953525088  da12  GPT  (932G)
          40  1953484800     1  freebsd-zfs  (931G)
  1953484840       40288        - free -  (20M)

=>        40  1953525088  da13  GPT  (932G)
          40  1953484800     1  freebsd-zfs  (931G)
  1953484840       40288        - free -  (20M)

[17:33 r730-01 dvl ~] %
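As a sanity check, here is where that 20M of free space at the end comes from, worked out with bc(1):

# 1953525088 usable sectors per disk, minus 1953484800 sectors in the
# partition, leaves 40288 sectors of 512 bytes each at the end.
% echo "(1953525088 - 1953484800) * 512 / 1024 / 1024" | bc
19

That is just under 20 MiB of slack, which gpart rounds to the 20M shown above.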
Good. All four devices now have identical partition layouts, and we are ready to go.
Creating the zpool
I plan to mirror each NVMe device with an SSD, rather than pairing the two NVMe devices together and the two SSDs together.
Why? Diversity. The theory is that two identical devices are slightly more likely to suffer the same failure close in time to one another. I'd like to see statistics on this approach.
Here is what I’m starting with:
[17:33 r730-01 dvl ~] % zpool list
NAME     SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
data01  5.81T  3.81T  2.00T        -         -     2%    65%  1.00x  ONLINE  -
data02  1.73T  1.06T   692G        -         -    40%    61%  1.00x  ONLINE  -
data03  7.25T  1.62T  5.63T        -         -    25%    22%  1.00x  ONLINE  -
zroot    424G  34.8G   389G        -         -     9%     8%  1.00x  ONLINE  -
Here is the creation:
[17:37 r730-01 dvl ~] % sudo zpool create data04 mirror /dev/nvd0p1 /dev/da12p1 mirror /dev/nvd1p1 /dev/da13p1
That says:
- create a zpool named data04
- that zpool consists of two mirrors
- each mirror consists of:
  - an NVMe device
  - an SSD device
This arrangement is sometimes referred to as a stripe over mirrors.
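Capacity-wise, each mirror contributes the size of one of its members, and the two mirrors are striped together, so the pool should offer roughly twice 931G. A quick check with bc(1):

# Two mirrors of roughly 931 GiB each, striped together: expected size in TiB.
% echo "scale=2; 2 * 931 / 1024" | bc
1.81

That matches the 1.81T reported by zpool list below.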
This is what I have now:
[17:37 r730-01 dvl ~] % zpool list
NAME     SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
data01  5.81T  3.81T  2.00T        -         -     2%    65%  1.00x  ONLINE  -
data02  1.73T  1.06T   692G        -         -    40%    61%  1.00x  ONLINE  -
data03  7.25T  1.62T  5.63T        -         -    25%    22%  1.00x  ONLINE  -
data04  1.81T   396K  1.81T        -         -     0%     0%  1.00x  ONLINE  -
zroot    424G  34.8G   389G        -         -     9%     8%  1.00x  ONLINE  -
[17:37 r730-01 dvl ~] % zpool status data04
  pool: data04
 state: ONLINE
config:

	NAME        STATE     READ WRITE CKSUM
	data04      ONLINE       0     0     0
	  mirror-0  ONLINE       0     0     0
	    nvd0p1  ONLINE       0     0     0
	    da12p1  ONLINE       0     0     0
	  mirror-1  ONLINE       0     0     0
	    nvd1p1  ONLINE       0     0     0
	    da13p1  ONLINE       0     0     0

errors: No known data errors
[17:37 r730-01 dvl ~] %
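One side effect of mixing an NVMe stick with a SATA SSD in each mirror: writes only complete once both sides have the data, so they run at the pace of the slower SSD, while reads can be served from either device. If you want to watch that in practice, zpool iostat -v breaks the numbers down per device. A minimal invocation (the trailing 5 is a sampling interval in seconds):

# Watch per-device activity on the new pool, refreshing every 5 seconds.
% zpool iostat -v data04 5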
Labels
Sometimes folks add labels to the partitions, so I'm going to do that now. The label can also be set when the partition is created (gpart add -l), but I forgot to do that.
[17:46 r730-01 dvl ~] % sudo gpart modify -i 1 -l foo da12
da12p1 modified
[17:47 r730-01 dvl ~] % sudo gpart modify -i 1 -l bar da13
da13p1 modified
You can see those labels with this command:
[17:48 r730-01 dvl ~] % gpart show -l nvd0 nvd1 da12 da13
=>        40  1953525088  nvd0  GPT  (932G)
          40  1953484800     1  [redacted]  (931G)
  1953484840       40288        - free -  (20M)

=>        40  1953525088  nvd1  GPT  (932G)
          40  1953484800     1  [redacted]  (931G)
  1953484840       40288        - free -  (20M)

=>        40  1953525088  da12  GPT  (932G)
          40  1953484800     1  foo  (931G)
  1953484840       40288        - free -  (20M)

=>        40  1953525088  da13  GPT  (932G)
          40  1953484800     1  bar  (931G)
  1953484840       40288        - free -  (20M)
Those devices can also be seen by label under /dev/gpt, and I could have used those labels when creating the zpool (e.g. /dev/gpt/foo instead of /dev/da12p1).
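For comparison, a creation command using GPT labels would look something like this. This is only a sketch: foo and bar are the throwaway labels set above, while gpt/nvme0 and gpt/nvme1 are hypothetical labels the NVMe partitions would have needed (they were not labelled here).

# Sketch only: the same layout as data04, referencing GPT labels.
# gpt/nvme0 and gpt/nvme1 are hypothetical; only foo and bar exist on this host.
% sudo zpool create data04 mirror gpt/nvme0 gpt/foo mirror gpt/nvme1 gpt/bar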
Next: start using it. That will be another day.