In this post, I am working with FreeBSD 10.2 on this server.
Over the past few days, a 3-year-old drive has been giving errors. The error count has held steady, but I have a spare drive on hand, so I decided to replace it. I have verified that the failing drive is out of warranty.
Rather than pull the failing drive and replace it, I opted to add the new drive alongside it and let ZFS resilver first. This is the safest option because the resilver runs while full redundancy is still in place.
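If you want to keep an eye on a suspect drive yourself, the raw SMART counters are a reasonable place to look. A quick sketch, assuming smartmontools is installed and ada1 is the suspect drive:

$ sudo smartctl -A /dev/ada1 | egrep 'Reallocated|Pending|Uncorrectable|CRC'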
TIP: Note the serial number of the new drive before you add it to the server.
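If the drive is already attached to a machine, you can read the serial number without pulling it back out; the device name here is just an example:

$ sudo smartctl -i /dev/ada0 | grep 'Serial Number'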
Before you proceed: if the new drive has previously been used in a zpool, I suggest trying my preparation steps first. They may help you avoid the gmirror and ZFS label issues I run into below. See the end of this post for details.
Identifying the drives
After a reboot, the new drive is ada0 and the dying drive is ada1. I confirmed this via smartctl. I also knew the serial number (Z2T4KGYASTZ6) of the drive I had just inserted. Yes, that drive is also out of warranty, but relatively unused.
NOTE: I thought I knew the serial number of that drive, but I was wrong. See the smartctl output below; the serial number mentioned above is incorrect.
This is the new drive, and by new, I mean it has about 1 year of power-on time. I have no idea what this drive was used for in the past, but I am confident I am the original owner.
[dan@knew:~] $ sudo smartctl -a /dev/ada0
smartctl 6.4 2015-06-04 r4109 [FreeBSD 10.2-RELEASE-p9 amd64] (local build)
Copyright (C) 2002-15, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Toshiba 3.5" DT01ACA... Desktop HDD
Device Model:     TOSHIBA DT01ACA300
Serial Number:    Z2T3BAXAS
LU WWN Device Id: 5 000039 ff4c187ba
Firmware Version: MX6OABB0
User Capacity:    3,000,592,982,016 bytes [3.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Fri Feb 19 15:51:27 2016 UTC
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x84) Offline data collection activity
                                        was suspended by an interrupting command from host.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (24373) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 407) minutes.
SCT capabilities:              (0x003d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   016    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0005   139   139   054    Pre-fail  Offline      -       71
  3 Spin_Up_Time            0x0007   139   139   024    Pre-fail  Always       -       420 (Average 405)
  4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       38
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0005   124   124   020    Pre-fail  Offline      -       33
  9 Power_On_Hours          0x0012   099   099   000    Old_age   Always       -       9642
 10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       38
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       59
193 Load_Cycle_Count        0x0012   100   100   000    Old_age   Always       -       59
194 Temperature_Celsius     0x0002   181   181   000    Old_age   Always       -       33 (Min/Max 15/44)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

[dan@knew:~] $
The gmirror issue
The first issue: I did not wipe this drive before installing it, so gmirror tried to start up and use it, and failed.
$ gmirror list
Geom name: swap
State: DEGRADED
Components: 3
Balance: load
Slice: 4096
Flags: NONE
GenID: 0
SyncID: 1
ID: 2857317765
Providers:
1. Name: mirror/swap
   Mediasize: 2147483136 (2.0G)
   Sectorsize: 512
   Stripesize: 4096
   Stripeoffset: 0
   Mode: r1w1e1
Consumers:
1. Name: ada0p2
   Mediasize: 2147483648 (2.0G)
   Sectorsize: 512
   Stripesize: 4096
   Stripeoffset: 0
   Mode: r1w1e1
   State: ACTIVE
   Priority: 1
   Flags: NONE
   GenID: 0
   SyncID: 1
   ID: 2717712298
So I stopped and destroyed that gmirror:
$ sudo gmirror stop -f swap
$ sudo gmirror destroy swap
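If stale gmirror metadata were to keep coming back after a reboot, it can also be cleared directly from the provider. A sketch, assuming ada0 is the drive carrying the leftover metadata:

$ sudo gmirror clear ada0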
Destroy and rebuild
Here is how I rebuilt the disk's partitioning to match the others.
This first command is destructive. Issue it on the correct drive.
$ sudo gpart destroy -F ada0
ada0 destroyed
$ sudo gpart create -s gpt ada0
ada0 created
$ sudo gpart add -b 34 -s 94 -t freebsd-boot -l disk_Z2T4KGYASTZ6 ada0
ada0p1 added
$ sudo gpart add -s 8g -t freebsd-swap -l swap6a ada0
ada0p2 added
$ sudo gpart add -t freebsd-zfs -s 2784G -l disk_Z2T4KGYASTZ6data ada0
ada0p3 added
The labels (i.e. the -l parameter) I used are derived from:
- disk_Z2T4KGYASTZ6 – the disk serial number
- swap6a – I know this new drive will replace gpt/disk6. I learned that via the output of glabel status
- disk_Z2T4KGYASTZ6data – again, the disk serial number, with data appended to make it unique.
Labels are completely arbitrary and you can use whatever label you prefer.
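To confirm the labels took, gpart can display them in place of the partition types, and glabel lists them too:

$ gpart show -l ada0
$ glabel status | grep ada0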
Those handy commands, all in one place, are:
sudo gpart create -s gpt ada0
sudo gpart add -b 34 -s 94 -t freebsd-boot -l disk_Z2T4KGYASTZ6 ada0
sudo gpart add -s 8g -t freebsd-swap -l swap6a ada0
sudo gpart add -t freebsd-zfs -s 2784G -l disk_Z2T4KGYASTZ6data ada0
Where did I get those numbers? From the disk I am replacing.
$ gpart show ada0 ada1
=>        34  5860533101  ada0  GPT  (2.7T)
          34           6        - free -  (3.0K)
          40          88     1  freebsd-boot  (44K)
         128    16777216     2  freebsd-swap  (8.0G)
    16777344  5838471168     3  freebsd-zfs  (2.7T)
  5855248512     5284623        - free -  (2.5G)

=>        34  5860533101  ada1  GPT  (2.7T)
          34          94     1  freebsd-boot  (47K)
         128    16777216     2  freebsd-swap  (8.0G)
    16777344  5838471168     3  freebsd-zfs  (2.7T)
  5855248512     5284623        - free -  (2.5G)
If you take 5838471168 / 1024 / 1024 / 2, you get 2784G. That 5838471168 is the partition size in 512-byte sectors, not bytes: dividing by 2 converts sectors to KB, and the two divisions by 1024 take KB to MB and then to GB.
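The same arithmetic, spelled out with bc just to confirm the numbers:

$ echo "5838471168 / 2 / 1024 / 1024" | bc
2784
$ echo "5838471168 * 512" | bc    # the partition size in bytes
2989297238016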
Or to save time and ensure accuracy, you can copy the gpart information from one drive to another.
[dan@knew:~] $ gpart backup ada1 | sudo gpart restore ada0
[dan@knew:~] $ gpart show ada0 ada1
=>        34  5860533101  ada0  GPT  (2.7T)
          34          94     1  freebsd-boot  (47K)
         128    16777216     2  freebsd-swap  (8.0G)
    16777344  5838471168     3  freebsd-zfs  (2.7T)
  5855248512     5284623        - free -  (2.5G)

=>        34  5860533101  ada1  GPT  (2.7T)
          34          94     1  freebsd-boot  (47K)
         128    16777216     2  freebsd-swap  (8.0G)
    16777344  5838471168     3  freebsd-zfs  (2.7T)
  5855248512     5284623        - free -  (2.5G)
[dan@knew:~] $
The trouble with this approach: it does not create GPT labels. I used the previous method.
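If you do go the backup/restore route, the labels can be added afterwards with gpart modify. A sketch, using the partition indices from the output above and the same label names as before:

$ sudo gpart modify -i 1 -l disk_Z2T4KGYASTZ6 ada0
$ sudo gpart modify -i 2 -l swap6a ada0
$ sudo gpart modify -i 3 -l disk_Z2T4KGYASTZ6data ada0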
Replacing the drive – failed attempt
According to zpool(8), zpool replace is a good option here. I have room for a spare drive in the system, so I just have to tell zpool to replace drive A with drive B.
$ sudo zpool replace system gpt/disk6 gpt/disk_Z2T4KGYASTZ6
cannot replace gpt/disk6 with gpt/disk_Z2T4KGYASTZ6: device is too small
$ gpart show ada0 ada1
=>        34  5860533101  ada0  GPT  (2.7T)
          34          94     1  freebsd-boot  (47K)
         128    16777216     2  freebsd-swap  (8.0G)
    16777344  5838471168     3  freebsd-zfs  (2.7T)
  5855248512     5284623        - free -  (2.5G)

=>        34  5860533101  ada1  GPT  (2.7T)
          34          94     1  freebsd-boot  (47K)
         128    16777216     2  freebsd-swap  (8.0G)
    16777344  5838471168     3  freebsd-zfs  (2.7T)
  5855248512     5284623        - free -  (2.5G)
Wait. They are the same size.
Ahh, I’m using the wrong device, as Allan Jude pointed out (on IRC).
OK, this time I got a very different message when I appended data to the device name.
$ sudo zpool replace system gpt/disk_Z2T4KGYASTZ6data
invalid vdev specification
use '-f' to override the following errors:
/dev/gpt/disk_Z2T4KGYASTZ6data is part of potentially active pool 'system'
This is zpool trying to save you from accidentally placing a disk from one pool into another. Remember how I said this disk had been previously used? Well, this is that ghost coming back to haunt me, again. In this case, it is lingering ZFS label information from previous use. ZFS writes four copies of its label on each vdev, two near the start and two near the end of the device, which is why zdb reports on labels 0 through 3 below; here the two at the front are already gone, but the two at the end survived.
Searching for that message, I found I had encountered it 6 years earlier. That old post led me to run this command:
$ sudo zdb -l /dev/gpt/disk_Z2T4KGYASTZ6data
--------------------------------------------
LABEL 0
--------------------------------------------
failed to unpack label 0
--------------------------------------------
LABEL 1
--------------------------------------------
failed to unpack label 1
--------------------------------------------
LABEL 2
--------------------------------------------
    version: 5000
    name: 'system'
    state: 0
    txg: 6915674
    pool_guid: 11353391169725922550
    hostid: 3600270990
    hostname: ''
    top_guid: 5976699353168341310
    guid: 2491056538036708260
    vdev_children: 1
    vdev_tree:
        type: 'mirror'
        id: 0
        guid: 5976699353168341310
        metaslab_array: 30
        metaslab_shift: 34
        ashift: 12
        asize: 2995734970368
        is_log: 0
        create_txg: 4
        children[0]:
            type: 'disk'
            id: 0
            guid: 12249913144658353338
            path: '/dev/gpt/disk0'
            phys_path: '/dev/gpt/disk0'
            whole_disk: 1
            DTL: 146
            create_txg: 4
        children[1]:
            type: 'disk'
            id: 1
            guid: 2491056538036708260
            path: '/dev/gpt/disk1'
            phys_path: '/dev/gpt/disk1'
            whole_disk: 1
            DTL: 145
            create_txg: 4
        children[2]:
            type: 'disk'
            id: 2
            guid: 2449415577503190274
            path: '/dev/gpt/disk2'
            phys_path: '/dev/gpt/disk2'
            whole_disk: 1
            DTL: 144
            create_txg: 4
    features_for_read:
        com.delphix:hole_birth
--------------------------------------------
LABEL 3
--------------------------------------------
    version: 5000
    name: 'system'
    state: 0
    txg: 6915674
    pool_guid: 11353391169725922550
    hostid: 3600270990
    hostname: ''
    top_guid: 5976699353168341310
    guid: 2491056538036708260
    vdev_children: 1
    vdev_tree:
        type: 'mirror'
        id: 0
        guid: 5976699353168341310
        metaslab_array: 30
        metaslab_shift: 34
        ashift: 12
        asize: 2995734970368
        is_log: 0
        create_txg: 4
        children[0]:
            type: 'disk'
            id: 0
            guid: 12249913144658353338
            path: '/dev/gpt/disk0'
            phys_path: '/dev/gpt/disk0'
            whole_disk: 1
            DTL: 146
            create_txg: 4
        children[1]:
            type: 'disk'
            id: 1
            guid: 2491056538036708260
            path: '/dev/gpt/disk1'
            phys_path: '/dev/gpt/disk1'
            whole_disk: 1
            DTL: 145
            create_txg: 4
        children[2]:
            type: 'disk'
            id: 2
            guid: 2449415577503190274
            path: '/dev/gpt/disk2'
            phys_path: '/dev/gpt/disk2'
            whole_disk: 1
            DTL: 144
            create_txg: 4
    features_for_read:
        com.delphix:hole_birth
[dan@knew:~] $
To clear that out, you can use this command:
$ sudo zpool labelclear -f /dev/gpt/disk_Z2T4KGYASTZ6data
This confirms the labels are gone:
$ sudo zdb -l /dev/gpt/disk_Z2T4KGYASTZ6data
--------------------------------------------
LABEL 0
--------------------------------------------
failed to unpack label 0
--------------------------------------------
LABEL 1
--------------------------------------------
failed to unpack label 1
--------------------------------------------
LABEL 2
--------------------------------------------
failed to unpack label 2
--------------------------------------------
LABEL 3
--------------------------------------------
failed to unpack label 3
Replacing the drive – successful attempt
After beating back the ghosts of ZFS-past, I issued the replace command again:
$ sudo zpool replace system gpt/disk6 gpt/disk_Z2T4KGYASTZ6data
Make sure to wait until resilver is done before rebooting.

If you boot from pool 'system', you may need to update
boot code on newly attached disk 'gpt/disk_Z2T4KGYASTZ6data'.

Assuming you use GPT partitioning and 'da0' is your new boot disk
you may use the following command:

        gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 da0
That last recommendation is important. I will do that.
$ sudo gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada0
bootcode written to ada0
How are things looking?
$ zpool status
  pool: system
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Fri Feb 19 03:44:25 2016
        369M scanned out of 19.8T at 5.85M/s, (scan is slow, no estimated time)
        32.1M resilvered, 0.00% done
config:

        NAME                             STATE     READ WRITE CKSUM
        system                           ONLINE       0     0     0
          raidz2-0                       ONLINE       0     0     0
            gpt/disk0                    ONLINE       0     0     0
            gpt/disk1                    ONLINE       0     0     0
            gpt/disk2                    ONLINE       0     0     0
            gpt/disk3                    ONLINE       0     0     0
            gpt/disk4                    ONLINE       0     0     0
            gpt/disk5                    ONLINE       0     0     0
            replacing-6                  ONLINE       0     0     0
              gpt/disk6                  ONLINE       0     0     0
              gpt/disk_Z2T4KGYASTZ6data  ONLINE       0     0     0  (resilvering)
            gpt/disk7                    ONLINE       0     0     0
            gpt/disk8                    ONLINE       0     0     0
            gpt/disk9                    ONLINE       0     0     0

errors: No known data errors
This was at 10:45 PM.
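Rather than re-running zpool status by hand, a simple loop can be left in a spare terminal to track progress; a small sketch:

$ while true; do zpool status system | grep -A 2 'scan:'; sleep 600; done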
The morning after
At 6:38 AM the next day:
zpool status
  pool: system
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Fri Feb 19 03:44:25 2016
        2.46T scanned out of 19.8T at 90.9M/s, 55h37m to go
        240G resilvered, 12.42% done
config:

        NAME                             STATE     READ WRITE CKSUM
        system                           ONLINE       0     0     0
          raidz2-0                       ONLINE       0     0     0
            gpt/disk0                    ONLINE       0     0     0
            gpt/disk1                    ONLINE       0     0     0
            gpt/disk2                    ONLINE       0     0     0
            gpt/disk3                    ONLINE       0     0     0
            gpt/disk4                    ONLINE       0     0     0
            gpt/disk5                    ONLINE       0     0     0
            replacing-6                  ONLINE       0     0     0
              gpt/disk6                  ONLINE       0     0     0
              gpt/disk_Z2T4KGYASTZ6data  ONLINE       0     0     0  (resilvering)
            gpt/disk7                    ONLINE       0     0     0
            gpt/disk8                    ONLINE       0     0     0
            gpt/disk9                    ONLINE       0     0     0

errors: No known data errors
By about noon, it was at:
  scan: resilver in progress since Fri Feb 19 03:44:25 2016
        5.28T scanned out of 19.8T at 108M/s, 39h9m to go
        514G resilvered, 26.62% done
At 9 AM the following day:
  scan: resilver in progress since Fri Feb 19 03:44:25 2016
        14.3T scanned out of 19.8T at 122M/s, 13h9m to go
        1.36T resilvered, 72.33% done
Sunday
On Sunday morning, I found:
$ zpool status
  pool: system
 state: ONLINE
  scan: resilvered 1.88T in 51h0m with 0 errors on Sun Feb 21 06:44:49 2016
config:

        NAME                             STATE     READ WRITE CKSUM
        system                           ONLINE       0     0     0
          raidz2-0                       ONLINE       0     0     0
            gpt/disk0                    ONLINE       0     0     0
            gpt/disk1                    ONLINE       0     0     0
            gpt/disk2                    ONLINE       0     0     0
            gpt/disk3                    ONLINE       0     0     0
            gpt/disk4                    ONLINE       0     0     0
            gpt/disk5                    ONLINE       0     0     0
            gpt/disk_Z2T4KGYASTZ6data    ONLINE       0     0     0
            gpt/disk7                    ONLINE       0     0     0
            gpt/disk8                    ONLINE       0     0     0
            gpt/disk9                    ONLINE       0     0     0
Relabelling
I will try relabelling to make sure the label matches the correct serial number. Watch this space for an update.
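For reference, gpart can rename a GPT label in place, so the relabelling might look something like the sketch below. This is untested here; the pool currently refers to the old label path, so it would need to be done carefully (for example with the disk offlined or the pool exported), and Z2T3BAXAS is the serial number reported by smartctl above:

$ sudo gpart modify -i 3 -l disk_Z2T3BAXASdata ada0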
Disk prep
If I were doing this again, I would prepare the disk first. I should do that to all of my disks which have been used and are waiting to be reused. It saves time and avoids confusion at a moment when clear thinking matters most.
Be careful with these commands. Issue them on the correct drive.
I would:
- wipe the labels – sudo zpool labelclear -f /dev/gpt/disk_Z2T4KGYASTZ6data
- clear the partitions – sudo gpart destroy -F ada0
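Put together as a small script, that preparation might look like the sketch below. The device names are examples carried over from this post and must be double-checked before running:

#!/bin/sh
# disk-prep sketch: wipe leftover ZFS labels, then the partition table.
# DISK and ZFS_PART are examples; set them to the drive you intend to wipe.
DISK=ada0
ZFS_PART=/dev/gpt/disk_Z2T4KGYASTZ6data

echo "About to wipe ${ZFS_PART} and destroy the partition table on ${DISK}."
echo "Press Enter to continue, Ctrl-C to abort."
read dummy

zpool labelclear -f ${ZFS_PART}   # remove old ZFS labels while the /dev/gpt entry still exists
gpart destroy -F ${DISK}          # then drop the GPT partition table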