Dec 14 2020

You might recall that suspect drive from the zpool replace on the weekend. Thomas Hurst suggested:

Might be worth overwriting the drive, try to encourage it to actually reallocate the sectors now the data on them is no longer needed.

I, being one to take advice from people on the internet, and Michael W Lucas, decided to try his suggestion.

The drive in question.

[dan@knew:~] $ tail /var/log/messages
Dec 14 00:00:00 knew newsyslog[88570]: logfile turned over
Dec 14 00:23:04 knew smartd[2124]: Device: /dev/da17 [SAT], 40 Currently unreadable (pending) sectors
Dec 14 00:53:04 knew syslogd: last message repeated 1 times
Dec 14 01:23:03 knew syslogd: last message repeated 1 times
[dan@knew:~] $ gpart show da17
=>        34  9767541101  da17  GPT  (4.5T)
          34           6        - free -  (3.0K)
          40  9766000000     1  freebsd-zfs  (4.5T)
  9766000040     1541095        - free -  (752M)

I start by making sure geli is loaded.

[dan@knew:~] $ kldload geom_eli
kldload: can't load geom_eli: Operation not permitted
[dan@knew:~] $ sudo kldload geom_eli
kldload: can't load geom_eli: module already loaded or in kernel

Why geli? Because Thomas suggested it.

OK, so let’s follow the documentation, in part. I omitted the -K /root/da2.key parameter.

[dan@knew:~] $ sudo geli init -s 4096 /dev/da17p1
Enter new passphrase: 
Reenter new passphrase: 

Metadata backup for provider /dev/da17p1 can be found in /var/backups/da17p1.eli
and can be restored with the following command:

        # geli restore /var/backups/da17p1.eli /dev/da17p1

[dan@knew:~] $ 

Then I started the dd:

[dan@knew:~] $ sudo dd if=/dev/zero of=/dev/da17p1 bs=1M 
^C1172+0 records in
1171+0 records out
1227882496 bytes transferred in 6.263732 secs (196030500 bytes/sec)

I control-C’d it so I could put time in front of the command and see how long it took.

Well, there’s the time, right there, in the output, without using time.

OK, let’s try this again:

[dan@knew:~] $ sudo dd if=/dev/zero of=/dev/da17p1 bs=1M

Some time later

Here is what I found the next morning:

[dan@knew:~] $ sudo dd if=/dev/zero of=/dev/da17p1 bs=1M
dd: /dev/da17p1: short write on character device
dd: /dev/da17p1: end of device
4768555+0 records in
4768554+1 records out
5000192000000 bytes transferred in 29778.616710 secs (167912165 bytes/sec)
[dan@knew:~] $ 
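As a quick sanity check on those numbers, here is a small sketch that ties the dd output back to the gpart listing. It assumes gpart is counting 512-byte sectors, which is an inference from the 4.5T it reports and not something shown in the output above.

```shell
#!/bin/sh
# Sanity-check the dd output against the gpart partition size.
sectors=9766000000      # size of da17p1 in sectors, from gpart show
sector_size=512         # assumption: gpart counts 512-byte sectors
bytes=$((sectors * sector_size))
echo "partition bytes: $bytes"

secs=29778              # elapsed seconds reported by dd, fraction dropped
echo "elapsed: $((secs / 3600))h $((secs % 3600 / 60))m $((secs % 60))s"

# bs=1M means 1048576-byte records: count the full ones and the final short write
mib=1048576
full=$((bytes / mib))
short=$((bytes % mib))
echo "full records: $full, last short write: $short bytes"
```

The byte count matches the 5000192000000 bytes dd reported, and the full-record count matches the 4768554+1 records out, so the short write and end-of-device messages are just dd reaching the end of the partition.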

Why geli? What gets written is not all zeros, and the theory is that’s faster than /dev/random.

Let’s look at the logs.

[dan@knew:~] $ tail /var/log/messages
Dec 14 05:23:04 knew syslogd: last message repeated 1 times
Dec 14 05:53:04 knew syslogd: last message repeated 1 times
Dec 14 06:23:04 knew smartd[2124]: Device: /dev/da17 [SAT], 40 Currently unreadable (pending) sectors
Dec 14 06:53:04 knew syslogd: last message repeated 1 times
Dec 14 07:23:03 knew syslogd: last message repeated 1 times
Dec 14 07:53:04 knew syslogd: last message repeated 1 times
Dec 14 08:23:03 knew smartd[2124]: Device: /dev/da17 [SAT], 40 Currently unreadable (pending) sectors
Dec 14 08:53:04 knew syslogd: last message repeated 1 times
Dec 14 09:23:03 knew syslogd: last message repeated 1 times
Dec 14 09:53:04 knew syslogd: last message repeated 1 times
[dan@knew:~] $ 

[dan@knew:~] $ date
Mon Dec 14 14:33:08 UTC 2020

There have been no smartd messages in the past 4.5 hours.

EDIT: make that the past 15.5 hours:

[dan@knew:~] $ tail -2  /var/log/messages.0
Dec 14 09:53:04 knew syslogd: last message repeated 1 times
Dec 15 00:00:00 knew newsyslog[67573]: logfile turned over
[dan@knew:~] $ 
[dan@knew:~] $ tail /var/log/messages
Dec 15 00:00:00 knew newsyslog[67573]: logfile turned over

Let’s do a diff on the before and after smartctl output.

[dan@pro02:~/tmp] $ diff -ruN before after
--- before	2020-12-14 09:39:39.000000000 -0500
+++ after	2020-12-14 09:37:33.000000000 -0500
@@ -3,22 +3,22 @@
   2 Throughput_Performance  0x0005   100   100   050    Pre-fail  Offline      -       0
   3 Spin_Up_Time            0x0027   100   100   001    Pre-fail  Always       -       9291
   4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       138
-  5 Reallocated_Sector_Ct   0x0033   100   100   050    Pre-fail  Always       -       12448
+  5 Reallocated_Sector_Ct   0x0033   100   100   050    Pre-fail  Always       -       12504
   7 Seek_Error_Rate         0x000b   100   100   050    Pre-fail  Always       -       0
   8 Seek_Time_Performance   0x0005   100   100   050    Pre-fail  Offline      -       0
-  9 Power_On_Hours          0x0032   001   001   000    Old_age   Always       -       43747
+  9 Power_On_Hours          0x0032   001   001   000    Old_age   Always       -       43795
  10 Spin_Retry_Count        0x0033   102   100   030    Pre-fail  Always       -       0
  12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       138
-191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       5680
+191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       5681
 192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       129
 193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       741
 194 Temperature_Celsius     0x0022   100   100   000    Old_age   Always       -       33 (Min/Max 15/51)
-196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       1435
-197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       40
+196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       1438
+197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       0
 198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       0
 199 UDMA_CRC_Error_Count    0x0032   200   253   000    Old_age   Always       -       7
 220 Disk_Shift              0x0002   100   100   000    Old_age   Always       -       0
-222 Loaded_Hours            0x0032   001   001   000    Old_age   Always       -       43578
+222 Loaded_Hours            0x0032   001   001   000    Old_age   Always       -       43626
 223 Load_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 224 Load_Friction           0x0022   100   100   000    Old_age   Always       -       0
 226 Load-in_Time            0x0026   100   100   000    Old_age   Always       -       203
[dan@pro02:~/tmp] $ 

From the above:

  1. Reallocated_Sector_Ct has increased by 56 from 12448 to 12504
  2. Current_Pending_Sector has dropped from 40 to 0
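If you would rather not pick the changes out of a diff by eye, something like the following could print the raw-value deltas directly. It is only a sketch: smart_delta is a name made up for this example, and it assumes both files hold the attribute table as printed by smartctl -A, with the ID in column 1, the name in column 2, and the raw value in column 10.

```shell
#!/bin/sh
# Print how each SMART raw value changed between two saved attribute tables.
# smart_delta is a hypothetical helper, not part of smartmontools.
# Assumes column 1 = attribute ID, column 2 = name, column 10 = raw value.
smart_delta() {
    awk '
        # First file: remember the raw value for every numeric attribute ID
        NR == FNR { if ($1 ~ /^[0-9]+$/) before[$1] = $10; next }
        # Second file: report attributes whose raw value differs
        $1 ~ /^[0-9]+$/ && ($1 in before) && $10 != before[$1] {
            printf "%s %s: %d -> %d (%+d)\n", $1, $2, before[$1], $10, $10 - before[$1]
        }
    ' "$1" "$2"
}
```

It would be called as smart_delta before after, using the same two files that went into the diff.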

So… opinions on this drive now?

Well, Thomas said:

When I said “geli onetime”, I literally meant the “geli onetime” command, which keeps keys in RAM for one-time use, such as for encrypted swap partitions.

Also you completely bypassed it there by not using the .eli device :)

Finally – wow, that’s a *lot* of reallocated sectors.
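For reference, a minimal sketch of what Thomas describes might look like the following. It has not been run against this drive, it destroys everything on the partition, and the run variable is a dry-run guard invented for this example so that the commands are only printed by default.

```shell
#!/bin/sh
# Sketch of the "geli onetime" approach. DESTROYS all data on the provider.
prov=da17p1          # the partition from this post; substitute your own
run=${RUN:-echo}     # hypothetical guard: prints commands unless RUN=sudo is set

# Attach with a random one-time key kept only in RAM; this creates /dev/${prov}.eli
$run geli onetime -s 4096 "/dev/${prov}"
# Write zeros through the .eli device so the disk itself receives ciphertext
$run dd if=/dev/zero of="/dev/${prov}.eli" bs=1M
# Detach the provider; the one-time key is discarded with it
$run geli detach "${prov}.eli"
```

The important difference from what I did is the of= target: writing to /dev/da17p1.eli goes through geli, while writing to /dev/da17p1 bypasses it and puts plain zeros on the disk.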

And Dag-Erling Smørgrav said:

dd if=/dev/zero would have sufficed

So there you go.
