3Ware device degraded

On supernews, a drive acted up overnight.

The main purpose of this post is for me to record the information. You might not find much useful here.

The host is running FreeBSD 12 and is a FreshPorts development box.

I saw this error in the logs:

Aug  8 03:17:15 supernews smartd[66288]: Device: /dev/twa0 [3ware_disk_00], ATA error count increased from 22 to 23

smartd emailed me because I set that up some time ago.

The email looked like this:

This message was generated by the smartd daemon running on:

   host name:  supernews
   DNS domain: example.org

The following warning/error was logged by the smartd daemon:

Device: /dev/twa0 [3ware_disk_00], ATA error count increased from 22 to 23

Device info:
WDC WD740GD-00FLC0, S/N:WD-WMAKE2379003, FW:33.08F33, 74.3 GB

For details see host's SYSLOG.

You can also use the smartctl utility for further investigation.
No additional messages about this problem will be sent.

I logged in and did some looking. I found the above-mentioned entry in /var/log/messages.

For the record, here is some of the stuff I saw:

[dan@supernews:~] $ sudo /usr/local/sbin/tw_cli info c0 u0

Unit     UnitType  Status         %RCmpl  %V/I/M  Port  Stripe  Size(GB)
------------------------------------------------------------------------
u0       RAID-10   DEGRADED*      -       -       -     64K     195.548   
u0-0     RAID-1    DEGRADED       -       -       -     -       -         
u0-0-0   DISK      DEGRADED       -       -       p0    -       65.1826   
u0-0-1   DISK      OK             -       -       p2    -       65.1826   
u0-1     RAID-1    OK             -       -       -     -       -         
u0-1-0   DISK      OK             -       -       p6    -       65.1826   
u0-1-1   DISK      OK             -       -       p5    -       65.1826   
u0-2     RAID-1    OK             -       -       -     -       -         
u0-2-0   DISK      OK             -       -       p3    -       65.1826   
u0-2-1   DISK      OK             -       -       p4    -       65.1826   
u0/v0    Volume    -              -       -       -     -       195.548   
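The table above can also be read mechanically, which is roughly what a monitoring check does. The following is a minimal sketch, not something that was run on supernews: the function name `find_bad_disks` and the `SAMPLE` constant are made up for illustration, and the sample text is a shortened copy of the output above. It assumes the eight-column layout shown here; other tw_cli versions may print different columns.

```python
# Shortened copy of the `tw_cli info c0 u0` output shown above.
SAMPLE = """\
Unit     UnitType  Status         %RCmpl  %V/I/M  Port  Stripe  Size(GB)
------------------------------------------------------------------------
u0       RAID-10   DEGRADED*      -       -       -     64K     195.548
u0-0     RAID-1    DEGRADED       -       -       -     -       -
u0-0-0   DISK      DEGRADED       -       -       p0    -       65.1826
u0-0-1   DISK      OK             -       -       p2    -       65.1826
u0-1     RAID-1    OK             -       -       -     -       -
u0-1-0   DISK      OK             -       -       p6    -       65.1826
"""


def find_bad_disks(text):
    """Return (subunit, status, port) for every DISK row that is not OK.

    Hypothetical helper: assumes the 8-column layout shown above
    (Unit, UnitType, Status, %RCmpl, %V/I/M, Port, Stripe, Size).
    """
    bad = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 8 and fields[1] == "DISK" and fields[2] != "OK":
            bad.append((fields[0], fields[2], fields[5]))
    return bad


if __name__ == "__main__":
    for subunit, status, port in find_bad_disks(SAMPLE):
        print(f"{subunit} on {port} is {status}")
```

For the sample, the only row reported is u0-0-0 on port p0, which matches what the table shows by eye.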

[dan@supernews:~] $ sudo tw_cli info c0

Unit  UnitType  Status         %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
------------------------------------------------------------------------------
u0    RAID-10   DEGRADED       -       -       64K     195.548   OFF    ON     
u1    SPARE     OK             -       -       -       69.2404   -      ON     
u2    SPARE     OK             -       -       -       69.2404   -      OFF    

Port   Status           Unit   Size        Blocks        Serial
---------------------------------------------------------------
p0     DEVICE-ERROR     u0     69.25 GB    145226112     WD-WMAKE2379003     
p1     OK               u1     69.25 GB    145226112     WD-WMAKE2379069     
p2     OK               u0     69.25 GB    145226112     WD-WMAKE2379066     
p3     OK               u0     69.25 GB    145226112     WD-WMAKE2379012     
p4     OK               u0     69.25 GB    145226112     WD-WMAKE2379286     
p5     OK               u0     69.25 GB    145226112     WD-WMAKE2379019     
p6     OK               u0     69.25 GB    145226112     WD-WMAKE2394339     
p7     OK               u2     69.25 GB    145226112     WD-WMAKE2378696     

Name  OnlineState  BBUReady  Status    Volt     Temp     Hours  LastCapTest
---------------------------------------------------------------------------
bbu   On           Yes       OK        OK       OK       255    20-Nov-2017  
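The port table is what ties the failing port to a physical drive: p0 reports DEVICE-ERROR, and its serial number is the same one smartd put in the email. As a rough sketch (the names `failing_ports`, `PORTS`, and `SMARTD_SERIAL` are invented for this example, and the sample is a shortened copy of the table above), the cross-check could look like this:

```python
# Shortened copy of the port table from `tw_cli info c0` shown above.
PORTS = """\
Port   Status           Unit   Size        Blocks        Serial
---------------------------------------------------------------
p0     DEVICE-ERROR     u0     69.25 GB    145226112     WD-WMAKE2379003
p1     OK               u1     69.25 GB    145226112     WD-WMAKE2379069
p2     OK               u0     69.25 GB    145226112     WD-WMAKE2379066
"""

# Serial number quoted in the smartd email above.
SMARTD_SERIAL = "WD-WMAKE2379003"


def failing_ports(text):
    """Map each port whose status is not OK to its drive serial number.

    Hypothetical helper: assumes rows split into 7 whitespace-separated
    fields (port, status, unit, size, "GB", blocks, serial).
    """
    result = {}
    for line in text.splitlines():
        fields = line.split()
        is_port_row = (
            len(fields) == 7
            and fields[0].startswith("p")
            and fields[0][1:].isdigit()
        )
        if is_port_row and fields[1] != "OK":
            result[fields[0]] = fields[6]
    return result


if __name__ == "__main__":
    for port, serial in failing_ports(PORTS).items():
        same = "matches" if serial == SMARTD_SERIAL else "differs from"
        print(f"{port}: {serial} ({same} the smartd report)")
```

Matching on the serial number rather than the port number is deliberate: it is the one identifier that appears in both the smartd email and the controller output.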

[dan@supernews:~] $ sudo tw_cli info c0 u0

Unit     UnitType  Status         %RCmpl  %V/I/M  Port  Stripe  Size(GB)
------------------------------------------------------------------------
u0       RAID-10   DEGRADED*      -       -       -     64K     195.548   
u0-0     RAID-1    DEGRADED       -       -       -     -       -         
u0-0-0   DISK      DEGRADED       -       -       p0    -       65.1826   
u0-0-1   DISK      OK             -       -       p2    -       65.1826   
u0-1     RAID-1    OK             -       -       -     -       -         
u0-1-0   DISK      OK             -       -       p6    -       65.1826   
u0-1-1   DISK      OK             -       -       p5    -       65.1826   
u0-2     RAID-1    OK             -       -       -     -       -         
u0-2-0   DISK      OK             -       -       p3    -       65.1826   
u0-2-1   DISK      OK             -       -       p4    -       65.1826   
u0/v0    Volume    -              -       -       -     -       195.548   

[dan@supernews:~] $ sudo /usr/local/sbin/tw_cli
//supernews> show all
Error: (CLI:041) Invalid shell command.

//supernews> /c0 show all
/c0 Driver Version = 3.80.06.003
/c0 Model = 9550SX-8LP
/c0 Available Memory = 112MB
/c0 Firmware Version = FE9X 3.08.00.029
/c0 Bios Version = BE9X 3.10.00.003
/c0 Boot Loader Version = BL9X 3.01.00.006
/c0 Serial Number = L20805B5500320
/c0 PCB Version = Rev 032
/c0 PCHIP Version = 1.60
/c0 ACHIP Version = 1.70
/c0 Number of Ports = 8
/c0 Number of Drives = 8
/c0 Number of Units = 3
/c0 Total Optimal Units = 2
/c0 Not Optimal Units = 1 
/c0 JBOD Export Policy = off
/c0 Disk Spinup Policy = 1
/c0 Spinup Stagger Time Policy (sec) = 1
/c0 Auto-Carving Policy = off
/c0 Auto-Carving Size = 2048 GB
/c0 Auto-Rebuild Policy = on
/c0 Rebuild Rate = 4
/c0 Verify Rate = 1
/c0 Controller Bus Type = PCI
/c0 Controller Bus Width = 64 bits
/c0 Controller Bus Speed = 66 Mhz

Unit  UnitType  Status         %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
------------------------------------------------------------------------------
u0    RAID-10   DEGRADED       -       -       64K     195.548   OFF    ON     
u1    SPARE     OK             -       -       -       69.2404   -      ON     
u2    SPARE     OK             -       -       -       69.2404   -      OFF    

Port   Status           Unit   Size        Blocks        Serial
---------------------------------------------------------------
p0     DEVICE-ERROR     u0     69.25 GB    145226112     WD-WMAKE2379003     
p1     OK               u1     69.25 GB    145226112     WD-WMAKE2379069     
p2     OK               u0     69.25 GB    145226112     WD-WMAKE2379066     
p3     OK               u0     69.25 GB    145226112     WD-WMAKE2379012     
p4     OK               u0     69.25 GB    145226112     WD-WMAKE2379286     
p5     OK               u0     69.25 GB    145226112     WD-WMAKE2379019     
p6     OK               u0     69.25 GB    145226112     WD-WMAKE2394339     
p7     OK               u2     69.25 GB    145226112     WD-WMAKE2378696     

Name  OnlineState  BBUReady  Status    Volt     Temp     Hours  LastCapTest
---------------------------------------------------------------------------
bbu   On           Yes       OK        OK       OK       255    20-Nov-2017  

I didn’t try to figure out how to start the rebuild, so I rebooted the server. Yeah, hackish.

If you know what I should have done, please let me know.

After the reboot

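Once the server was back up, the rebuild was already under way, now using the drive on port p1 (previously the spare, u1):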

[dan@supernews:~] $ uptime
10:21PM  up 2 mins, 1 user, load averages: 1.06, 0.36, 0.14
[dan@supernews:~] $ sudo tw_cli info c0 u0

Unit     UnitType  Status         %RCmpl  %V/I/M  Port  Stripe  Size(GB)
------------------------------------------------------------------------
u0       RAID-10   REBUILDING     68%     -       -     64K     195.548   
u0-0     RAID-1    REBUILDING     4%      -       -     -       -         
u0-0-0   DISK      DEGRADED       -       -       p1    -       65.1826   
u0-0-1   DISK      OK             -       -       p2    -       65.1826   
u0-1     RAID-1    OK             -       -       -     -       -         
u0-1-0   DISK      OK             -       -       p6    -       65.1826   
u0-1-1   DISK      OK             -       -       p5    -       65.1826   
u0-2     RAID-1    OK             -       -       -     -       -         
u0-2-0   DISK      OK             -       -       p3    -       65.1826   
u0-2-1   DISK      OK             -       -       p4    -       65.1826   
u0/v0    Volume    -              -       -       -     -       195.548   

By the time I’d finished typing all of the above:

[dan@supernews:~] $ sudo tw_cli info c0 u0

Unit     UnitType  Status         %RCmpl  %V/I/M  Port  Stripe  Size(GB)
------------------------------------------------------------------------
u0       RAID-10   REBUILDING     78%     -       -     64K     195.548   
u0-0     RAID-1    REBUILDING     35%     -       -     -       -         
u0-0-0   DISK      DEGRADED       -       -       p1    -       65.1826   
u0-0-1   DISK      OK             -       -       p2    -       65.1826   
u0-1     RAID-1    OK             -       -       -     -       -         
u0-1-0   DISK      OK             -       -       p6    -       65.1826   
u0-1-1   DISK      OK             -       -       p5    -       65.1826   
u0-2     RAID-1    OK             -       -       -     -       -         
u0-2-0   DISK      OK             -       -       p3    -       65.1826   
u0-2-1   DISK      OK             -       -       p4    -       65.1826   
u0/v0    Volume    -              -       -       -     -       195.548   

It took about 30 minutes to complete the rebuild:

[dan@supernews:~] $ uptime
10:47PM  up 29 mins, 2 users, load averages: 0.41, 0.37, 0.35

I notice that /var/log/messages puts it closer to 27 minutes:

Aug  8 22:20:22 supernews kernel: twa0: INFO: (0x04: 0x000B): Rebuild started: unit=0, subunit=0
Aug  8 22:47:40 supernews kernel: twa0: INFO: (0x04: 0x0005): Rebuild completed: unit=0, subunit=0
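The exact figure comes from subtracting the two timestamps above. A small sketch of that arithmetic follows; syslog lines carry no year, so `ASSUMED_YEAR` is a placeholder chosen only to make the timestamps parseable, and `elapsed_seconds` is a made-up helper name.

```python
from datetime import datetime

# Syslog timestamps omit the year; this value is an assumption and does
# not affect the result because both events fall on the same day.
ASSUMED_YEAR = 2019


def elapsed_seconds(start, end, year=ASSUMED_YEAR):
    """Return whole seconds between two syslog-style timestamps."""
    fmt = "%Y %b %d %H:%M:%S"
    t0 = datetime.strptime(f"{year} {start}", fmt)
    t1 = datetime.strptime(f"{year} {end}", fmt)
    return int((t1 - t0).total_seconds())


if __name__ == "__main__":
    # Timestamps copied from the two kernel log lines above.
    secs = elapsed_seconds("Aug  8 22:20:22", "Aug  8 22:47:40")
    print(f"{secs // 60} min {secs % 60} s")  # prints "27 min 18 s"
```

That is 1638 seconds, or 27 minutes 18 seconds, from "Rebuild started" to "Rebuild completed".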

But wait! There’s more!

When the above was completed, I went to Nagios and told it to recheck the faulted items. They came back clean, but a new one appeared: VERIFYING.

Checking the logs again, I found:

Aug  8 22:49:22 supernews kernel: twa0: INFO: (0x04: 0x0029): Verify started: unit=0, subunit=0
Aug  8 22:49:22 supernews kernel: twa0: INFO: (0x04: 0x0029): Verify started: unit=0, subunit=1
Aug  8 22:49:22 supernews kernel: twa0: INFO: (0x04: 0x0029): Verify started: unit=0, subunit=2

The current status is:

[dan@supernews:~] $ sudo tw_cli info c0 u0

Unit     UnitType  Status         %RCmpl  %V/I/M  Port  Stripe  Size(GB)
------------------------------------------------------------------------
u0       RAID-10   VERIFYING      -       15%     -     64K     195.548   
u0-0     RAID-1    VERIFYING      15%     -       -     -       -         
u0-0-0   DISK      OK             -       -       p1    -       65.1826   
u0-0-1   DISK      OK             -       -       p2    -       65.1826   
u0-1     RAID-1    VERIFYING      15%     -       -     -       -         
u0-1-0   DISK      OK             -       -       p6    -       65.1826   
u0-1-1   DISK      OK             -       -       p5    -       65.1826   
u0-2     RAID-1    VERIFYING      15%     -       -     -       -         
u0-2-0   DISK      OK             -       -       p3    -       65.1826   
u0-2-1   DISK      OK             -       -       p4    -       65.1826   
u0/v0    Volume    -              -       -       -     -       195.548   

[dan@supernews:~] $ 

I think this is similar to a zpool scrub.

Good night

Now it is time for beer and pizza. It’s Thursday night.
