mps0: IOC Fault 0x40007e23, Resetting

Here I am, sitting on a beach, writing a blog post, and sipping a cool adult beverage. Reading email.

I see this:

Aug 14 08:49:25 knew kernel: mps0: IOC Fault 0x40007e23, Resetting
Aug 14 08:49:25 knew kernel: mps0: Reinitializing controller
Aug 14 08:49:25 knew kernel: mps0: Firmware: 20.00.07.00, Driver: 21.02.00.00-fbsd
Aug 14 08:49:25 knew kernel: mps0: IOCCapabilities: 5a85c<ScsiTaskFull,DiagTrace,SnapBuf,EEDP,TransRetry,EventReplay,MSIXIndex,HostDisc>
Aug 14 08:49:25 knew kernel: (da10:mps0:0:20:0): Invalidating pack
Aug 14 08:49:25 knew kernel: da10 at mps0 bus 0 scbus0 target 20 lun 0
Aug 14 08:49:25 knew kernel: da10: <ATA TOSHIBA HDWE150 FP2A>  s/n 4728K24SF57D detached
Aug 14 08:49:25 knew kernel: GEOM_MIRROR: Device swap: provider da10p2 disconnected.
Aug 14 08:49:25 knew ZFS[2544]: vdev I/O failure, zpool=system path=/dev/da10p3 offset=270336 size=8192 error=6
Aug 14 08:49:33 knew kernel: (da10:mps0:0:20:0): Periph destroyed
Aug 14 08:49:33 knew ZFS[2576]: vdev state changed, pool_guid=15378250086669402288 vdev_guid=233954150417046622
Aug 14 08:49:33 knew ZFS[2580]: vdev is removed, pool_guid=15378250086669402288 vdev_guid=233954150417046622
Aug 14 08:49:33 knew kernel: da10 at mps0 bus 0 scbus0 target 20 lun 0
Aug 14 08:49:33 knew kernel: da10: <ATA TOSHIBA HDWE150 FP2A> Fixed Direct Access SPC-4 SCSI device
Aug 14 08:49:33 knew kernel: da10: Serial Number 4728K24SF57D
Aug 14 08:49:33 knew kernel: da10: 600.000MB/s transfers
Aug 14 08:49:33 knew kernel: da10: Command Queueing enabled
Aug 14 08:49:33 knew kernel: da10: 4769307MB (9767541168 512 byte sectors)
Aug 14 08:49:34 knew ZFS[2623]: vdev state changed, pool_guid=15378250086669402288 vdev_guid=233954150417046622

I quickly ssh into the host to check zpool status:

[knew dan ~] % zpool status
  pool: nvd
 state: ONLINE
  scan: scrub repaired 0B in 00:08:55 with 0 errors on Wed Aug 10 05:09:42 2022
config:
 
    NAME        STATE     READ WRITE CKSUM
    nvd         ONLINE       0     0     0
      mirror-0  ONLINE       0     0     0
        nvd0p1  ONLINE       0     0     0
        nvd1p1  ONLINE       0     0     0
 
errors: No known data errors
 
  pool: system
 state: ONLINE
  scan: resilvered 56K in 00:00:05 with 0 errors on Sun Aug 14 08:49:48 2022
config:
 
    NAME        STATE     READ WRITE CKSUM
    system      ONLINE       0     0     0
      raidz2-0  ONLINE       0     0     0
        da3p3   ONLINE       0     0     0
        da10p3  ONLINE       0     0     0
        da9p3   ONLINE       0     0     0
        da2p3   ONLINE       0     0     0
        da13p3  ONLINE       0     0     0
        da15p3  ONLINE       0     0     0
        da11p3  ONLINE       0     0     0
        da14p3  ONLINE       0     0     0
        da8p3   ONLINE       0     0     0
        da7p3   ONLINE       0     0     0
      raidz2-1  ONLINE       0     0     0
        da5p1   ONLINE       0     0     0
        da6p1   ONLINE       0     0     0
        da19p1  ONLINE       0     0     0
        da12p1  ONLINE       0     0     0
        da4p1   ONLINE       0     0     0
        da1p1   ONLINE       0     0     0
        da22p1  ONLINE       0     0     0
        da16p1  ONLINE       0     0     0
        da0p1   ONLINE       0     0     0
        da18p1  ONLINE       0     0     0
 
errors: No known data errors
 
  pool: tank_fast01
 state: ONLINE
  scan: scrub repaired 0B in 00:09:00 with 0 errors on Wed Aug 10 05:10:05 2022
config:
 
    NAME                             STATE     READ WRITE CKSUM
    tank_fast01                      ONLINE       0     0     0
      mirror-0                       ONLINE       0     0     0
        gpt/S3Z8NB0KB11776R.Slot.11  ONLINE       0     0     0
        gpt/S3Z8NB0KB11784L.Slot.05  ONLINE       0     0     0
 
errors: No known data errors
 
  pool: zroot
 state: ONLINE
  scan: scrub repaired 0B in 00:00:37 with 0 errors on Wed Aug 10 05:01:43 2022
config:
 
    NAME        STATE     READ WRITE CKSUM
    zroot       ONLINE       0     0     0
      mirror-0  ONLINE       0     0     0
        ada1p3  ONLINE       0     0     0
        ada0p3  ONLINE       0     0     0
 
errors: No known data errors

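Lines 15-17 of that output (the pool, state, and scan lines for the system pool) are relevant. There was a resilver event, which took 5 seconds and completed at 08:49:48.

The last vdev state changed event in the logs occurred at 08:49:34.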

That all seems to tie in, time-wise.
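To make the timing check concrete, here is a small sketch that works backwards from the scan line to an approximate resilver start and compares it with the last "vdev state changed" log entry. The function names and the year applied to the syslog timestamp are assumptions for illustration (syslog lines carry no year), and zpool reports the duration only to the second, so the start time is approximate.

```python
import re
from datetime import datetime, timedelta

# Values copied from the output above.
SCAN_LINE = "scan: resilvered 56K in 00:00:05 with 0 errors on Sun Aug 14 08:49:48 2022"
LAST_STATE_CHANGE = "Aug 14 08:49:34"
ASSUMED_YEAR = 2022  # assumption: syslog timestamps omit the year


def resilver_window(scan_line):
    """Return (approximate start, end) of the resilver described by a scan line."""
    m = re.search(r"in (\d+):(\d+):(\d+) with \d+ errors on (.+)$", scan_line)
    if m is None:
        raise ValueError("unrecognised scan line")
    hours, minutes, seconds = (int(m.group(i)) for i in (1, 2, 3))
    end = datetime.strptime(m.group(4).strip(), "%a %b %d %H:%M:%S %Y")
    return end - timedelta(hours=hours, minutes=minutes, seconds=seconds), end


def syslog_time(stamp, year):
    """Parse a syslog-style timestamp, supplying the missing year."""
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


start, end = resilver_window(SCAN_LINE)
changed = syslog_time(LAST_STATE_CHANGE, ASSUMED_YEAR)
gap = (start - changed).total_seconds()
print(f"resilver ran from about {start.time()} to {end.time()}")
print(f"that is {gap:.0f} seconds after the last vdev state change")
```

Under those assumptions the resilver begins a few seconds after the drive reattaches, which is consistent with the sequence in the logs.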

More info than you want

This displays vdev guids:

[knew dan ~] % zpool status -g system
  pool: system
 state: ONLINE
  scan: resilvered 56K in 00:00:05 with 0 errors on Sun Aug 14 08:49:48 2022
config:
 
    NAME                      STATE     READ WRITE CKSUM
    system                    ONLINE       0     0     0
      17787792673755622491    ONLINE       0     0     0
        15077920823230281604  ONLINE       0     0     0
        233954150417046622    ONLINE       0     0     0
        15344441343903378304  ONLINE       0     0     0
        15656418522176711912  ONLINE       0     0     0
        5265466717725104926   ONLINE       0     0     0
        16216204204940261481  ONLINE       0     0     0
        15196254092467021631  ONLINE       0     0     0
        892331977375855894    ONLINE       0     0     0
        16797368702065798832  ONLINE       0     0     0
        5993655369518912555   ONLINE       0     0     0
      9085889268805187753     ONLINE       0     0     0
        5892227802261634203   ONLINE       0     0     0
        9332658639709199239   ONLINE       0     0     0
        250004220145174872    ONLINE       0     0     0
        6216472763074854678   ONLINE       0     0     0
        12795310201775582855  ONLINE       0     0     0
        13315402097660581553  ONLINE       0     0     0
        18428760864140250121  ONLINE       0     0     0
        13603535286907309607  ONLINE       0     0     0
        4677401754715191854   ONLINE       0     0     0
        1933292688604201684   ONLINE       0     0     0
 
errors: No known data errors
[knew dan ~] %

Line 11 of that output (233954150417046622, in the position where da10p3 appears in the earlier listing) shows the same vdev guid as the log entries.

Here is the zpool guid:

[knew dan ~] % zpool get guid system
NAME    PROPERTY  VALUE  SOURCE
system  guid      15378250086669402288  -
[knew dan ~] % 

That matches the pool_guid from the logs.
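The same cross-check can be done mechanically rather than by eye. The sketch below pulls the pool_guid and vdev_guid pairs out of the ZFS log lines and looks for them in (trimmed) copies of the zpool status -g and zpool get guid output shown above. The function names are hypothetical, and the embedded text is a shortened sample from this post, not live command output.

```python
import re

# Two of the ZFS log lines from above.
LOG_LINES = [
    "Aug 14 08:49:33 knew ZFS[2576]: vdev state changed, "
    "pool_guid=15378250086669402288 vdev_guid=233954150417046622",
    "Aug 14 08:49:33 knew ZFS[2580]: vdev is removed, "
    "pool_guid=15378250086669402288 vdev_guid=233954150417046622",
]

# Trimmed sample of the `zpool status -g system` config section.
STATUS_G = """\
    system                    ONLINE       0     0     0
      17787792673755622491    ONLINE       0     0     0
        15077920823230281604  ONLINE       0     0     0
        233954150417046622    ONLINE       0     0     0
"""

# The data row from `zpool get guid system`.
ZPOOL_GET_ROW = "system  guid      15378250086669402288  -"

GUID_RE = re.compile(r"pool_guid=(\d+) vdev_guid=(\d+)")


def guids_from_logs(lines):
    """Collect the distinct (pool_guid, vdev_guid) pairs mentioned in log lines."""
    return {m.groups() for m in map(GUID_RE.search, lines) if m}


def first_columns(text):
    """Return the first whitespace-separated field of each non-empty line."""
    return {line.split()[0] for line in text.splitlines() if line.strip()}


pool_guid_reported = ZPOOL_GET_ROW.split()[2]
for pool_guid, vdev_guid in guids_from_logs(LOG_LINES):
    print("vdev guid in status output:", vdev_guid in first_columns(STATUS_G))
    print("pool guid matches:", pool_guid == pool_guid_reported)
```

Against live data, the two sample strings would be replaced with the actual command output; the parsing here assumes the column layout shown in this post.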

My concern: why did this happen? Everything recovered just fine. But why?
