Date:      Thu, 26 Jan 2012 10:41:27 +0100
From:      Peter Maloney <peter.maloney@brockmann-consult.de>
To:        freebsd-fs@freebsd.org
Subject:   Re: sanity check:  is 9211-8i, on 8.3, with IT firmware still "the one"
Message-ID:  <4F211FC7.3080709@brockmann-consult.de>
In-Reply-To: <4F197F8D.7010404@brockmann-consult.de>
References:  <alpine.BSF.2.00.1201191604510.19710@kozubik.com>		<4F192ADA.5020903@brockmann-consult.de>	<1327069331.29444.4.camel@btw.pki2.com> <4F197F8D.7010404@brockmann-consult.de>

On 01/20/2012 03:51 PM, Peter Maloney wrote:
> On 01/20/2012 03:22 PM, Dennis Glatting wrote:
>> I am having a problem with Seagate ST1000DL002 disks but I haven't yet
>> determined whether it is the disks themselves (they -- two of them, new
>> -- fail under a MB controller too).
> I happen to have some ST2000DL003 disks on hand (same as yours, but 2TB
> instead of 1, and I don't know what firmware)... I could try my hot pull
> test with them to see what happens.
Update: I tested it, and it fails much like the Crucial SSD with old
firmware, with two differences:

- With the SSD, I could still use smartctl to see the disk afterwards,
  but not with the Seagate Green. (I didn't verify this, but I think the
  SSD still has a /dev/da# device and the Seagate does not.)
- The Seagate Green never comes back at all. The SSD is reported as
  coming back, but with the error "daasync: Unable to attach to new
  device due to status 0x6", which makes the disk unusable until reboot.

So in the distant future, I will test the newest firmware (I am
currently using firmware CC45, I think, and yours is CC32), then email
Seagate about it. In the near future, I will not be using those disks
in ZFS.

Your disk is firmware CC32 I would assume:

da12: <ATA ST1000DL002-9TT1 *CC32*> Fixed Direct Access SCSI-6 device



Seagate Green

(insert device)
Jan 26 09:52:28 bcnas1bak kernel: mpssas_get_sas_address_for_sata_disk:
got SATA identify successfully for handle = 0x21 with try_count = 1
Jan 26 09:52:28 bcnas1bak kernel: SAS Address for SATA device =
1f605d2f7e735344
Jan 26 09:52:28 bcnas1bak kernel: mpssas_get_sas_address_for_sata_disk:
got SATA identify successfully for handle = 0x21 with try_count = 1
Jan 26 09:52:28 bcnas1bak kernel: da20 at mpslsi0 bus 0 scbus0 target 55
lun 0
Jan 26 09:52:28 bcnas1bak kernel: da20: <ATA ST2000DL003-9VT1 CC45>
Fixed Direct Access SCSI-6 device
Jan 26 09:52:28 bcnas1bak kernel: da20: 600.000MB/s transfers
Jan 26 09:52:28 bcnas1bak kernel: da20: Command Queueing enabled
Jan 26 09:52:28 bcnas1bak kernel: da20: 1907729MB (3907029168 512 byte
sectors: 255H 63S/T 243201C)
(insert another device, make a mirror vdev)
(pull device while writing to it)
Jan 26 09:53:53 bcnas1bak kernel: mpslsi0: mpssas_alloc_tm freezing simq
Jan 26 09:53:53 bcnas1bak kernel: mpslsi0: mpssas_lost_target targetid 55
Jan 26 09:53:53 bcnas1bak kernel: (da20:mpslsi0:0:55:0): lost device
Jan 26 09:53:54 bcnas1bak kernel: (da20:mpslsi0:0:55:0): WRITE(10). CDB:
2a 0 0 d 9c 8b 0 1 0 0 length 131072 SMID 232 terminated ioc 804b scsi 0
state c xfer 0
Jan 26 09:53:54 bcnas1bak kernel: (da20:mpslsi0:0:55:0): WRITE(10). CDB:
2a 0 0 d a2 8b 0 1 0 0 length 131072 SMID 856 terminated ioc 804b scsi 0
state c xfer 0
Jan 26 09:53:54 bcnas1bak kernel: (da20:mpslsi0:0:55:0): WRITE(10). CDB:
2a 0 0 d 9b 8b 0 1 0 0 length 131072 SMID 813 terminated ioc 804b scsi 0
state c xfer 0
Jan 26 09:53:54 bcnas1bak kernel: (da20:mpslsi0:0:55:0): WRITE(10). CDB:
2a 0 0 d a1 8b 0 1 0 0 length 131072 SMID 626 terminated ioc 804b scsi 0
state c xfer 0
Jan 26 09:53:54 bcnas1bak kernel: (da20:mpslsi0:0:55:0): WRITE(10). CDB:
2a 0 0 d a0 8b 0 1 0 0 length 131072 SMID 141 terminated ioc 804b scsi 0
state c xfer 0
Jan 26 09:53:54 bcnas1bak kernel: (da20:mpslsi0:0:55:0): WRITE(10). CDB:
2a 0 0 d 9f 8b 0 1 0 0 length 131072 SMID 250 terminated ioc 804b scsi 0
state c xfer 0
Jan 26 09:53:54 bcnas1bak kernel: (da20:mpslsi0:0:55:0): WRITE(10). CDB:
2a 0 0 d 9e 8b 0 1 0 0 length 131072 SMID 734 terminated ioc 804b scsi 0
state c xfer 0
Jan 26 09:53:54 bcnas1bak kernel: (da20:mpslsi0:0:55:0): WRITE(10). CDB:
2a 0 0 d 9d 8b 0 1 0 0 length 131072 SMID 531 terminated ioc 804b scsi 0
state c xfer 0
Jan 26 09:53:54 bcnas1bak kernel: (da20:mpslsi0:0:55:0): WRITE(10). CDB:
2a 0 0 d a3 8b 0 1 0 0 length 131072 SMID 260 terminated ioc 804b scsi 0
state c xfer 0
Jan 26 09:53:54 bcnas1bak kernel: (da20:mpslsi0:0:55:0): WRITE(10). CDB:
2a 0 0 d 9a 8b 0 1 0 0 length 131072 SMID 503 terminated ioc 804b scsi 0
state c xfer 0
Jan 26 09:53:54 bcnas1bak kernel: mpslsi0: IOCStatus = 0x4b while
resetting device 0x21
Jan 26 09:53:54 bcnas1bak kernel: mpslsi0: mpssas_free_tm releasing simq
Jan 26 09:53:54 bcnas1bak kernel: (da20:mpslsi0:0:55:0): Synchronize
cache failed, status == 0xa, scsi status == 0x0
Jan 26 09:53:54 bcnas1bak kernel: (da20:mpslsi0:0:55:0): removing device
entry
(put device back in)
(no further logs)
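
For anyone reading the terminated commands above: the CDB bytes can be decoded by hand. In a 10-byte READ(10)/WRITE(10) CDB, byte 0 is the opcode, bytes 2-5 are the big-endian LBA, and bytes 7-8 are the transfer length in blocks. Below is a minimal Python sketch of that decoding. The function name decode_rw10 and the 512-byte default block size are my own assumptions for illustration (the block size is taken from the "512 byte sectors" line in the attach messages); this is not anything the mps driver provides.

```python
def decode_rw10(cdb_hex, block_size=512):
    """Decode a READ(10)/WRITE(10) CDB given as space-separated hex bytes.

    block_size is an assumption (512 here, matching the da20 attach line).
    """
    cdb = [int(tok, 16) for tok in cdb_hex.split()]
    if len(cdb) != 10:
        raise ValueError("expected a 10-byte CDB, got %d bytes" % len(cdb))

    opcodes = {0x28: "READ(10)", 0x2A: "WRITE(10)"}
    if cdb[0] not in opcodes:
        raise ValueError("not a READ(10)/WRITE(10) opcode: 0x%02x" % cdb[0])

    # Bytes 2..5: logical block address, big-endian.
    lba = int.from_bytes(bytes(cdb[2:6]), "big")
    # Bytes 7..8: transfer length in blocks, big-endian.
    blocks = int.from_bytes(bytes(cdb[7:9]), "big")

    return {
        "op": opcodes[cdb[0]],
        "lba": lba,
        "blocks": blocks,
        "bytes": blocks * block_size,
    }


if __name__ == "__main__":
    # First terminated WRITE(10) from the Seagate log above.
    print(decode_rw10("2a 0 0 d 9c 8b 0 1 0 0"))
```

For the first WRITE(10) in the log this gives 256 blocks, i.e. 131072 bytes, which agrees with the "length 131072" the driver printed on the same line.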

And then I tried:
camcontrol reset 0:55:0 (or 0:0:55? I forget where the 0 goes), and it
said there was no device.
camcontrol reset 0:54:0 (this is the other disk of the same type that I
had in the same test mirror vdev), after which the kernel panicked, and
this appeared in /var/log/messages:
Jan 26 09:57:14 bcnas1bak kernel: mpslsi0: mpssas_action XPT_RESET_DEV



Crucial SSD with firmware 0001

(pull device while writing to it)
Jan 19 14:37:16 bcnas1bak kernel: (da20:mpslsi0:0:46:0): CAM status:
SCSI Status Error
Jan 19 14:37:16 bcnas1bak kernel: (da20:mpslsi0:0:46:0): SCSI status:
Check Condition
Jan 19 14:37:16 bcnas1bak kernel: (da20:mpslsi0:0:46:0): SCSI sense:
ABORTED COMMAND asc:47,3 (Information unit iuCRC error detected)
Jan 19 14:37:16 bcnas1bak kernel: (da20:mpslsi0:0:46:0): READ(10). CDB:
28 0 0 ce f2 19 0 0 ff 0 length 130560 SMID 292 terminated ioc 804b scsi
0 state 0 xfer 0
Jan 19 14:37:16 bcnas1bak kernel: (da20:mpslsi0:0:46:0): READ(10). CDB:
28 0 0 ce f4 d3 0 0 9e 0 length 80896 SMID 426 terminated ioc 804b scsi
0 state 0 xfer 0
Jan 19 14:37:16 bcnas1bak kernel: (da20:mpslsi0:0:46:0): READ(10). CDB:
28 0 0 ce f5 71 0 0 cf 0 length 105984 SMID 978 terminated ioc 804b scsi
0 state 0 xfer 0
Jan 19 14:37:16 bcnas1bak kernel: (da20:mpslsi0:0:46:0): READ(10). CDB:
28 0 0 ce f6 40 0 0 b2 0 length 91136 SMID 695 terminated ioc 804b scsi
0 state 0 xfer 0
Jan 19 14:37:16 bcnas1bak kernel: (da20:mpslsi0:0:46:0): READ(10). CDB:
28 0 0 ce f6 f2 0 0 9f 0 length 81408 SMID 792 terminated ioc 804b scsi
0 state 0 xfer 0
Jan 19 14:37:16 bcnas1bak kernel: (da20:mpslsi0:0:46:0): READ(10). CDB:
28 0 0 ce f3 df 0 0 f4 0 length 124928 SMID 615 terminated ioc 804b scsi
0 state 0 xfer 0
Jan 19 14:37:16 bcnas1bak kernel: (da20:mpslsi0:0:46:0): READ(10). CDB:
28 0 0 ce f3 18 0 0 c7 0 length 101888 SMID 645 terminated ioc 804b scsi
0 state 0 xfer 0
Jan 19 14:37:16 bcnas1bak kernel: (da20:mpslsi0:0:46:0): READ(10). CDB:
28 0 0 c2 83 ec 0 0 8 0 length 4096 SMID 163 terminated ioc 804b scsi 0
state 0 xfer 0
Jan 19 14:37:16 bcnas1bak kernel: (da20:mpslsi0:0:46:0): READ(10). CDB:
28 0 0 ce f8 61 0 0 b3 0 length 91648 SMID 222 terminated ioc 804b scsi
0 state 0 xfer 0
Jan 19 14:37:16 bcnas1bak kernel: (da20:mpslsi0:0:46:0): READ(10). CDB:
28 0 0 ce f9 14 0 0 ed 0 length 121344 SMID 651 terminated ioc 804b scsi
0 state 0 xfer 0
Jan 19 14:37:16 bcnas1bak kernel: (da20:mpslsi0:0:46:0): READ(10). CDB:
28 0 0 ce f1 91 0 0 1c 0
Jan 19 14:37:16 bcnas1bak kernel: (da20:mpslsi0:0:46:0): CAM status:
SCSI Status Error
Jan 19 14:37:16 bcnas1bak kernel: (da20:mpslsi0:0:46:0): SCSI status:
Check Condition
Jan 19 14:37:16 bcnas1bak kernel: (da20:mpslsi0:0:46:0): SCSI sense:
ABORTED COMMAND asc:47,3 (Information unit iuCRC error detected)
Jan 19 14:40:05 bcnas1bak kernel: (da20:mpslsi0:0:46:0): lost device
Jan 19 14:40:05 bcnas1bak kernel: mpslsi0: Reset aborted 21 commands
Jan 19 14:40:05 bcnas1bak kernel: mpslsi0: clearing target 46 handle 0x0024
Jan 19 14:40:05 bcnas1bak kernel: mpslsi0: mpssas_remove_complete on
handle 0x0024, IOCStatus= 0x0
Jan 19 14:40:05 bcnas1bak kernel: mpslsi0: mpssas_free_tm releasing simq
Jan 19 14:40:05 bcnas1bak kernel: (da20:mpslsi0:0:46:0): Synchronize
cache failed, status == 0x39, scsi status == 0x0
Jan 19 14:40:05 bcnas1bak kernel: (da20:mpslsi0:0:46:0): removing device
entry
(put device back in)
Jan 19 14:41:32 bcnas1bak kernel: mpssas_get_sas_address_for_sata_disk:
got SATA identify successfully for handle = 0x24 with try_count = 1
Jan 19 14:41:32 bcnas1bak kernel: SAS Address for SATA device =
d828161ba16c7889
Jan 19 14:41:33 bcnas1bak kernel: mpssas_get_sas_address_for_sata_disk:
got SATA identify successfully for handle = 0x24 with try_count = 1
Jan 19 14:41:33 bcnas1bak kernel: da20 at mpslsi0 bus 0 scbus0 target 46
lun 0
Jan 19 14:41:33 bcnas1bak kernel: da20: <ATA M4-CT256M4SSD2 0009> Fixed
Direct Access SCSI-6 device
Jan 19 14:41:33 bcnas1bak kernel: da20: 600.000MB/s transfers
Jan 19 14:41:33 bcnas1bak kernel: da20: Command Queueing enabled
Jan 19 14:41:33 bcnas1bak kernel: da20: 244198MB (500118192 512 byte
sectors: 255H 63S/T 31130C)
Jan 19 14:41:42 bcnas1bak kernel: pid 19175 (gpart), uid 0: exited on
signal 11 (core dumped)
Jan 19 14:42:30 bcnas1bak kernel: mpssas_get_sas_address_for_sata_disk:
got SATA identify successfully for handle = 0x18 with try_count = 1
Jan 19 14:42:30 bcnas1bak kernel: SAS Address for SATA device =
d828161ba16c748a
Jan 19 14:42:30 bcnas1bak kernel: mpssas_get_sas_address_for_sata_disk:
got SATA identify successfully for handle = 0x18 with try_count = 1
Jan 19 14:42:31 bcnas1bak kernel: cam_periph_alloc: attempt to
re-allocate valid device da10 rejected
*Jan 19 14:42:31 bcnas1bak kernel: daasync: Unable to attach to new
device due to status 0x6*
(no further logs)
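
As a sanity check that the re-attached device really is the same disk, the capacity line in the attach messages can be recomputed from the sector count. The sketch below is only an illustration: the helper name da_summary is hypothetical, and the 255 heads / 63 sectors-per-track values are simply copied from the "255H 63S/T" part of the log lines.

```python
def da_summary(sectors, sector_size=512, heads=255, sectors_per_track=63):
    """Return (size in MB, cylinder count) for a disk with the given sector count.

    Sizes use 1 MB = 1024 * 1024 bytes; both results are rounded down.
    """
    size_mb = sectors * sector_size // (1024 * 1024)
    cylinders = sectors // (heads * sectors_per_track)
    return size_mb, cylinders


if __name__ == "__main__":
    # Sector counts taken from the da20 attach lines in the logs.
    print("Seagate ST2000DL003:", da_summary(3907029168))
    print("Crucial M4 SSD:     ", da_summary(500118192))
```

With the sector counts from the logs, this reproduces the 1907729MB / 243201C figures for the Seagate and 244198MB / 31130C for the SSD.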

> What sort of failure is happening?
>
> Do you use a ZIL on a device other than an ST1000DL002?
>
> Please send output of
> smartctl -i
>
> (particularly interested in firmware version)
>


-- 

--------------------------------------------
Peter Maloney
Brockmann Consult
Max-Planck-Str. 2
21502 Geesthacht
Germany
Tel: +49 4152 889 300
Fax: +49 4152 889 333
E-mail: peter.maloney@brockmann-consult.de
Internet: http://www.brockmann-consult.de
--------------------------------------------



