Date:      Thu, 23 Jun 2016 10:54:57 -0400
From:      Dan Langille <dan@langille.org>
To:        freebsd-scsi@freebsd.org
Subject:   Re: terminated ioc 804b scsi 0 state c xfer 0
Message-ID:  <FAF8CCD7-F525-44FD-A195-ACF7584F46A0@langille.org>
In-Reply-To: <DD765878-E988-4B11-B4E6-6E10FEC5B5BE@langille.org>
References:  <2E8752E5-76AF-4042-86D9-8C6733658A80@langille.org> <5EEF0794-B06E-4A72-89DA-7DCD94AE1FC6@langille.org> <072CEC8B-9392-4378-8DF5-63D05901850B@langille.org> <0d7401d19f10$ee329300$ca97b900$@broadcom.com> <F4C52A2E-AC8D-4A6D-BA93-0D96C9090251@langille.org> <068601d1bb57$f675f710$e361e530$@broadcom.com> <DD765878-E988-4B11-B4E6-6E10FEC5B5BE@langille.org>

> On May 31, 2016, at 12:20 PM, Dan Langille <dan@langille.org> wrote:
>
>> On May 31, 2016, at 12:17 PM, Stephen McConnell <stephen.mcconnell@broadcom.com> wrote:
>>
>>> -----Original Message-----
>>> From: owner-freebsd-scsi@freebsd.org [mailto:owner-freebsd-scsi@freebsd.org] On Behalf Of Dan Langille
>>> Sent: Monday, May 30, 2016 12:28 PM
>>> To: freebsd-scsi@freebsd.org
>>> Subject: Re: terminated ioc 804b scsi 0 state c xfer 0
>>>
>>>> On Apr 25, 2016, at 12:38 PM, Stephen McConnell <stephen.mcconnell@broadcom.com> wrote:
>>>>
>>>>> -----Original Message-----
>>>>> From: owner-freebsd-scsi@freebsd.org [mailto:owner-freebsd-scsi@freebsd.org] On Behalf Of Dan Langille
>>>>> Sent: Monday, April 25, 2016 9:40 AM
>>>>> To: freebsd-scsi@freebsd.org
>>>>> Subject: Re: terminated ioc 804b scsi 0 state c xfer 0
>>>>>
>>>>>> On Apr 25, 2016, at 8:17 AM, Dan Langille <dan@langille.org> wrote:
>>>>>>
>>>>>>> On Apr 24, 2016, at 9:35 AM, Dan Langille <dan@langille.org> wrote:
>>>>>>>
>>>>>>> More of the pasted output is also at https://gist.github.com/dlangille/1fa3135334089c6603e2ec5da946d9ae and added smartctl output.
>>>>>>>
>>>>>>> I have a FreeBSD 10.2-RELEASE-p14 box in which there is an LSI SAS2008 card.  It's running a zfs root system.
>>>>>>>
>>>>>>> This morning the system was unresponsive via ssh. Attempts to log in at the console did not yield a password prompt.
>>>>>>>
>>>>>>> A power cycle brought the system online.  Inspecting /var/log/messages, I found about 63,000 entries similar to those which appear below.
>>>>>>>
>>>>>>> zpool status of all are OK. A scrub is in progress for one pool (since before this issue arose). da7 is in that pool.
>>>>>>>
>>>>>>>=20
>>>>>>> Apr 24 11:25:55 knew kernel: (da7:mps1:0:17:0): READ(10). CDB: 28 00 8d 90 c6 18 00 00 10 00 length 8192 SMID 774 terminated ioc 804b scsi 0 state c xfer 0
>>>>>>> Apr 24 11:25:55 knew kernel: (da7:mps1:0:17:0): READ(10). CDB: 28 00 8b d9 97 70 00 00 20 00 length 16384 SMID 614 terminated ioc 804b scsi 0 state c xfer 0
>>>>>>> Apr 24 11:25:55 knew kernel: (da7:mps1:0:17:0): READ(10). CDB: 28 00 8b d9 97 50 00 00 20 00 length 16384 SMID 792 terminated ioc 804b scsi 0 state c xfer 0
>>>>>>> Apr 24 11:25:55 knew kernel: (da7:mps1:0:17:0): READ(10). CDB: 28 00 8b d9 97 08 00 00 20 00 length 16384 SMID 974 terminated ioc 804b scsi 0 state c xfer 0
>>>>>>> Apr 24 11:25:55 knew kernel: (da7:mps1:0:17:0): READ(10). CDB: 28 00 8b 6f ef 50 00 00 08 00 length 4096 SMID 674 terminated ioc 804b scsi 0 state c xfer 0
>>>>>>> Apr 24 11:25:55 knew kernel: (da7:mps1:0:17:0): WRITE(10). CDB: 2a 00 8b 0f a2 48 00 00 18 00 length 12288 SMID 177 terminated ioc 804b scsi 0 state c xfer 12288
>>>>>>> Apr 24 11:25:55 knew kernel: (da7:mps1:0:17:0): READ(10). CDB: 28 00 ab 8f a1 38 00 00 08 00 length 4096 SMID 908 terminated ioc 804b scsi 0 state c xfer 0
>>>>>>> Apr 24 11:25:56 knew kernel: (da7:mps1:0:17:0): READ(10). CDB: 28 00 8b d9 97 70 00 00 20 00 length 16384 SMID 376 terminated ioc 804b scsi 0 state c xfer 0
>>>>>>> Apr 24 11:25:56 knew kernel: (da7:mps1:0:17:0): READ(10). CDB: 28 00 8b d9 97 50 00 00 20 00 length 16384 SMID 172 terminated ioc 804b scsi 0 state c xfer 0
>>>>>>>
>>>>>>> Is this a cabling issue?  The drive is a SATA device (smartctl output in the URL above).  Anyone familiar with these errors?
>>>>>>
>>>>>> This morning:
>>>>>>
>>>>>> 13410079654596185797  REMOVED      0     0     0  was /dev/da7p3
>>>>>>
>>>>>> At least I know I'm looking for Serial Number: 13Q8PNBYS
>>>>>>
>>>>>> From the logs:
>>>>>>
>>>>>> Apr 25 05:34:50 knew kernel: da7 at mps1 bus 0 scbus1 target 17 lun 0
>>>>>> Apr 25 05:34:50 knew kernel: da7: <ATA TOSHIBA DT01ACA3 ABB0> s/n 13Q8PNBYS detached
>>>
>>> Just for the record, this happened again this morning. Fixed by power cycle.
>>>
>>> May 30 03:22:08 knew kernel: mps1: mpssas_prepare_remove: Sending reset for target ID 17
>>> May 30 03:22:10 knew kernel: da7 at mps1 bus 0 scbus1 target 17 lun 0
>>> May 30 03:22:10 knew kernel: da7: <ATA TOSHIBA DT01ACA3 ABB0> s/n 13Q8PNBYS detached
>>> May 30 03:22:10 knew kernel: (da7:mps1:0:17:0): READ(10). CDB: 28 00 8c 5c 91 c0 00 00 08 00 length 4096 SMID 179 terminated ioc 804b scsi 0 state c xfer 0
>>> May 30 03:22:10 knew kernel: (da7:mps1:0:17:0): WRITE(10). CDB: 2a 00 6b bf db a0 00 00 f0 00 length 122880 SMID 938 terminated ioc 804b scsi 0 state c xf(da7:mps1:0:17:0): READ(10). CDB: 28 00 8c 5c 91 c0 00 00 08 00
>>> May 30 03:22:10 knew kernel: er 122880
>>>
>> I just realized that you're using mps, not mpr.  The fix went into the mpr driver, but not mps yet.  It'll have to be ported over to mps.
>
> This hit me again last night.  Same drive again.  Power cycle cleared it.
>
> Now I'm wondering if it's heat or dud drive related.

It might be the heat.  It recurred three times today.  I replaced the SATA cable after the third incident.
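
For anyone trying to read the CDB bytes in the quoted log lines, here is a minimal sketch that decodes a READ(10) or WRITE(10) CDB into its starting LBA and transfer length. It is not part of the mps driver or any existing tool; the function name decode_rw10 is made up for illustration, and the 512-byte logical block size is an assumption (it is consistent with the "length" values in the log, e.g. 0x0010 blocks for length 8192).

```python
def decode_rw10(cdb_hex, block_size=512):
    """Decode a 10-byte READ(10)/WRITE(10) CDB given as a hex string.

    block_size is an assumption (512-byte logical blocks); adjust it
    if the drive reports a different logical sector size.
    """
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10:
        raise ValueError("expected a 10-byte CDB, got %d bytes" % len(cdb))

    opcodes = {0x28: "READ(10)", 0x2A: "WRITE(10)"}
    op = opcodes.get(cdb[0])
    if op is None:
        raise ValueError("not a READ(10)/WRITE(10) opcode: 0x%02x" % cdb[0])

    # Bytes 2-5: logical block address, big-endian.
    lba = int.from_bytes(cdb[2:6], "big")
    # Bytes 7-8: transfer length in logical blocks, big-endian.
    blocks = int.from_bytes(cdb[7:9], "big")

    return {"op": op, "lba": lba, "blocks": blocks, "bytes": blocks * block_size}


if __name__ == "__main__":
    # First READ(10) from the Apr 24 log excerpt.
    info = decode_rw10("28 00 8d 90 c6 18 00 00 10 00")
    print(info["op"], hex(info["lba"]), info["blocks"], info["bytes"])
```

Comparing the decoded LBAs across entries shows whether the terminated commands cluster in one region of the disk or are spread out, which may help when weighing a media problem against a cabling or controller one.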




