Date: Thu, 3 Mar 2016 10:09:37 -0700
From: Scott Long <scott4long@yahoo.com>
To: Borja Marcos <borjam@sarenet.es>
Cc: Steven Hartland <killing@multiplay.co.uk>, FreeBSD-scsi <freebsd-scsi@freebsd.org>
Subject: Re: mpr(4) SAS3008 Repeated Crashing
Message-ID: <F5E05621-FF84-4BED-B1A7-3252715CD53B@yahoo.com>
In-Reply-To: <F9B68610-12C6-4D32-88CA-A34A185F9AD1@sarenet.es>
References: <56D5FDB8.8040402@freebsd.org> <56D612FA.6090909@multiplay.co.uk> <A8859ECA-0B58-42A8-AA49-DF6AA3D52CC6@sarenet.es> <E74F5225-1EA8-4B60-ADDC-7B13E1003184@yahoo.com> <D7E0BCCE-EB44-4EF9-8F17-474C162F7D7C@sarenet.es> <56D805FD.50500@multiplay.co.uk> <F9B68610-12C6-4D32-88CA-A34A185F9AD1@sarenet.es>

> On Mar 3, 2016, at 3:26 AM, Borja Marcos <borjam@sarenet.es> wrote:
>
>
>> On 03 Mar 2016, at 10:38, Steven Hartland <killing@multiplay.co.uk> wrote:
>>
>> We've seen HW issues before where the first thing to start triggering the problem was TRIM requests, it seems like its an afterthought in most FW's unfortunately, so one of the first things to go bad. I'm not saying this is you issue, but its something to keep in mind.
>
> Thanks :)
>
> Not trim related, it seems. I’ve ran the tests with kern.cam.X.delete_method set to DISABLE and I still see errors:
>
> In paranormal cases like this it would be awesome to have access to a logic analyzer… I keep dreaming of course :)
>
>
>
> Mar 3 11:12:53 clientes-ssd8 kernel: (noperiph:mpr0:0:4294967295:0): SMID 2 Aborting command 0xfffffe0000c7cab0
> Mar 3 11:12:54 clientes-ssd8 kernel: (da2:mpr0:0:26:0): READ(10). CDB: 28 00 26 d7 d0 98 00 00 20 00 length 16384 SMID 322 terminated ioc 804b scsi 0 state c xfer 0
> Mar 3 11:12:54 clientes-ssd8 kernel: (da2:mpr0:0:26:0): READ(10). CDB: 28 00 26 d7 d0 b8 00 00 18 00 length 12288 SMID 217 terminated ioc 804b scsi 0 state c xfer 0
> Mar 3 11:12:54 clientes-ssd8 kernel: (da2:mpr0:0:26:0): SYNCHRONIZE CACHE(10). CDB: 35 00 00 00 00 00 00 00 00 00 length 0 SMID 205 terminated ioc 804b scsi 0 sta(da2:mpr0:0:26:0): READ(10). CDB: 28 00 26 d7 d0 18 00 00 78 00
> Mar 3 11:12:54 clientes-ssd8 kernel: te c xfer 0
> Mar 3 11:12:54 clientes-ssd8 kernel: (da2:mpr0:0:26:0): CAM status: Command timeout
> Mar 3 11:12:54 clientes-ssd8 kernel: (da2:mpr0:0:26:0): Retrying command
> Mar 3 11:12:54 clientes-ssd8 kernel: (da2:mpr0:0:26:0): SYNCHRONIZE CACHE(10). CDB: 35 00 00 00 00 00 00 00 00 00
> Mar 3 11:12:54 clientes-ssd8 kernel: (da2:mpr0:0:26:0): CAM status: SCSI Status Error
> Mar 3 11:12:54 clientes-ssd8 kernel: (da2:mpr0:0:26:0): SCSI status: Check Condition
> Mar 3 11:12:54 clientes-ssd8 kernel: (da2:mpr0:0:26:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)
> Mar 3 11:12:54 clientes-ssd8 kernel: (da2:mpr0:0:26:0): Retrying command (per sense data)
>

SYNC CACHE seems to have been involved this time, and while it’s sometimes a source of trouble with SATA disks, I’m very hesitant to blame it. Given the seemingly random nature of your problems, I’m no longer as certain that a fault in the disk enclosure can be ruled out. This looks to be a different disk than in your last report, and your statement that a sibling system exhibits no problems is very interesting. Maybe there’s an issue with the power supply, and the disks are periodically getting under-voltage conditions.

If you can run smartctl against the disks, the output might be useful. Also, if you’re able, could you make sure that both this system and the one that is working well are being fed with sufficient and similar AC power? And if the power supply modules in your enclosures are swappable, maybe swap them between systems and see if the problem follows the module? If that doesn’t fix it then I’ll think of ways to provide more instrumentation.

Scott
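
One way to check whether the failures are spread across several disks (which would point toward the enclosure or power) rather than concentrated on one is to tally the "terminated" lines per device. The following is a minimal Python sketch, assuming the log format shown in the excerpt above; the function name tally_terminated and the regular expression are illustrative only and are not part of mpr(4) or any FreeBSD tool.

```python
import re
from collections import Counter

# Matches driver log lines of the form seen in the excerpt, e.g.
#   (da2:mpr0:0:26:0): READ(10). CDB: ... SMID 322 terminated ioc 804b scsi 0 ...
# The pattern is an assumption based on that excerpt, not a documented format.
TERMINATED = re.compile(
    r"\((?P<dev>da\d+):(?P<ctrl>mpr\d+):\d+:(?P<target>\d+):\d+\): .*?"
    r"SMID (?P<smid>\d+) terminated ioc (?P<ioc>[0-9a-f]+)"
)


def tally_terminated(lines):
    """Count terminated commands per (device, target, ioc status) tuple."""
    counts = Counter()
    for line in lines:
        # finditer copes with interleaved kernel output on a single line
        for m in TERMINATED.finditer(line):
            counts[(m["dev"], m["target"], m["ioc"])] += 1
    return counts
```

Feeding it the lines of the system log from both the failing machine and the healthy sibling would show at a glance which targets are affected; lines without a "terminated" record, such as the (noperiph:...) abort message, are ignored.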