Date: Wed, 09 Dec 1998 15:28:28 -0800
From: Julian Elischer <julian@whistle.com>
To: Charlie Root <root@bigwoop.vicor-nb.com>
Cc: cayford@bigwoop.vicor-nb.com, conor@bigwoop.vicor-nb.com, daver@bigwoop.vicor-nb.com, jason@idiom.com, larry@bigwoop.vicor-nb.com, lorraine@idiom.com, scott@bigwoop.vicor-nb.com, scsi@FreeBSD.ORG
Subject: Re: any recovery from these disk errors?
Message-ID: <366F079C.31DFF4F5@whistle.com>
References: <199812092300.PAA00404@bigwoop.vicor-nb.com>

Charlie Root wrote:
>
> We've been getting lots of scsi errors on freebsd here at vicor. From
> what I can tell, they are unrecoverable. I'm hoping that you'll have
> some additional insight/ideas to try.
>
> We tried moving the disk to another machine -- same errors; we
> tried to change the bios settings which also didn't help.
>
> thanks
>
> -lorraine Venner?
>
> Dec 9 09:22:50 bigwoop /kernel.old: sd2(ahc0:2:0): NOT READY asc:4,1
> Dec 9 09:22:50 bigwoop /kernel.old: sd2(ahc0:2:0): Logical unit is in process of becoming ready field replaceable unit: 2
> Dec 9 09:22:50 bigwoop /kernel.old: , retries:2

Are there retries other than 2? It abandons the operation after 4
retries, so if you only see a 2nd retry it probably got its act
together and successfully completed the operation.

This is under what? 2.2.7?

The device being not ready is interesting. The DRIVE itself supplies
this information. What kind of device is it? One of the RAIDs?
Possibly it's busy handling a bad-block remapping or something.

> Dec 9 09:22:50 bigwoop /kernel.old: sd2(ahc0:2:0): NOT READY asc:4,1
> Dec 9 09:22:50 bigwoop /kernel.old: sd2(ahc0:2:0): Logical unit is in process of becoming ready field replaceable unit: 2
>
> Dec 9 09:22:51 bigwoop /kernel.old: , FAILURE
> Dec 9 09:22:51 bigwoop /kernel.old: sd2(ahc0:2:0): SCB 0x2 - timed out while idle, LASTPHASE == 0x1, SCSISIGI == 0x0

This was a known bug in the ahc driver at one stage, though not usually
'fatal'. Justin Gibbs (the author) is busy making his new stuff
work in 3.0, so he's rather orphaned the 2.2 code, but he does respond
to bug reports.

> Dec 9 09:22:51 bigwoop /kernel.old: SEQADDR = 0x5 SCSISEQ = 0x12 SSTAT0 = 0x0 SSTAT1 = 0xa
> Dec 9 09:22:51 bigwoop /kernel.old: Ordered Tag queued

That's a normal operation. I wonder why it reported it.

> Dec 9 09:22:51 bigwoop /kernel.old: sd2(ahc0:2:0): SCB 0x2 - timed out while idle, LASTPHASE == 0x1, SCSISIGI == 0x0
> Dec 9 09:22:51 bigwoop /kernel.old: SEQADDR = 0x6 SCSISEQ = 0x12 SSTAT0 = 0x0 SSTAT1 = 0xa
> Dec 9 09:22:51 bigwoop /kernel.old: sd2(ahc0:2:0): Queueing an Abort SCB
> Dec 9 09:22:51 bigwoop /kernel.old: sd2(ahc0:2:0): Abort Message Sent
> Dec 9 09:22:51 bigwoop /kernel.old: sd2(ahc0:2:0): SCB 0x2 - timed out in message out phase, SCSISIGI == 0xa4
> Dec 9 09:22:51 bigwoop /kernel.old: SEQADDR = 0xa1 SCSISEQ = 0x12 SSTAT0 = 0x5 SSTAT1 = 0x2
> Dec 9 09:22:51 bigwoop /kernel.old: sd2(ahc0:2:0): no longer in timeout
> Dec 9 09:22:51 bigwoop /kernel.old: ahc0: Issued Channel A Bus Reset. 1 SCBs aborted
> Dec 9 09:22:51 bigwoop /kernel.old: sd2(ahc0:2:0): UNIT ATTENTION asc:29,0
> Dec 9 09:22:51 bigwoop /kernel.old: sd2(ahc0:2:0): Power on, reset, or bus device reset occurred field replaceable unit: 1
> Dec 9 09:22:51 bigwoop /kernel.old: , retries:2
> Dec 9 09:22:51 bigwoop /kernel.old: sd2(ahc0:2:0): NOT READY asc:4,1
> Dec 9 09:22:52 bigwoop /kernel.old: sd2(ahc0:2:0): Logical unit is in process of becoming ready field replaceable unit: 2
>
> ... .
>

Does "..." mean that you got the same error again immediately, or more
of the same after some delay?

> thanks again...

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-scsi" in the body of the message
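
The retry question above (whether any value other than "retries:2" ever shows up, and how often an operation ends in FAILURE) can be checked mechanically against the full log. What follows is only a rough sketch in Python, not part of the kernel or any FreeBSD tool. The function name `summarize`, the regular expressions, and the rule that a bare ", retries:N" or ", FAILURE" line belongs to the most recently named sdN device are all assumptions made for illustration, based on the line layout quoted in this message.

```python
import re
from collections import defaultdict

# Assumed line shapes, taken from the log excerpt quoted in this thread.
DEVICE_RE = re.compile(r"\b(sd\d+)\(ahc\d+:\d+:\d+\):")
RETRY_RE = re.compile(r"retries:(\d+)")


def summarize(lines):
    """Collect the retries values and FAILURE count seen per device.

    Assumption: continuation lines (", retries:N" / ", FAILURE") refer to
    the device named on the nearest preceding line.
    """
    summary = defaultdict(lambda: {"retries": [], "failures": 0})
    current = None
    for line in lines:
        dev = DEVICE_RE.search(line)
        if dev:
            current = dev.group(1)
        if current is None:
            continue  # nothing to attribute this line to yet
        retry = RETRY_RE.search(line)
        if retry:
            summary[current]["retries"].append(int(retry.group(1)))
        elif line.rstrip().endswith(", FAILURE"):
            summary[current]["failures"] += 1
    return dict(summary)


# Lines copied from the excerpt quoted above.
SAMPLE = [
    "Dec 9 09:22:50 bigwoop /kernel.old: sd2(ahc0:2:0): NOT READY asc:4,1",
    "Dec 9 09:22:50 bigwoop /kernel.old: , retries:2",
    "Dec 9 09:22:50 bigwoop /kernel.old: sd2(ahc0:2:0): NOT READY asc:4,1",
    "Dec 9 09:22:51 bigwoop /kernel.old: , FAILURE",
]

if __name__ == "__main__":
    for device, info in summarize(SAMPLE).items():
        print(device, "retries seen:", info["retries"],
              "failures:", info["failures"])
```

Feeding it the whole of /var/log/messages (for example `summarize(open("/var/log/messages"))`) would show whether the retries value ever differs from 2. The script only reports the numbers printed in the log; it makes no claim about how the kernel counts them.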