Date:      Wed, 09 Dec 1998 15:28:28 -0800
From:      Julian Elischer <julian@whistle.com>
To:        Charlie Root <root@bigwoop.vicor-nb.com>
Cc:        cayford@bigwoop.vicor-nb.com, conor@bigwoop.vicor-nb.com, daver@bigwoop.vicor-nb.com, jason@idiom.com, larry@bigwoop.vicor-nb.com, lorraine@idiom.com, scott@bigwoop.vicor-nb.com, scsi@FreeBSD.ORG
Subject:   Re: any recovery from these disk errors?
Message-ID:  <366F079C.31DFF4F5@whistle.com>
References:  <199812092300.PAA00404@bigwoop.vicor-nb.com>

Charlie Root wrote:
> 
> We've been getting lots of scsi errors on freebsd here at vicor.  From
> what I can tell, they are unrecoverable.  I'm hoping that you'll have
> some additional insight/ideas to try.
> 
> We tried moving the disk to another machine -- same errors;  we
> tried to change the bios settings which also didn't help.
> 
> thanks
> 
> -lorraine

Venner?



> 
> Dec  9 09:22:50 bigwoop /kernel.old: sd2(ahc0:2:0): NOT READY asc:4,1
> Dec  9 09:22:50 bigwoop /kernel.old: sd2(ahc0:2:0):  Logical unit is in process of becoming ready field replaceable unit: 2
> Dec  9 09:22:50 bigwoop /kernel.old: , retries:2

Are there any retry counts reported other than 2?

It abandons the operation after 4 retries, so if you only see a 2nd
retry it probably got its act together and successfully completed
the operation. This is under what? 2.2.7?
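
One quick way to answer the retry question is to tally the status suffixes in the log. The following is only a rough sketch: it assumes the ", retries:N" and ", FAILURE" suffixes always appear in the form shown in the quoted messages, and the function name summarize_status is made up for illustration.

```python
import re
from collections import Counter

# Matches the trailing status the kernel prints after a sense report,
# e.g. ", retries:2" or ", FAILURE" (format assumed from the quoted log).
STATUS_RE = re.compile(r",\s*(?:retries:(\d+)|(FAILURE))\s*$")

def summarize_status(lines):
    """Count how often each retry value and FAILURE appears (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        m = STATUS_RE.search(line)
        if not m:
            continue  # not a status line, e.g. "NOT READY asc:4,1"
        if m.group(1) is not None:
            counts["retries:" + m.group(1)] += 1
        else:
            counts["FAILURE"] += 1
    return counts

sample = [
    "Dec  9 09:22:50 bigwoop /kernel.old: sd2(ahc0:2:0): NOT READY asc:4,1",
    "Dec  9 09:22:50 bigwoop /kernel.old: , retries:2",
    "Dec  9 09:22:51 bigwoop /kernel.old: , FAILURE",
    "Dec  9 09:22:51 bigwoop /kernel.old: , retries:2",
]
print(dict(summarize_status(sample)))
```

If only "retries:2" ever shows up without a matching FAILURE, the operation most likely completed before the retry limit was reached.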

The device being not ready is interesting.. the DRIVE itself 
supplies this information... what kind of device is it? One of 
the RAIDs? Possibly it's busy handling a bad-block remapping or
something.

> Dec  9 09:22:50 bigwoop /kernel.old: sd2(ahc0:2:0): NOT READY asc:4,1
> Dec  9 09:22:50 bigwoop /kernel.old: sd2(ahc0:2:0):  Logical unit is in process of becoming ready field replaceable unit: 2
> 
> Dec  9 09:22:51 bigwoop /kernel.old: , FAILURE
> Dec  9 09:22:51 bigwoop /kernel.old: sd2(ahc0:2:0): SCB 0x2 - timed out while idle, LASTPHASE == 0x1, SCSISIGI == 0x0

This was a known bug in the ahc driver at one stage, not usually 'fatal'
however. Justin Gibbs (the author) is busy making his new stuff work in
3.0 so he's rather orphaned the 2.2 code, but he does respond to bug
reports.


> Dec  9 09:22:51 bigwoop /kernel.old: SEQADDR = 0x5 SCSISEQ = 0x12 SSTAT0 = 0x0 SSTAT1 = 0xa
> Dec  9 09:22:51 bigwoop /kernel.old: Ordered Tag queued


That's a normal operation.. I wonder why it reported it.


> Dec  9 09:22:51 bigwoop /kernel.old: sd2(ahc0:2:0): SCB 0x2 - timed out while idle, LASTPHASE == 0x1, SCSISIGI == 0x0
> Dec  9 09:22:51 bigwoop /kernel.old: SEQADDR = 0x6 SCSISEQ = 0x12 SSTAT0 = 0x0 SSTAT1 = 0xa
> Dec  9 09:22:51 bigwoop /kernel.old: sd2(ahc0:2:0): Queueing an Abort SCB
> Dec  9 09:22:51 bigwoop /kernel.old: sd2(ahc0:2:0): Abort Message Sent
> Dec  9 09:22:51 bigwoop /kernel.old: sd2(ahc0:2:0): SCB 0x2 - timed out in message out phase, SCSISIGI == 0xa4
> Dec  9 09:22:51 bigwoop /kernel.old: SEQADDR = 0xa1 SCSISEQ = 0x12 SSTAT0 = 0x5 SSTAT1 = 0x2
> Dec  9 09:22:51 bigwoop /kernel.old: sd2(ahc0:2:0): no longer in timeout
> Dec  9 09:22:51 bigwoop /kernel.old: ahc0: Issued Channel A Bus Reset. 1 SCBs aborted
> Dec  9 09:22:51 bigwoop /kernel.old: sd2(ahc0:2:0): UNIT ATTENTION asc:29,0
> Dec  9 09:22:51 bigwoop /kernel.old: sd2(ahc0:2:0):  Power on, reset, or bus device reset occurred field replaceable unit: 1
> Dec  9 09:22:51 bigwoop /kernel.old: , retries:2
> Dec  9 09:22:51 bigwoop /kernel.old: sd2(ahc0:2:0): NOT READY asc:4,1
> Dec  9 09:22:52 bigwoop /kernel.old: sd2(ahc0:2:0):  Logical unit is in process of becoming ready field replaceable unit: 2
> 
> ... .
> 

Does "..." mean that you got the same error again immediately, or more of
the same after some delay?
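
If it helps to check, the gaps between successive messages can be computed from the syslog timestamps. This is a minimal sketch, assuming every line starts with a "Mon DD HH:MM:SS" stamp as in the quoted log; syslog omits the year, so 1998 is supplied as a default, and gaps_between is a hypothetical helper name.

```python
from datetime import datetime

def gaps_between(lines, year=1998):
    """Return the seconds elapsed between consecutive log lines (hypothetical helper)."""
    times = []
    for line in lines:
        # The first three whitespace-separated fields are month, day, time.
        stamp = " ".join(line.split()[:3])
        times.append(datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S"))
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]

sample = [
    "Dec  9 09:22:50 bigwoop /kernel.old: sd2(ahc0:2:0): NOT READY asc:4,1",
    "Dec  9 09:22:51 bigwoop /kernel.old: , FAILURE",
    "Dec  9 09:22:52 bigwoop /kernel.old: sd2(ahc0:2:0):  Logical unit is in process of becoming ready field replaceable unit: 2",
]
print(gaps_between(sample))
```

Long gaps between repeats would suggest the drive recovers for a while before going not-ready again; back-to-back repeats would suggest it never becomes ready at all.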


> thanks again...



