Date:      Wed, 31 Jan 1996 21:46:55 -0800 (PST)
From:      Jaye Mathisen <mrcpu@cdsnet.net>
To:        Joe Greco <jgreco@solaria.sol.net>
Cc:        hackers@freebsd.org
Subject:   Re: No SCSI recovery - yet another gripe
Message-ID:  <Pine.BSF.3.91.960131214633.15664W-100000@schizo.cdsnet.net>
In-Reply-To: <199602010504.XAA28690@solaria.sol.net>

Justin once mentioned that he was working on the recovery code.  I'm
seeing the same hang here as well.

On Wed, 31 Jan 1996, Joe Greco wrote:

> This is the second time this week my news box has frozen with a SCSI error
> of some sort on the screen.  This time:
> 
> ahc1: target 3, lun0 (sd23) timed out
> sd23(aha1:3:0): BUS DEVICE RESET message queued.
> ahc1:A:3: no active SCB for reconnecting target - issuing ABORT
> SAVED_TCL = 0x30
> ahc1: target 3, lun0 (sd23) timed out
> _
> 
> The SCSI system works GREAT when all is fine and dandy.  However, this sort
> of error "recovery" sucks - a panic and reboot is preferable to a dead
> freeze.
> 
> In all likelihood the cause is the relative unreliability of drive power
> connectors: the odds that all 14 of them on news.sol.net work perfectly
> are less than 100%...  so I will tackle the problem from a hardware
> standpoint, as I believe the source is a loose power connection
> somewhere.  On the other hand, consider this a
> plea for the SCSI gods to improve the error handling somehow!  I hear great
> games talked on -hackers and all, layered device independent error handling,
> etc...  a free beer to the person(s) who implement(s) it.  ;-)
> 
> For kicks, I have been known to take a SCSI disk and unplug it from a
> Solaris based system while the system is running.  The grace with which it
> attempts to deal with the crisis is admirable.  Sometimes the system even
> continues to work if I plug the drive back in...  :-)  I don't expect that
> anybody has the time or effort to spare to implement error recovery to this
> sort of level, but the current "lock'n'hang" is a little too far to the
> opposite extreme...  
> 
> Thanks and good evening,
> 
> ... Joe
> 
> -------------------------------------------------------------------------------
> Joe Greco - Systems Administrator			      jgreco@ns.sol.net
> Solaria Public Access UNIX - Milwaukee, WI			   414/342-4847
> 


