Date: Thu, 1 Feb 1996 10:54:42 -0600 (CST)
From: Joe Greco <jgreco@brasil.moneng.mei.com>
To: gibbs@freefall.freebsd.org (Justin T. Gibbs)
Cc: hackers@FreeBSD.org
Subject: Re: No SCSI recovery - yet another gripe
Message-ID: <199602011654.KAA09004@brasil.moneng.mei.com>
In-Reply-To: <199602010525.VAA23989@freefall.freebsd.org> from "Justin T. Gibbs" at Jan 31, 96 09:25:17 pm
> >This is the second time this week my news box has frozen with a SCSI error
> >of some sort on the screen.  This time:
> >
> >ahc1: target 3, lun0 (sd23) timed out
> >sd23(aha1:3:0): BUS DEVICE RESET message queued.
> >ahc1:A:3: no active SCB for reconnecting target - issuing ABORT
> >SAVED_TCL = 0x30
> >ahc1: target 3, lun0 (sd23) timed out
>
> Yup.  The error recovery code in the aic7xxx driver is especially
> bad because it has not been updated to match the recent stability
> fixes in the driver.

Bluhhhck.  :-/  :-(

> >On the other hand, consider this a
> >plea for the SCSI gods to improve the error handling somehow!  I hear great
> >games talked on -hackers and all, layered device independent error handling,
> >etc... a free beer to the person(s) who implement(s) it.  ;-)
>
> This will happen before 2.2 ships.  PowerPoint is nearing code complete,
> so my time is limited for another 7 days or so, but after that,
> my nights will be devoted to these problems.  The entire generic SCSI
> layer is in for a revamp with extra detail going toward error recovery
> and performance.

Cool!  :-)

> >For kicks, I have been known to take a SCSI disk and unplug it from a
> >Solaris based system while the system is running.  The grace with which it
> >attempts to deal with the crisis is admirable.  Sometimes the system even
> >continues to work if I plug the drive back in...  :-)  I don't expect that
> >anybody has the time or effort to spare to implement error recovery to this
> >sort of level,
>
> We need this level of robustness in order to be taken seriously IMHO.

Yes, I think so too, but then again, I realize this is a volunteer
operation and I would not be unhappy with a less than preferable action
such as a panic.  I do think that this is something that needs to be
addressed before FreeBSD is likely to take over the world.  ;-)

> As they say, "shit happens" on SCSI busses as well as in real life.
> Luckily we can anticipate what kinds of things will hit the fan with SCSI
> and hopefully do everything possible to recover.  My main concern is
> sufficient driver level documentation to make the error recovery reliable.
> I have all I need for the Adaptec aic7xxx cards since I control the
> firmware (Stephan I'm sure is in the same boat with the NCR), but for cards
> like the Buslogic and Ultrastore, I just don't know how well we can do.

Again, one can only ask so much out of volunteers.  We can try to support
LOTS of hardware and that is GOOD.  On the other hand, there is also
nothing wrong with certifying certain hardware as "FreeBSD Blessed" and
therefore saying it is preferable to use well documented hardware that is
fully supported over poorly documented hardware that got reverse-engineered.

Anyways, thanks and good luck.

... Joe

-------------------------------------------------------------------------------
Joe Greco - Systems Administrator                             jgreco@ns.sol.net
Solaria Public Access UNIX - Milwaukee, WI                         414/546-7968