Date: Sun, 20 Dec 1998 19:47:51 -0700 (MST)
From: "Kenneth D. Merry" <ken@plutotech.com>
To: skynyrd@opus.cts.cwu.edu (Chris Timmons)
Cc: asmodai@wxs.nl, freebsd-scsi@FreeBSD.ORG
Subject: Re: Problem with SCSI-bus and high diskaccess?
Message-ID: <199812210247.TAA94760@panzer.plutotech.com>
In-Reply-To: <Pine.BSF.3.96.981220122334.9871A-100000@opus.cts.cwu.edu> from Chris Timmons at "Dec 20, 98 12:34:22 pm"

Chris Timmons wrote...
>
> [moved to -scsi]
>
> I am assuming that you posted to -current because you just cvsupped the
> latest 3.0-current bits and are running that. Else freebsd-scsi would be
> a good list for this kind of problem (be sure to say what version of
> FreeBSD you are running.)

Yep, it's best to post to -scsi with SCSI problems.

> Although I haven't seen this sort of thing with a fireball, it reeks of
> quantum firmware. I have had similar problems with atlas-I and atlas-II
> drives. So you will want to check your firmware revision and see if there
> is something newer out at ftp.quantum.com (upgrading firmware is a fun way
> to waste a day.) You might try www.dejanews.com and search for your drive
> name and firmware revision. Chances are somebody else has already seen
> and documented a similar problem if it is indeed the drives.

Yes, it looks like a firmware issue, most likely.

> There is also a slight possibility that you are using an old enough
> version of FreeBSD and perhaps an integrated aha-2940UW on your mb (or
> perhaps the dreaded rev e(?) of the pci card. In that case, Justin
> committed fixes to the drivers months ago. Update your system.

True, that could be it. From his later message, though, we know that that
isn't the problem.

> I find that IBM and Seagate drives work very well and come with firmware

Yep, they definitely seem to write better firmware.

> On Sun, 20 Dec 1998, Jeroen Ruigrok/Asmodai wrote:
>
> > Hi,
> >
> > I just want some thoughts on this:
> >
> > In the last 24 hours my workstation is been going nuts, whereas it has been
> > running along nicely for weeks before.
> >
> > The things that kept me bugging today were these happy messages:
> >
> > Dec 20 10:51:14 chronias /kernel: Unexpected busfree. LASTPHASE == 0xa0
> > Dec 20 10:51:15 chronias /kernel: SEQADDR == 0x157
> > Dec 20 10:51:15 chronias /kernel: (da1:ahc0:0:1:0): SCB 0x5 - timed out while
> > idle, LASTPHASE == 0x1, SEQADDR == 0xc
> > Dec 20 10:51:15 chronias /kernel: (da1:ahc0:0:1:0): Queuing a BDR SCB
> > Dec 20 10:51:15 chronias /kernel: (da1:ahc0:0:1:0): Bus Device Reset Message Sent
> > Dec 20 10:51:15 chronias /kernel: (da1:ahc0:0:1:0): no longer in timeout, status
> > = 353

This indicates that the drive is having trouble. The timed out while idle
problem generally happens when the drive doesn't respond to a command in
the specified period of time. The default read/write timeout in the da
driver is 60 seconds, which should be way more than enough time for the
drive to respond.

When the drive timed out, we hit it over the head with a BDR to wake it up.
Usually, that will wake the drive up and get things going again.

> > After which page and swap process were running wild.
> >
> > The weird thing is, these HD's (two SCSI Quantum Fireballs) have been checked, the
> > HA (AHA 2940UW) is likewise in good shape and the memory chips have been tested as
> > well... The mainboard is also in good shape and is cooled by a few fans (3) which
> > all work, including the one on the CPU.
> >
> > The circumstances when this happened was when I did a locate.updatedb at the same
> > time as downloading a PDF file to the HD, and then I opened a mailbox at which
> > time the whole time went haywire...
> >
> > Unfortunately I wasn't able to write down the pager messages since they went by
> > with warpspeed 9 and they weren't logged in /var/log/messages =\
> >
> > Does anybody have any ideas what to check to narrow down the problem?

The problem is probably just:

Quantum Firmware + High Load == Drive goes out to lunch.

With Quantum disks, the problem generally happens under high load. Chris
is right, the Atlas I and Atlas II had problems like this as well. The
latest firmware for the Atlas II at least has solved some of the problems,
but not all. I believe the LYK8 Atlas II firmware has mostly solved the
"drive goes out to lunch" problem. It hasn't solved the problem that
causes it to continually return queue full until we have reduced the number
of tags to the minimum. That's why we have Atlas II quirk entries setting
the minimum number of tags to 24.

I know that we (Pluto) have had trouble with the Fireball ST drives in the
past in certain situations. (can't remember exactly what those situations
were) I believe, though, that the '0F0J' firmware worked reasonably well.

Ken
--
Kenneth Merry
ken@plutotech.com

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-scsi" in the body of the message
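
For readers following the queue-full remark in the message above, here is a
rough Python model of the idea, not the actual CAM code. It assumes a driver
that shrinks a device's tag count every time the drive returns QUEUE FULL,
but never below a per-device floor such as the Atlas II quirk minimum of 24
mentioned in the message. The names TagThrottle, on_queue_full and simulate,
the starting depth of 64, the default floor of 2, and the "one less than
what was outstanding" rule are all assumptions made for the sketch.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TagThrottle:
    """Toy model of per-device tagged-queue depth (hypothetical, not CAM)."""
    max_tags: int = 64  # assumed starting depth, for illustration only
    min_tags: int = 2   # assumed default floor; a quirk entry would raise it
    tags: int = field(init=False)

    def __post_init__(self) -> None:
        if self.min_tags > self.max_tags:
            raise ValueError("min_tags must not exceed max_tags")
        self.tags = self.max_tags

    def on_queue_full(self, outstanding: int) -> int:
        """Shrink the tag count after a QUEUE FULL status.

        Assumed policy: drop to one less than the number of commands that
        were outstanding, never below min_tags, and never grow here.
        """
        target = max(self.min_tags, outstanding - 1)
        self.tags = min(self.tags, target)
        return self.tags


def simulate(throttle: TagThrottle, queue_fulls: int) -> List[int]:
    """Model a drive that keeps reporting QUEUE FULL at any queue depth."""
    history = []
    for _ in range(queue_fulls):
        history.append(throttle.on_queue_full(outstanding=throttle.tags))
    return history


if __name__ == "__main__":
    quirked = TagThrottle(max_tags=64, min_tags=24)
    print(simulate(quirked, 100)[-1])
```

Under these assumptions, a drive that never stops reporting queue full walks
the tag count all the way down to whatever the floor is, so raising the
floor in a quirk entry is what keeps a useful number of commands in flight.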
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?199812210247.TAA94760>