From owner-freebsd-stable Tue May 26 18:52:38 1998
Return-Path:
Received: (from majordom@localhost)
          by hub.freebsd.org (8.8.8/8.8.8) id SAA12321
          for freebsd-stable-outgoing; Tue, 26 May 1998 18:52:38 -0700 (PDT)
          (envelope-from owner-freebsd-stable@FreeBSD.ORG)
Received: from dingo.cdrom.com (dingo.cdrom.com [204.216.28.145])
          by hub.freebsd.org (8.8.8/8.8.8) with ESMTP id SAA12212
          for ; Tue, 26 May 1998 18:52:22 -0700 (PDT)
          (envelope-from mike@dingo.cdrom.com)
Received: from dingo.cdrom.com (localhost [127.0.0.1])
          by dingo.cdrom.com (8.8.8/8.8.5) with ESMTP id RAA02455;
          Tue, 26 May 1998 17:46:05 -0700 (PDT)
Message-Id: <199805270046.RAA02455@dingo.cdrom.com>
X-Mailer: exmh version 2.0zeta 7/24/97
To: Michael Robinson
cc: mike@smith.net.au, nate@mt.sri.com, freebsd-stable@FreeBSD.ORG
Subject: Re: Bug in wd driver
In-reply-to: Your message of "Wed, 27 May 1998 09:26:18 +0800."
             <199805270126.JAA20637@public.bta.net.cn>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Tue, 26 May 1998 17:46:05 -0700
From: Mike Smith
Sender: owner-freebsd-stable@FreeBSD.ORG
Precedence: bulk

> Mike Smith writes:
> >I'm sorry Nate, but if it was a bad spot the error register would be
> >nonzero.  Please check the originally quoted diagnostic for the actual
> >status/error register values.  Also note that DRQ would not be set if
> >the timeout had occurred too soon.
>
> This is from the originally quoted diagnostic:
>
>   wd0: interrupt timeout
>   wd0: status 50 error 1
>
> I am not an expert, but it looks to me like a nonzero error code.

ERR is not set in the status register.  As far as I can tell, ATA4
(T13/1153D revision 16, which is the only reference I have to hand)
clause 7.15.6.6 says "When the ERR bit is cleared to zero at the end of
a command: a) the content of the error register shall be ignored by the
host.".  There are other conditions that could cause ERR to be cleared,
and there are other anomalies.
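
For readers following the register arithmetic, here is a minimal sketch of how the status values in this thread (0x50 and 0x58) break down into bits. It is not taken from the wd driver; the function names decode_status and error_register_valid are hypothetical. The bit names follow the classic ATA status register layout (bits 2 and 1, CORR and IDX, are marked obsolete in later revisions of the standard).

```python
# Illustrative decoder for the ATA status register.
# Assumption: classic bit layout, bit 7 (BSY) down to bit 0 (ERR).
# The names below are for illustration only, not the wd driver's macros.
ATA_STATUS_BITS = [
    (0x80, "BSY"),   # device busy
    (0x40, "DRDY"),  # device ready to accept a command
    (0x20, "DF"),    # device fault (write fault in older revisions)
    (0x10, "DSC"),   # seek complete
    (0x08, "DRQ"),   # data request: data ready to be transferred
    (0x04, "CORR"),  # corrected data (obsolete in later revisions)
    (0x02, "IDX"),   # index (obsolete in later revisions)
    (0x01, "ERR"),   # error: the error register is meaningful
]


def decode_status(status):
    """Return the names of the bits set in an 8-bit status value."""
    return [name for mask, name in ATA_STATUS_BITS if status & mask]


def error_register_valid(status):
    """The error register only matters to the host when ERR is set."""
    return bool(status & 0x01)


if __name__ == "__main__":
    # (status, error) pairs quoted in the thread.
    for status, error in ((0x50, 0x01), (0x58, 0x00)):
        names = "|".join(decode_status(status))
        if error_register_valid(status):
            note = "error register 0x%02x is valid" % error
        else:
            note = "error register 0x%02x should be ignored" % error
        print("status 0x%02x = %s; %s" % (status, names, note))
```

With this layout, 0x50 is DRDY plus DSC with ERR clear, so the "error 1" value is not meaningful to the host, and 0x58 is the same with DRQ added.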

The error report I was referring to when I was studying the reference
earlier gave the status as 0x58 and error as zero.  This indicates data
ready to be transferred and no error.  0x50 indicates no data, ready
for a command, and no error.

> >Actually, it eventually gives up.  (Check the source if you don't
> >believe me.)
>
> I looked at the source, and it gives up after five errors.  Unfortunately,
> the driver only gets to the second retry before it wedges itself.

I was actually referring to Nate's problem here.  He threw in a
relatively unrelated situation where he had a normal error in a
critical disk region.

> As for unwedging itself, this seems to be pretty suspicious:

Yes.  If the disk fails to interrupt, and continues to fail to
interrupt, the caller will remain wedged.  This is unarguably a defect,
and if you believe you have reason to want to rework this, please talk
with Soren (sos@freebsd.org) so that you can coordinate your efforts.

--
\\  Sometimes you're ahead,       \\  Mike Smith
\\  sometimes you're behind.      \\  mike@smith.net.au
\\  The race is long, and in the  \\  msmith@freebsd.org
\\  end it's only with yourself.  \\  msmith@cdrom.com