Date: Thu, 04 Jun 1998 12:12:30 -0400 (EDT)
From: Simon Shapiro <shimon@simon-shapiro.org>
To: Bob Willcox <bob@luke.pmr.com>
Cc: Michael Hancock <michaelh@cet.co.jp>, "freebsd-current@freebsd.org" <freebsd-current@FreeBSD.ORG>, tcobb <tcobb@staff.circle.net>, Karl Pielorz <kpielorz@tdx.co.uk>, Mike Smith <mike@smith.net.au>, Greg Lehey <grog@lemis.com>
Subject: Re: DPT driver fails and panics with Degraded Array
Message-ID: <XFMail.980604121230.shimon@simon-shapiro.org>
In-Reply-To: <19980603073200.A16652@pmr.com>

On 03-Jun-98 Bob Willcox wrote:
...
>> Why would a driver call biodone on a buffer that doesn't belong to it?
>
> Probably not relevant, but in the DPT device driver that I wrote for AIX
> I had to put some pretty ugly validity checks in the interrupt code to
> prevent my driver from trying to do an iodone (AIX's version of biodone)
> on already completed (or purged, I don't remember for sure...it's been
> over a year now) commands. Seems that the DPT firmware would (on
> occasion) interrupt with a status packet that pointed to a ccb that my
> driver had already completed. As I recall this would only happen under
> heavy load and it was pretty intermittent. As far as I know, it was
> never actually fixed.

The FreeBSD driver does exactly that. I encountered exactly that
situation in earlier firmware revisions (7H1 or so). I put more defenses
in the driver than necessary. Later revisions of the firmware (7L0 or so)
took care of the problem, but the defensive code stayed, as #ifdefs.

Many of these problems are actually (arguably?) induced by timing problems
on the PCI bus. Certain PCI-PCI bridges (or even motherboard ``main''
chipsets) will deliver interrupts, I/O bus transactions and memory
transactions out of order when hammered very rapidly, under heavy load,
or both. We proved it clearly with certain ``industrial'' computers, and
certain motherboards, by making the symptoms go away (or drastically
change) as you move the DPT, video cards, Ethernet cards, etc. from slot
to slot.

If one is really paranoid, one can enable DPT_VERIFY_HINTR to get this
code back. Even more severe cases of paranoia can be satisfied by
enabling DPT_HANDLE_TIMEOUTS. For those who are as sick as I am, you can
define DPT_INTR_DELAY as some small integer.

What these do is, in the order listed:

DPT_VERIFY_HINTR:     Mark and stamp each CCB so as to guarantee that it
                      is not handled twice.

DPT_HANDLE_TIMEOUTS:  Turn on an elaborate mechanism that tracks
                      transactions (CCBs) that seem to linger on beyond
                      their useful life.

DPT_INTR_DELAY:       Cause the interrupt service routine to spin a
                      little bit, giving the hardware a chance to settle
                      before dpt_intr gets all excited about it.

Simon

---

Sincerely Yours, Simon Shapiro
Shimon@Simon-Shapiro.ORG 770.265.7340

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-current" in the body of the message
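
For readers who want to picture what "mark and stamp each CCB" might look like, here is a minimal sketch in C. It is not the dpt driver's code: the struct, the field names, and the sketch_submit/sketch_complete functions are all hypothetical, and it assumes the adapter echoes back a per-submission stamp in its status packet, which the message above does not state. It only illustrates the general idea of refusing to complete a CCB twice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical CCB life-cycle states; the real driver's names differ. */
enum sketch_ccb_state { CCB_FREE = 0, CCB_ACTIVE, CCB_DONE };

struct sketch_ccb {
    enum sketch_ccb_state state; /* the "mark": where the CCB is in its life */
    uint32_t stamp;              /* the "stamp": new value on every submission */
    int completions;             /* how many times completion actually ran */
};

/* Monotonic stamp source (a real driver would protect this with a lock). */
static uint32_t sketch_next_stamp = 1;

/*
 * Submit a CCB: mark it active and give it a fresh stamp.
 * Returns the stamp that the status packet is assumed to carry back.
 */
static uint32_t sketch_submit(struct sketch_ccb *ccb)
{
    assert(ccb != NULL);
    ccb->state = CCB_ACTIVE;
    ccb->stamp = sketch_next_stamp++;
    return ccb->stamp;
}

/*
 * Interrupt-time completion. Returns 1 if the CCB was completed,
 * 0 if the status packet was stale or a duplicate and was ignored.
 */
static int sketch_complete(struct sketch_ccb *ccb, uint32_t reported_stamp)
{
    assert(ccb != NULL);
    if (ccb->state != CCB_ACTIVE || ccb->stamp != reported_stamp)
        return 0;               /* already completed, or from an old submission */
    ccb->state = CCB_DONE;      /* mark first, so a repeat interrupt is rejected */
    ccb->completions++;         /* stands in for the single biodone() call */
    return 1;
}
```

With this scheme a second status packet for the same submission is dropped because the state is no longer CCB_ACTIVE, and a late packet from an earlier submission of a reused CCB is dropped because its stamp no longer matches.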