Date: Wed, 25 Feb 2004 14:50:19 +1100 (EST)
From: Bruce Evans <bde@zeta.org.au>
To: Julien Gabel <jpeg@thilelli.net>
Cc: freebsd-current@freebsd.org
Subject: Re: Stray irq7.
Message-ID: <20040225115747.O9312@gamplex.bde.org>
In-Reply-To: <53996.192.168.0.97.1077661722.squirrel@webmail.thilelli.net>
References: <20040222185325.GA97979@cserv62.csub.edu> <53996.192.168.0.97.1077661722.squirrel@webmail.thilelli.net>
On Tue, 24 Feb 2004, Julien Gabel wrote:

> >> Getting this message at boot, with yesterday's CURRENT, after disk
> >> detection.
> >> stray irq7
> >> ...
> >> too many stray irq 7's: not logging anymore
>
> > This is most likely either a symptom of the brokenness of the
> > x86's ISA controller or you've disabled the parallel port driver.
> > If all the hardware you care about works, you can ignore this
> > message.

Er, you mean the correctness of the x86's ISA controller (it reports
problems if it detects them).  But this is out of date.  Stray irqs are
now all due to software bugs; glitches in hardware interrupts are now
mishandled as follows:

case 1: no ithread for irq7/15
    Then the glitches are detected and silently ignored.  Even counting
    of them is broken.  The detection is a relatively new feature in
    -current, but the mishandling became worse with the detection.  The
    correct handling is to broadcast interrupts for hardware glitches to
    all interrupt handlers (except ones like clkintr() that can't handle
    interrupts which are not for them).

case 2: ithread for irq7/15 with no handlers
    Then the glitches are not detected and are bogusly reported as stray
    interrupts.  But most such reports are probably due to software bugs
    causing normal interrupts.  The existence of this case is a software
    bug.  At least before the relatively recent interrupt handling
    changes, the lpt driver caused normal interrupts that are reported
    as stray ones, as a result of the following 3 bugs:
    (a) lpt tears down and sets up its interrupt handler (in this order
        IIRC) for every write.
    (b) lpt doesn't wait for previous interrupts to arrive before
        tearing down the handler.
    (c) Step (a) is potentially very costly, since it should cause the
        ithread to go away if it has no other handlers, which is the
        usual case for lpt (I think the ppbus level should hang on to
        the ithread, but it apparently doesn't).  However, the ithread
        stays around and its interrupt remains unmasked.
    Sometimes an interrupt for (b) arrives in the window between
    teardown and setup in (a).  Such interrupts are reported as "stray".

case 3: ithread for irq7/15 with at least one handler
    Then the glitches are not detected and stray interrupts are sent to
    the handler(s).  The handlers should ignore them.  Most handlers
    have no problems ignoring interrupts that are not for them, since
    they need to do this anyway for shared interrupts.

This is for -current.  -stable is simpler and less buggy.

> Just 'for memory' there is a FAQ entry for that, but the question
> was already well answered :)
>
> http://www.freebsd.org/doc/en_US.ISO8859-1/books/faq/\
> troubleshoot.html#STRAY-IRQ

This was never quite right, and is very out of date for -current:

% 5.23. What does ``stray IRQ'' mean?
%
% Stray IRQs are indications of hardware IRQ glitches, mostly from
% hardware that removes its interrupt request in the middle of the
% interrupt request acknowledge cycle.

Actually, in -stable they are indications of hardware IRQ glitches and
software bugs.  In -current, they mostly indicate software bugs (they
only indicate a hardware irq glitch if losing races like the one in
Case 2 coincide with a hardware irq glitch).  Also, until relatively
recently in -current, there was a race setting up interrupts (on i386's
but not on alphas at least) which caused a normal interrupt that was
present at interrupt setup time to be at least recorded as a stray one
(IIRC, it was correctly sent to the handler but misrecorded because the
race was only in setting up the interrupt name and counter).  Old ISA
devices with tri-state line drivers tend to always cause such
interrupts.  Thus there was almost always a stray irq6.

%
% One has three options for dealing with this:
% * Live with the warnings. All except the first 5 per irq are
%   suppressed anyway.

This is still correct :-).

% * Break the warnings by changing 5 to 0 in isa_strayintr() so that
%   all the warnings are suppressed.
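
A minimal sketch of the suppression this FAQ item describes, assuming a
per-irq counter compared against a small limit.  The names stray_count
and report_stray() and the return-value convention are hypothetical;
the message strings follow the boot output quoted at the top of the
thread, and the real kernel code differs in detail.

```c
#include <stdio.h>

#define NIRQ          16
#define MAX_STRAY_LOG 5		/* the "5" the FAQ refers to */

/* Hypothetical per-irq stray counters (not the kernel's data structure). */
static unsigned long stray_count[NIRQ];

/*
 * Count a stray interrupt on `irq' and log it unless the limit has been
 * passed.  Returns the number of lines logged by this call (0, 1 or 2).
 */
static int
report_stray(int irq)
{
	if (irq < 0 || irq >= NIRQ)
		return (0);
	stray_count[irq]++;
	if (stray_count[irq] > MAX_STRAY_LOG)
		return (0);		/* suppressed */
	printf("stray irq%d\n", irq);
	if (stray_count[irq] == MAX_STRAY_LOG) {
		printf("too many stray irq %d's: not logging anymore\n", irq);
		return (2);
	}
	return (1);
}
```

With MAX_STRAY_LOG defined as 0, report_stray() never prints anything,
which is the effect the FAQ item is after.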
In -current, there is no such function as isa_strayintr().  Until
relatively recently, it existed but was only used in the unusual case
that there is no ithread (previous version of Case 1).  I already knew
too much about this bug suite, but learned more investigating why
isa_strayintr() was almost never called :-).  Most reports of stray
interrupts came from sched_ithd().  The reports are now centralized in
intr_execute_handlers().  Some other bugs were fixed and introduced by
merging the reporting:
- "5" was spelled "MAX_STRAY_LOG" in sched_ithd().  That is now the only
  spelling.
- there were separate sets of counters for the 2 reporting routines.
  You had to change "5" in both places.
- there were races incrementing the separate counters in the SMP case
  in sched_ithd().
- sched_ithd() used printf() but everything else uses log().  log() is
  better, but it is even less safe to call in (effectively) fast
  interrupt handler context than is printf().  -stable may have this
  bug in a different form -- "stray" interrupts may be missing interrupt
  masking.  The nmi handler has it since it _is_ missing interrupt
  masking.
- isa_strayintr() has better worded messages than sched_ithd().
  intr_execute_handlers() is in between.

% * Break the warnings by installing parallel port hardware that uses
%   irq 7 and the PPP driver for it (this happens on most systems),

lpt?  ppbus?

%   and install an ide drive or other hardware that uses irq 15 and a
%   suitable driver for it.

Using irq 15 for ata1 probably happens on most systems now.  Using lpt
to eat irq7's doesn't break the warning so well now, since lpt causes
the warning (perhaps since it was new-bused, or at least since current
was i-threaded).  Also, eating the warnings in lpt depends on its
(mis)implementation details.  ppbus wants to multiplex the irq between
different drivers.
I think it does this by leaving the irq attached and switching it
around as required (actually more than required), but it should leave
the irq unattached and attach it as required.

Bruce
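
The three cases described earlier in this message can be modelled
roughly as follows.  This is only an illustrative sketch: struct
irq_source, dispatch(), struct dev and dev_intr() are hypothetical
names, not the kernel's, and the model only classifies what happens to
an interrupt on irq7/15 given whether an ithread and handlers exist.

```c
#include <stddef.h>

#define MAX_HANDLERS 4

/* A handler returns 1 if the interrupt was for its device, else 0. */
typedef int (*handler_fn)(void *arg);

struct irq_source {		/* hypothetical model of one irq line */
	int        has_ithread;
	size_t     nhandlers;
	handler_fn handlers[MAX_HANDLERS];
	void      *args[MAX_HANDLERS];
};

enum outcome { IGNORED_SILENTLY, REPORTED_STRAY, SENT_TO_HANDLERS };

static enum outcome
dispatch(struct irq_source *src)
{
	size_t i;

	if (!src->has_ithread)
		return (IGNORED_SILENTLY);	/* case 1 */
	if (src->nhandlers == 0)
		return (REPORTED_STRAY);	/* case 2 */
	for (i = 0; i < src->nhandlers; i++)	/* case 3 */
		(void)src->handlers[i](src->args[i]);
	return (SENT_TO_HANDLERS);
}

/* Hypothetical device whose handler tolerates interrupts not meant for it. */
struct dev {
	int pending;
	int serviced;
};

static int
dev_intr(void *arg)
{
	struct dev *d = arg;

	if (!d->pending)
		return (0);	/* not ours (glitch or shared irq): ignore */
	d->pending = 0;
	d->serviced++;
	return (1);
}
```

In this model, the lpt behaviour from case 2 corresponds to setting
nhandlers to 0 while a request is still in flight: the next dispatch()
then returns REPORTED_STRAY even though the interrupt was a normal one.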