Date: Thu, 23 Aug 2001 21:34:28 +0100
From: Ian Dowse <iedowse@maths.tcd.ie>
To: Warner Losh <imp@harmony.village.org>
Cc: simond@irrelevant.org, Andre Albsmeier <andre.albsmeier@mchp.siemens.de>, walter@pelissero.org, John Baldwin <jhb@FreeBSD.ORG>, net@FreeBSD.ORG, hackers@FreeBSD.ORG
Subject: Serious i386 interrupt mask bug in RELENG_4 (was Re: 4.4-RC NFS panic)
Message-ID: <200108232134.aa49928@salmon.maths.tcd.ie>
In-Reply-To: Your message of "Wed, 22 Aug 2001 20:28:52 MDT." <200108230228.f7N2SqW80434@harmony.village.org>

In message <200108230228.f7N2SqW80434@harmony.village.org>, Warner Losh writes:
>
>I think that might be due to a bug in the shared interrupt code that
>Ian Dowse sent me about earlier today.

Just to add a few details - there is a bug in the update_masks()
function in i386/isa/intr_machdep.c that can cause some interrupts
to occur at times when they should be masked. The problem only
occurs with certain configurations of shared interrupts and devices,
and this code is only present in RELENG_4.

The update_masks() function is called after an interrupt handler
has been registered or removed. Its main function is to update the
interrupt masks (tty_imask, net_imask etc) if necessary (e.g. if
IRQ11 is registered by a tty-type device, IRQ11 will be added to
tty_imask so that future spltty()'s will mask IRQ11).

A second function of update_masks() is to update the cached copy
of the interrupt mask stored with each handler for a multiplexed
interrupt. This is done via the call to update_mux_masks().

The bug is that update_masks() returns without calling
update_mux_masks() in some cases where it should call it.
Specifically, if a newly-added multiplexed interrupt handler has
the same maskptr as another handler on the same IRQ line, that new
handler doesn't get its cached mask set. For example if a single
IRQ has a usb device and a modem (tty), the second device to register
its handler will get its idesc->mask set to 0 instead of the value
of tty_imask because update_mux_masks() may never be called to set
it. Of course, if update_masks() is called later for some other
device it may correct the situation.

Interrupt handlers are called with intr_mask[irq] or'd into the cpl
to block further interrupts; for non-multiplexed interrupts
intr_mask[irq] will be set from one of the *_imask masks. However
with multiplexed interrupts, only the IRQ itself (and SWI_CLOCK_MASK)
are blocked, and the multiplex handler intr_mux() needs to raise
the cpl further when necessary.
It uses idesc->mask to control this. When this bug occurs,
idesc->mask == 0, so the device interrupt handler gets called with
only the IRQ and SWI_CLOCK_MASK masked, instead of the full *_imask
that it requested. Not good.

On my laptop, this bug causes hangs within minutes of starting to
use a pccard modem, but as should be apparent from the above it
could strike virtually anywhere that multiplexed interrupts are
used. The patch below seems to solve the problem; it just causes
update_masks() to unconditionally update the masks.

Ian

Index: intr_machdep.c
===================================================================
RCS file: /home/iedowse/CVS/src/sys/i386/isa/intr_machdep.c,v
retrieving revision 1.29.2.2
diff -u -r1.29.2.2 intr_machdep.c
--- intr_machdep.c	2000/08/16 05:35:34	1.29.2.2
+++ intr_machdep.c	2001/08/23 20:24:17
@@ -651,15 +651,9 @@
 
 	if (find_idesc(maskptr, irq) == NULL) {
 		/* no reference to this maskptr was found in this irq's chain */
-		if ((*maskptr & mask) == 0)
-			return;
-		/* the irq was included in the classes mask, remove it */
 		*maskptr &= ~mask;
 	} else {
 		/* a reference to this maskptr was found in this irq's chain */
-		if ((*maskptr & mask) != 0)
-			return;
-		/* put the irq into the classes mask */
 		*maskptr |= mask;
 	}
 	/* we need to update all values in the intr_mask[irq] array */
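
For readers who want to see the failure mode in isolation, here is a
rough user-space sketch of it. It is not the kernel code: the struct
layout, the helper names (chain_uses, refresh_cached_masks,
update_masks_model, second_handler_mask) and the early_return flag are
all hypothetical, chosen only to mirror the description above. Two
handlers that share one class mask are registered on the same IRQ, and
the function reports the cached mask the second handler ends up with.

```c
#define MAX_HANDLERS 4

/* Hypothetical stand-in for one handler on a shared IRQ chain. */
struct handler {
    unsigned *maskptr;  /* class mask the handler belongs to */
    unsigned  mask;     /* cached copy consulted when raising the cpl */
};

struct irq_chain {
    struct handler h[MAX_HANDLERS];
    int count;
};

/* Does any handler on this chain reference the given class mask? */
static int chain_uses(const struct irq_chain *c, const unsigned *maskptr)
{
    for (int i = 0; i < c->count; i++)
        if (c->h[i].maskptr == maskptr)
            return 1;
    return 0;
}

/* Model of the "second function": refresh every cached mask. */
static void refresh_cached_masks(struct irq_chain *c)
{
    for (int i = 0; i < c->count; i++)
        c->h[i].mask = *c->h[i].maskptr;
}

/*
 * Simplified model of the mask update. With early_return != 0 it
 * returns as soon as the class mask already has the right value,
 * which skips the refresh; with early_return == 0 it always
 * refreshes, as the patch does.
 */
static void update_masks_model(struct irq_chain *c, unsigned *maskptr,
                               int irq, int early_return)
{
    unsigned bit = 1u << irq;

    if (!chain_uses(c, maskptr)) {
        if (early_return && (*maskptr & bit) == 0)
            return;
        *maskptr &= ~bit;
    } else {
        if (early_return && (*maskptr & bit) != 0)
            return;
        *maskptr |= bit;
    }
    refresh_cached_masks(c);
}

/* Register two handlers sharing one class mask on IRQ 11 and
 * return the cached mask of the second one. */
unsigned second_handler_mask(int early_return)
{
    unsigned class_mask = 0;
    struct irq_chain chain = { .count = 0 };
    const int irq = 11;

    for (int i = 0; i < 2; i++) {
        chain.h[chain.count].maskptr = &class_mask;
        chain.h[chain.count].mask = 0;  /* not yet cached */
        chain.count++;
        update_masks_model(&chain, &class_mask, irq, early_return);
    }
    return chain.h[1].mask;
}
```

In this model second_handler_mask(1) leaves the second handler with a
cached mask of 0, while second_handler_mask(0) gives it the class mask
with the IRQ 11 bit set.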