Date: Sun, 30 May 2004 21:44:34 +1000 (EST)
From: Bruce Evans
To: Kris Kennaway
cc: current@freebsd.org
Subject: Re: stray irq13 at runtime
In-Reply-To: <20040530043049.GA16224@xor.obsecurity.org>
Message-ID: <20040530155728.S979@gamplex.bde.org>

On Sat, 29 May 2004, Kris Kennaway wrote:

> Since updating the i386 package machines the other day, they've all
> experienced the following:
>
> May 29 21:24:53 gohan28 kernel: stray irq13
>
> irq13: npx0 2 0
> stray irq13 1 0
>
> This is not appearing during boot - those machines have been up for
> hours before the interrupt occurs.

This is probably harmless. There's some bug in APIC mode that causes a
stray irq13 to be delivered earlier on my systems. Perhaps you are
getting this same stray irq13 delivered later. You also have an extra
non-stray irq13.
There should be exactly 1 irq13 delivered ever, except that on 386 and
486SX systems applications can generate any number.

I debugged some of this. APIC mode seems to behave differently because:
(1) the APIC responds much more slowly than the PIC (after 2349 instead
    of 57 iterations in the enclosed debugging code on an Athlon XP2600)
(2) the not-so-new interrupt code broke the hack that prevented getting
    interrupts after bus_teardown_intr(). These are reported as stray
    interrupts. There was a completely different bug (non-atomic update
    of the interrupt name and/or count pointers) which caused non-stray
    npx (and possibly other, but always for npx) interrupts to be
    reported as stray, so the hack hasn't helped for a year or two if
    it ever did.

npx_probe() tests whether exceptions are reported by traps or interrupts
by causing an unmasked exception and checking whether this causes a trap
or interrupt. Normally when there's a trap there is an interrupt too.
Traps occur synchronously, but interrupts occur asynchronously,
especially since we don't synchronize with the FPU^WNPX. We do an fnop
after dividing by 0 to trigger reporting the exception. The NPX and CPU
continue asynchronously. Thus we have a race. The size of the race
window is apparently related to [A]PIC hardware, so it has become large
enough relative to CPU speeds to cause problems on fast CPUs with
high-latency [A]PICs.

OTOH, we can easily synchronize better using fwait instead of fnop. The
reasons for using fnop instead of fwait (only FUD?) don't seem to apply
any more. Changing from fnop to fwait gets the interrupt delivered after
49 iterations instead of 2349 (in the enclosed debugging code). This is
still much longer than I'd like. 49 iterations is still over 100 cycles,
and there are hundreds more cycles between the fwait and the delivery of
the irq13 for trap and interrupt handling.
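
The latency measurement discussed above can be illustrated with a small
userspace sketch. This is not the kernel code: struct fake_pic,
fake_intr_seen() and poll_for_intr() are hypothetical names, and the
"interrupt" is simulated as a flag that only becomes visible after a
fixed number of polls. It only shows why a bounded poll reports a
different iteration count for a slow and a fast controller, and why a
window that is too short misses the interrupt altogether.

```c
/*
 * Hypothetical stand-in for the interrupt controller: the simulated
 * interrupt becomes visible only after `latency` polls. Real latency
 * depends on the PIC/APIC and the CPU; nothing here touches hardware.
 */
struct fake_pic {
    int latency;    /* polls that must elapse before the interrupt is seen */
    int polls;      /* polls made so far */
};

/* Returns nonzero once the simulated interrupt has been delivered. */
static int
fake_intr_seen(struct fake_pic *pic)
{
    return (pic->polls++ >= pic->latency);
}

/*
 * Poll at most max_iter times. Returns the number of iterations that
 * completed before the interrupt was seen, or -1 if it did not arrive
 * inside the window (the case that would later show up as a stray).
 */
static int
poll_for_intr(struct fake_pic *pic, int max_iter)
{
    int i;

    for (i = 0; i < max_iter; i++)
        if (fake_intr_seen(pic))
            return (i);
    return (-1);
}
```

With latency set to 57 or 2349 and a window of 10000000 polls, the
function returns 57 or 2349 respectively; with a window of only 100
polls and a latency of 2349 it returns -1, which corresponds to the
case where nothing waits long enough for the interrupt.
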

Something must wait for irq13 delivery so that irq13's don't get seen by
the wrong thread (if they are used at all), but other parts of npx.c
don't even know if they might have to wait.

Fixes and debugging code:

% Index: npx.c
% ===================================================================
% RCS file: /home/ncvs/src/sys/i386/isa/npx.c,v
% retrieving revision 1.148
% diff -u -2 -r1.148 npx.c
% --- npx.c 11 May 2004 20:14:53 -0000 1.148
% +++ npx.c 30 May 2004 10:39:19 -0000
% @@ -105,5 +105,5 @@
%  #define fnstcw(addr) __asm __volatile("fnstcw %0" : "=m" (*(addr)))
%  #define fnstsw(addr) __asm __volatile("fnstsw %0" : "=m" (*(addr)))
% -#define fp_divide_by_0() __asm("fldz; fld1; fdiv %st,%st(1); fnop")
% +#define fp_divide_by_0() __asm("fldz; fld1; fdiv %st,%st(1); fwait")
%  #define frstor(addr) __asm("frstor %0" : : "m" (*(addr)))
%  #ifdef CPU_ENABLE_SSE

This changes from fnop to fwait, to synchronize better. See above.

% @@ -369,4 +369,19 @@
%          npx_traps_while_probing = npx_intrs_while_probing = 0;
%          fp_divide_by_0();
% +#ifdef DEBUG
% +        {
% +                int i;
% +
% +                for (i = 0; i < 10000000; i++)
% +                        if (npx_intrs_while_probing != 0) {
% +                                device_printf(dev,
% +                                    "saw intr after %d iterations\n",
% +                                    i);
% +                                break;
% +                        }
% +        }

This determines latency of irq13 delivery.

% +#else
% +        DELAY(1000); /* wait for any IRQ13 */
% +#endif

Waiting this long should always work.

%          if (npx_traps_while_probing != 0) {
%                  /*
% @@ -407,4 +422,5 @@
%          bus_teardown_intr(dev, irq_res, irq_cookie);
%
% +#if 0
%          /*
%           * XXX hack around brokenness of bus_teardown_intr(). If we left the
% @@ -417,4 +433,5 @@
%                  isrc->is_pic->pic_disable_source(isrc);
%          }
% +#endif

bus_teardown_intr() still doesn't disable the interrupt, at least in the
edge-triggered case, but neither does this hack (in either the PIC or
APIC case), since isrc->is_pic->pic_disable_source() is a no-op for
edge-triggered interrupts and irq13 is normally edge-triggered.

%
%          bus_release_resource(dev, SYS_RES_IRQ, irq_rid, irq_res);

I haven't figured out why the APIC case normally delivers both a normal
(fast) interrupt and stray interrupt when we don't wait for the one
interrupt that actually occurs. One is counted as stray because it
occurs after the bus_teardown_intr(), but both of them seem to occur
after that. So there seems to be a race or double counting somewhere.

Bruce