From owner-freebsd-i386@FreeBSD.ORG Mon Aug 29 11:59:07 2005 Return-Path: X-Original-To: freebsd-i386@freebsd.org Delivered-To: freebsd-i386@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 49FD116A41F; Mon, 29 Aug 2005 11:59:07 +0000 (GMT) (envelope-from bde@zeta.org.au) Received: from mailout1.pacific.net.au (mailout1.pacific.net.au [61.8.0.84]) by mx1.FreeBSD.org (Postfix) with ESMTP id 6CF2A43D46; Mon, 29 Aug 2005 11:59:06 +0000 (GMT) (envelope-from bde@zeta.org.au) Received: from mailproxy2.pacific.net.au (mailproxy2.pacific.net.au [61.8.0.87]) by mailout1.pacific.net.au (8.13.4/8.13.4/Debian-3) with ESMTP id j7TBwZAD014848; Mon, 29 Aug 2005 21:58:35 +1000 Received: from epsplex.bde.org (katana.zip.com.au [61.8.7.246]) by mailproxy2.pacific.net.au (8.13.4/8.13.4/Debian-3) with ESMTP id j7TBwW0g010677; Mon, 29 Aug 2005 21:58:33 +1000 Date: Mon, 29 Aug 2005 21:58:32 +1000 (EST) From: Bruce Evans X-X-Sender: bde@epsplex.bde.org To: Chuck Ebbert <76306.1226@compuserve.com> In-Reply-To: <200508281759.j7SHx6Yo072283@www.freebsd.org> Message-ID: <20050829202410.Q1317@epsplex.bde.org> References: <200508281759.j7SHx6Yo072283@www.freebsd.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-gnats-submit@freebsd.org, freebsd-i386@freebsd.org Subject: Re: i386/85417: Possible bug in ia32 floating-point exception handler X-BeenThere: freebsd-i386@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: I386-specific issues for FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 29 Aug 2005 11:59:07 -0000 On Sun, 28 Aug 2005, Chuck Ebbert wrote: >> Description: > src/sys/i386/i386/trap.c 1.277.2.1 lines 303-308: > 303: case T_ARITHTRAP: /* arithmetic trap */ > 304: #ifdef DEV_NPX > 305: ucode = npxtrap(); > 306: if (ucode == -1) > 307: goto userout; > 308: #else > > src/sys/i386/isa/npx.c:npxtrap() > can never return -1, so SIGFPE code 0 will be sent to the > user app if no unmasked i387 exception bits are set. This > can happen on some non-Intel processors. Does it happen often or normally for some non-Intel processors? I don't have any processors with the problem, but noticed it somehow, perhaps in connection with the kernel bugs that resulted in the one IRQ13 from npx_probe() not being discarded. This IRQ13 normally occurs for all processors and is supposed to be ignored, but was counted as a /stray/spurious IRQ13. I vaguely remember that it didn't cause a spurious npxtrap() but I saw the bug by wondering what would happen if it did. >> How-To-Repeat: > >> Fix: > > (1) If the current behavior is correct, remove the if statement > at line 306 since it has no effect and only confuses reviewers. > > (2) Otherwise, change fpetable[] in src/sys/i386/isa/npx.c > so entry 0 is -1 instead of 0 I have used (2) (and some other related changes) for many years, but don't remember having noticed that the check in trap() is so broken. I now think it should have been "if (ucode == 0)". But -1 is better for a special value since ucode is not a bitmap. >From my version of npx.c: % Index: npx.c % =================================================================== % RCS file: /home/ncvs/src/sys/i386/isa/npx.c,v % retrieving revision 1.152 % diff -u -2 -r1.152 npx.c % --- npx.c 19 Jun 2004 22:24:16 -0000 1.152 % +++ npx.c 15 Oct 2004 15:38:14 -0000 % @@ -578,5 +612,5 @@ % */ % static char fpetable[128] = { % - 0, % + -1, /* 0 - no unmasked exception (probably bogus IRQ13) */ % FPE_FLTINV, /* 1 - INV */ % FPE_FLTUND, /* 2 - DNML */ This is (2). % @@ -642,5 +676,5 @@ % FPE_FLTDIV, /* 3E - DNML | DZ | OFL | UFL | IMP */ % FPE_FLTINV, /* 3F - INV | DNML | DZ | OFL | UFL | IMP */ % - FPE_FLTSUB, /* 40 - STK */ % + -1, /* 40 - STK, but no unmasked exception so no trap */ % FPE_FLTSUB, /* 41 - INV | STK */ % FPE_FLTUND, /* 42 - DNML | STK */ We also shouldn't signal stack-only exceptions to userland, since there is no way to mask these exceptions. Normally there is no problem since npx traps shouldn't occur when all maskable exceptions are masked, but if an npx trap somehow occurs it may say the STK bit set because there is no mask for this bit; thus a spurious npx trap signaled userland with a bogus code of FPE_FLTSUB instead a bogus code of 0 depending on whether STK is set. % @@ -751,7 +794,16 @@ % } % % + /* Ignore some spurious traps. */ % + if ((status & ~control & 0x3f) == 0) { % + intr_restore(saveintr); % + return (-1); % + } We also shouldn't do normal trap processing for spurious traps. This change makes the above changes just style fixes since it makes fpetable not used for entries 0 and 3F. % + % +#if 0 % + /* XXX this clobbers the status. */ % if (PCPU_GET(fpcurthread) == curthread) % fnclex(); This is an incomplete fix for clobbering of the npx status word for normal traps, and happens to make the previous change unnecessary since it results in "normal trap processing" not changing anything in the npx state. The exception status word used to be saved separately and gdb used to display it, but both the saving and the display have been lost. New signal handlers now get a clean npx state, so the hack of clearing the exceptions shouldn't be needed, but when the signal handling was fixed, only the infrastructure for the hack (i.e., preserving the exception status word) was removed. Old (FreeBSD-[1-3] and FreeBSD-4 compatible) signal handlers still need the hack. % - intr_restore(savecrit); % +#endif % + intr_restore(saveintr); This churns the name of the "save" variable back to one that matches the code. % return (fpetable[status & ((~control & 0x3f) | 0x40)]); % } Bruce