Date:      Sat, 19 Jun 2004 15:59:55 +1000 (EST)
From:      Bruce Evans <bde@zeta.org.au>
To:        Simon Barner <barner@in.tum.de>
Cc:        current@freebsd.org
Subject:   Re: Bogus signal handler causes kernel panic (5.2.1-p8/i386)
Message-ID:  <20040619152924.F3372@gamplex.bde.org>
In-Reply-To: <20040618134944.GC1049@zi025.glhnet.mhn.de>
References:  <20040616105706.GC1140@zi025.glhnet.mhn.de> <20040617134101.V1345@gamplex.bde.org> <20040618134944.GC1049@zi025.glhnet.mhn.de>

On Fri, 18 Jun 2004, Simon Barner wrote:

> first of all thanks a lot for your comprehensive replies. I tried both of
> your patches with the following results:
>
> - patch 1 ("the quick & dirty one"): The panic is gone, the program is in
>   its infinite loop printing lots of '.'s and some '*'s, without any
>   recognizable pattern and consumes 100% cpu, but nothing bad happens.
>
> - patch 2 ("the not so quick one"): My system still panics (stack trace
>   attached).

I haven't searched for the cause of this yet.  Apparently there is another
path to fpurstor() that leaves exceptions pending.

>   Additionally, I see the following messages on my console (e.g. when I
>   run `script' (but only as root IIRC, I can examine this further if you
>   need this information):
>
> Jun 18 14:56:09 zi025 kernel: kernel trap 22 with interrupts disabled
> Jun 18 14:56:09 zi025 kernel: npxdna: fpcurthread == curthread 1 times
>                                                               ^^^
>                                             this counter is increasing

I found a problem with my patch and think it is the same one that
causes this npxdna message.  On thinking about it, it also opens yet
another path to fpurstor() that leaves exceptions pending (one triggered
by the fnclex to clear the exceptions).  The patch depends on fpcurthread
!= NULL implying that npx accesses won't cause an npxdna, but
exec_setregs() does some foot-shooting so that this is not true when
npxdrop() is called via fpstate_drop() from exec_setregs().  Things
usually work, but there is a race from enabling interrupts for the
npxdna trap, and if there is an unmasked pending exception then npxdna()
will trap fatally when it attempts to restore the npx state so that
fnclex can defuse it.
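
A toy user-space model may make the ordering problem concrete.  It is
only a sketch, not kernel code: cr0_ts, fpu_insn(), toy_npxdrop() and
the two drop_*() helpers are hypothetical stand-ins, and it assumes
that the foot-shooting amounts to setting the "trap on next npx
instruction" flag before the state is dropped.  It models only the
broken invariant, not the interrupt race or the pending exception.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for kernel state; none of these are the real definitions. */
struct thread { int id; };

static struct thread *fpcurthread;	/* owner of the npx state, or NULL */
static bool cr0_ts;			/* true: next npx instruction traps */
static int dna_traps;			/* simulated npxdna traps taken */

/* Simulated npx instruction (e.g. fnclex): traps if the flag is set. */
static void
fpu_insn(void)
{
	if (cr0_ts) {
		dna_traps++;		/* the model's "npxdna" */
		cr0_ts = false;		/* the trap handler clears the flag */
	}
}

/* A drop that trusts "fpcurthread != NULL implies no npxdna". */
static void
toy_npxdrop(void)
{
	assert(fpcurthread != NULL);	/* caller must own the npx state */
	fpu_insn();			/* stands in for fnclex */
	cr0_ts = true;			/* trap on the next npx access */
	fpcurthread = NULL;
}

/* Assumed foot-shooting order: set the trap flag first, then drop. */
static int
drop_after_setting_ts(struct thread *td)
{
	fpcurthread = td;
	cr0_ts = false;
	dna_traps = 0;
	cr0_ts = true;			/* invariant is now broken */
	if (fpcurthread == td)
		toy_npxdrop();
	return (dna_traps);
}

/* Order that keeps the invariant: drop first, then set the trap flag. */
static int
drop_before_setting_ts(struct thread *td)
{
	fpcurthread = td;
	cr0_ts = false;
	dna_traps = 0;
	if (fpcurthread == td)
		toy_npxdrop();
	cr0_ts = true;
	return (dna_traps);
}
```

In this model the first ordering takes one unexpected trap inside the
drop even though fpcurthread is non-NULL, and the second takes none.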

Try this patch:

%%%
Index: machdep.c
===================================================================
RCS file: /home/ncvs/src/sys/i386/i386/machdep.c,v
retrieving revision 1.590
diff -u -2 -r1.590 machdep.c
--- machdep.c	11 Jun 2004 11:16:22 -0000	1.590
+++ machdep.c	19 Jun 2004 05:27:18 -0000
@@ -1134,4 +1134,7 @@
         }

+	/* XXX drop the FP state correctly, unlike in the next 3 statements. */
+	fpstate_drop(td);
+
 	/*
 	 * Initialize the math emulator (if any) for the current process.
%%%

Bruce


