Date: Thu, 1 Jul 2004 11:21:38 -0400
From: Eric van Gyzen <vangyzen@stat.duke.edu>
To: Bruce Evans <bde@zeta.org.au>
Cc: freebsd-current@freebsd.org
Subject: Re: panic: arithmetic trap in fpurstor() in sys/i386/isa/npx.c
Message-ID: <200407011121.38118.vangyzen@stat.duke.edu>
In-Reply-To: <20040220171712.D4279@gamplex.bde.org>
References: <200402191542.54594.vangyzen@stat.duke.edu> <20040220171712.D4279@gamplex.bde.org>
Bruce et al.:
I apologize for reviving this old problem. It became irrelevant to me for a
few months, but now it's relevant again.
Backing out rev 1.216 of vm_machdep.c fixed the problem. I can no longer
panic these machines.
Would you still like to see the value and contents of union savefpu *addr?
Eric
Bruce Evans wrote:
> On Thu, 19 Feb 2004, Eric van Gyzen wrote:
> > I can reliably panic 5.2-RELEASE GENERIC running on three different AMD
> > Athlon CPUs with:
> >
> > # echo 'q()' | R --no-save
> >
> > R is ports/math/R-letter, and q() just tells R to quit. This does not
> > happen on an AthlonMP or P3 running the same kernel. It did not happen
> > on the same three Athlon machines while running 5.1-RELEASE. Some simple
> > gdb debugging follows. If you need more info, please ask; I don't debug
> > the kernel very often, so I'm not sure what to provide. :-/
>
> Try backing out rev.1.216 of vm_machdep.c. I don't see exactly how this
> commit could cause the problem, but it is the only related thing that has
> changed since 5.1, and the first part of it has several bugs (it is a
> layering violation and is missing explicit disabling of interrupts).
>
> > panic: arithmetic trap
> > ...
> > (kgdb) list *0xc07e07b4
> > 0xc07e07b4 is in fpurstor (/usr/src/sys/i386/isa/npx.c:986).
> > [snip]
> >
> > (kgdb) list 976,987
> > 976 static void
> > 977 fpurstor(addr)
> > 978     union savefpu *addr;
> > 979 {
> > 980
> > 981 #ifdef CPU_ENABLE_SSE
> > 982     if (cpu_fxsr)
> > 983         fxrstor(addr);
> > 984     else
> > 985 #endif
> > 986         frstor(addr);
> > 987 }
>
> frstor() can only cause an arithmetic trap on broken CPUs. I doubt
> that Athlons are that broken, so this trap is mysterious. frstor()
> doesn't even trap for plain i386's; it may cause a bogus IRQ13 which
> the kernel has to be careful not to turn into an arithmetic trap.
>
> Please report the value and contents of addr (about 108 bytes of it
> in hex).
>
> > (kgdb) bt
> > #0 doadump () at /usr/src/sys/kern/kern_shutdown.c:240
> > #1 0xc0631967 in boot (howto=256)
> > at /usr/src/sys/kern/kern_shutdown.c:372
> > #2 0xc0631cde in panic () at /usr/src/sys/kern/kern_shutdown.c:550
> > #3 0xc07db60c in trap_fatal (frame=0xd8a08c88, eva=0)
> > at /usr/src/sys/i386/i386/trap.c:821
> > #4 0xc07db062 in trap (frame=
> > {tf_fs = 24, tf_es = 16, tf_ds = 16, tf_edi = 0, tf_esi = 22,
> > tf_ebp = -660566840, tf_isp = -660566860, tf_ebx = 582, tf_edx =
> > 0, tf_ecx = 134996160, tf_eax = -660566560, tf_trapno = 6, tf_err = 0,
> > tf_eip = -1065482316, tf_cs = 8, tf_eflags = 65606,
> > tf_esp = -660566792, tf_ss = -1065482847})
> > at /usr/src/sys/i386/i386/trap.c:618
> > #5 0xc07c8258 in calltrap () at {standard input}:94
> > #6 0xc07e05a1 in npxdna () at /usr/src/sys/i386/isa/npx.c:840
>
> Everything seems normal up to the trap. Old versions of gdb don't
> print the frame before calltrap(), but you found it anyway. npxdna()
> is supposed to just load the user npx context and return. There may
> be an unmasked arithmetic trap pending in the user context, but that
> is rare too. fpurstor() must not trap since otherwise it would be
> impossible to load user npx contexts in the kernel without breaking
> trap delivery timing.
>
> Bruce
--
Eric van Gyzen Sr. Systems Programmer
http://www.stat.duke.edu/~vangyzen/ ISDS, Duke University
