Date:      Fri, 20 Feb 2004 17:42:51 +1100 (EST)
From:      Bruce Evans <bde@zeta.org.au>
To:        Eric van Gyzen <vangyzen@stat.duke.edu>
Cc:        freebsd-current@freebsd.org
Subject:   Re: panic: arithmetic trap in fpurstor() in sys/i386/isa/npx.c
Message-ID:  <20040220171712.D4279@gamplex.bde.org>
In-Reply-To: <200402191542.54594.vangyzen@stat.duke.edu>
References:  <200402191542.54594.vangyzen@stat.duke.edu>

On Thu, 19 Feb 2004, Eric van Gyzen wrote:

> I can reliably panic 5.2-RELEASE GENERIC running on three different AMD Athlon
> CPUs with:
>
>   # echo 'q()' | R --no-save
>
> R is ports/math/R-letter, and q() just tells R to quit.  This does not happen
> on an AthlonMP or P3 running the same kernel.  It did not happen on the same
> three Athlon machines while running 5.1-RELEASE.  Some simple gdb debugging
> follows.  If you need more info, please ask; I don't debug the kernel very
> often, so I'm not sure what to provide.  :-/

Try backing out rev.1.216 of vm_machdep.c.  I don't see exactly how this
commit could cause the problem, but it is the only related thing that has
changed since 5.1, and the first part of it has several bugs (it is a
layering violation and is missing explicit disabling of interrupts).

> panic: arithmetic trap
> ...
> (kgdb) list *0xc07e07b4
> 0xc07e07b4 is in fpurstor (/usr/src/sys/i386/isa/npx.c:986).
> [snip]
>
> (kgdb) list 976,987
> 976     static void
> 977     fpurstor(addr)
> 978             union savefpu *addr;
> 979     {
> 980
> 981     #ifdef CPU_ENABLE_SSE
> 982             if (cpu_fxsr)
> 983                     fxrstor(addr);
> 984             else
> 985     #endif
> 986                     frstor(addr);
> 987     }

frstor() can only cause an arithmetic trap on broken CPUs.  I doubt
that Athlons are that broken, so this trap is mysterious.  frstor()
doesn't even trap for plain i386's; it may cause a bogus IRQ13 which
the kernel has to be careful not to turn into an arithmetic trap.

Please report the value and contents of addr (about 108 bytes of it
in hex).
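
To make the requested dump easier to read, here is a minimal sketch that
pulls the first few fields out of a 108-byte image.  It assumes the
buffer uses the 32-bit protected-mode layout that fnsave writes (control
word at offset 0, status word at 4, tag word at 8, FPU instruction
pointer at 12, then eight 10-byte registers from offset 28).  The names
fsave_header and fsave_parse are made up for illustration and are not
the kernel's union savefpu.  In kgdb, something like "x/27xw addr"
should produce the 27 words to feed into it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 28-byte environment followed by 8 registers of 10 bytes each. */
#define FSAVE_SIZE 108

/* Hypothetical helper type for illustration only. */
struct fsave_header {
	uint16_t control;	/* x87 control word */
	uint16_t status;	/* x87 status word */
	uint16_t tag;		/* x87 tag word */
	uint32_t fpu_eip;	/* offset of the last FPU instruction */
};

/* Read little-endian values without assuming host alignment. */
static uint16_t
le16(const unsigned char *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t
le32(const unsigned char *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	    ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Extract the leading environment fields from a raw FSAVE image. */
static struct fsave_header
fsave_parse(const unsigned char *buf, size_t len)
{
	struct fsave_header h;

	assert(len >= FSAVE_SIZE);
	h.control = le16(buf + 0);
	h.status = le16(buf + 4);
	h.tag = le16(buf + 8);
	h.fpu_eip = le32(buf + 12);
	return h;
}
```

The upper 16 bits of the control, status and tag slots are skipped on
purpose, so only the architecturally meaningful halves are reported.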

> (kgdb) bt
> #0  doadump () at /usr/src/sys/kern/kern_shutdown.c:240
> #1  0xc0631967 in boot (howto=256) at /usr/src/sys/kern/kern_shutdown.c:372
> #2  0xc0631cde in panic () at /usr/src/sys/kern/kern_shutdown.c:550
> #3  0xc07db60c in trap_fatal (frame=0xd8a08c88, eva=0)
>     at /usr/src/sys/i386/i386/trap.c:821
> #4  0xc07db062 in trap (frame=
>       {tf_fs = 24, tf_es = 16, tf_ds = 16, tf_edi = 0, tf_esi = 22,
>        tf_ebp = -660566840, tf_isp = -660566860, tf_ebx = 582, tf_edx = 0,
>        tf_ecx = 134996160, tf_eax = -660566560, tf_trapno = 6, tf_err = 0,
>        tf_eip = -1065482316, tf_cs = 8, tf_eflags = 65606,
>        tf_esp = -660566792, tf_ss = -1065482847})
>     at /usr/src/sys/i386/i386/trap.c:618
> #5  0xc07c8258 in calltrap () at {standard input}:94
> #6  0xc07e05a1 in npxdna () at /usr/src/sys/i386/isa/npx.c:840
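
The trap frame in the backtrace prints its fields as signed decimals,
which makes them hard to compare with the hex addresses elsewhere in
the report.  A small sketch of the conversion follows; the function
name frame_field_hex is hypothetical.  Reinterpreting tf_eip this way
gives 0xc07e07b4, the same address that kgdb resolved to npx.c:986.

```c
#include <stdint.h>

/*
 * Reinterpret a signed 32-bit trap frame field as the unsigned value
 * the CPU actually held.  Conversion from int32_t to uint32_t is
 * defined as reduction modulo 2^32, so no casting tricks are needed.
 */
static uint32_t
frame_field_hex(int32_t field)
{
	return (uint32_t)field;
}

/* Example: frame_field_hex(-1065482316) is 0xc07e07b4 (tf_eip). */
```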

Everything seems normal up to the trap.  Old versions of gdb don't
print the frame before calltrap(), but you found it anyway.  npxdna()
is supposed to just load the user npx context and return.  There may
be an unmasked arithmetic trap pending in the user context, but that
is rare too.  fpurstor() must not trap since otherwise it would be
impossible to load user npx contexts in the kernel without breaking
trap delivery timing.
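
Whether a saved context has an unmasked arithmetic exception pending
can be read off its control and status words.  The sketch below assumes
the standard x87 layout, where the low six bits of the status word are
the exception flags and the low six bits of the control word are the
corresponding masks.  The function and macro names are illustrative,
not taken from npx.c.

```c
#include <stdint.h>

/* Exception flag/mask bits: IE, DE, ZE, OE, UE, PE. */
#define X87_EXC_BITS	0x3fu

/*
 * Return the set of exceptions that are flagged in the status word
 * and not masked in the control word.  A nonzero result means an
 * unmasked exception is pending in that context.
 */
static unsigned
x87_pending_unmasked(uint16_t control, uint16_t status)
{
	return (unsigned)status & ~(unsigned)control & X87_EXC_BITS;
}
```

For example, with all exceptions masked (low control bits 0x3f) the
result is always 0, whatever flags the status word carries.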

Bruce
