Date: Fri, 20 Feb 2004 17:42:51 +1100 (EST)
From: Bruce Evans
To: Eric van Gyzen
Cc: freebsd-current@freebsd.org
Subject: Re: panic: arithmetic trap in fpurstor() in sys/i386/isa/npx.c
In-Reply-To: <200402191542.54594.vangyzen@stat.duke.edu>
Message-ID: <20040220171712.D4279@gamplex.bde.org>

On Thu, 19 Feb 2004, Eric van Gyzen wrote:

> I can reliably panic 5.2-RELEASE GENERIC running on three different AMD Athlon
> CPUs with:
>
> # echo 'q()' | R --no-save
>
> R is ports/math/R-letter, and q() just tells R to quit.  This does not happen
> on an AthlonMP or P3 running the same kernel.  It did not happen on the same
> three Athlon machines while running 5.1-RELEASE.  Some simple gdb debugging
> follows.  If you need more info, please ask; I don't debug the kernel very
> often, so I'm not sure what to provide.
> :-/

Try backing out rev.1.216 of vm_machdep.c.  I don't see exactly how this
commit could cause the problem, but it is the only related thing that has
changed since 5.1, and the first part of it has several bugs (it is a
layering violation and is missing explicit disabling of interrupts).

> panic: arithmetic trap
> ...
> (kgdb) list *0xc07e07b4
> 0xc07e07b4 is in fpurstor (/usr/src/sys/i386/isa/npx.c:986).
> [snip]
>
> (kgdb) list 976,987
> 976     static void
> 977     fpurstor(addr)
> 978             union savefpu *addr;
> 979     {
> 980
> 981     #ifdef CPU_ENABLE_SSE
> 982             if (cpu_fxsr)
> 983                     fxrstor(addr);
> 984             else
> 985     #endif
> 986                     frstor(addr);
> 987     }

frstor() can only cause an arithmetic trap on broken CPUs.  I doubt that
Athlons are that broken, so this trap is mysterious.  frstor() doesn't
even trap for plain i386's; it may cause a bogus IRQ13 which the kernel
has to be careful not to turn into an arithmetic trap.

Please report the value and contents of addr (about 108 bytes of it in
hex).

> (kgdb) bt
> #0  doadump () at /usr/src/sys/kern/kern_shutdown.c:240
> #1  0xc0631967 in boot (howto=256) at /usr/src/sys/kern/kern_shutdown.c:372
> #2  0xc0631cde in panic () at /usr/src/sys/kern/kern_shutdown.c:550
> #3  0xc07db60c in trap_fatal (frame=0xd8a08c88, eva=0)
>     at /usr/src/sys/i386/i386/trap.c:821
> #4  0xc07db062 in trap (frame=
>       {tf_fs = 24, tf_es = 16, tf_ds = 16, tf_edi = 0, tf_esi = 22,
>        tf_ebp = -660566840, tf_isp = -660566860, tf_ebx = 582, tf_edx = 0,
>        tf_ecx = 134996160, tf_eax = -660566560, tf_trapno = 6, tf_err = 0,
>        tf_eip = -1065482316, tf_cs = 8, tf_eflags = 65606,
>        tf_esp = -660566792, tf_ss = -1065482847})
>     at /usr/src/sys/i386/i386/trap.c:618
> #5  0xc07c8258 in calltrap () at {standard input}:94
> #6  0xc07e05a1 in npxdna () at /usr/src/sys/i386/isa/npx.c:840

Everything seems normal up to the trap.  Old versions of gdb don't print
the frame before calltrap(), but you found it anyway.  npxdna() is
supposed to just load the user npx context and return.
There may be an unmasked arithmetic trap pending in the user context, but
that is rare too.  fpurstor() must not trap since otherwise it would be
impossible to load user npx contexts in the kernel without breaking trap
delivery timing.

Bruce
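
A note on reading the requested dump: the roughly 108 bytes at addr can
be checked by hand for a pending unmasked arithmetic exception.  The
sketch below is only an illustration and is not code from npx.c.  It
assumes the save area is in the legacy 32-bit protected-mode
FNSAVE/FRSTOR layout (control word at byte offset 0, status word at byte
offset 4), which matches the frstor() path at line 986 but not the
fxrstor() path.  The names rd16() and x87_pending_unmasked() are
hypothetical.  In kgdb, something like `x/27xw addr` should print the
same 108 bytes as 27 words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Size of the legacy FNSAVE/FRSTOR area in 32-bit protected mode. */
#define FSAVE_AREA_SIZE	108

/* The six x87 exception flag/mask bits: IE, DE, ZE, OE, UE, PE. */
#define X87_EXC_BITS	0x3fu

/* Read a little-endian 16-bit value from a byte buffer. */
static uint16_t
rd16(const unsigned char *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Hypothetical helper: return the exception flags that are set in the
 * saved status word and not masked in the saved control word.  A
 * nonzero result means the saved context carries a pending unmasked
 * exception.  Assumes the legacy FNSAVE layout, not the FXSAVE one.
 */
static unsigned
x87_pending_unmasked(const unsigned char *area, size_t len)
{
	uint16_t cw, sw;

	assert(area != NULL && len >= FSAVE_AREA_SIZE);
	cw = rd16(area + 0);	/* control word: mask bits 0..5 */
	sw = rd16(area + 4);	/* status word: flag bits 0..5 */
	return (sw & ~cw) & X87_EXC_BITS;
}
```

With the default control word 0x037f all six exceptions are masked, so
the helper returns 0 whatever flags are set in the status word.  If the
zero-divide mask bit is cleared (control word 0x037b) and ZE is set in
the status word, it returns 0x04.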