Date: Mon, 19 Mar 2001 12:34:55 -0800 (PST)
From: John Baldwin
To: Dag-Erling Smorgrav
Subject: RE: Here's another one for you...
Cc: current@FreeBSD.org

On 19-Mar-01 Dag-Erling Smorgrav wrote:
> SMP box with a bleeding-edge -CURRENT kernel, patched to avoid the
> i586_bzero() problem:
>
> panic: mutex_enter: recursion on non-recursive mutex process lock @
> ../../i386/i386/trap.c:854
> cpuid = 1; lapic.id = 01000000
> Debugger("panic")

That's a later symptom of a problem.  We recursed on the proc lock doing the
PHOLD before we handled the page fault.

> CPU1 stopping CPUs: 0x00000001... stopped.
> Stopped at      Debugger+0x45:  pushl   %ebx
> db> show mutex
> "panic" (0xc030b1e0) locked at ../../kern/kern_shutdown.c:544
> "process lock" (0xd3f15000) locked at ../../i386/i386/machdep.c:625

This is in sendsig():

        p = curproc;
        PROC_LOCK(p);
        psp = p->p_sigacts;
        if (SIGISMEMBER(psp->ps_osigset, sig)) {
        ...
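
A minimal sketch of the failure mode described above, assuming a toy lock that
records its holder. Every name here (toy_mtx, toy_mtx_lock,
toy_fault_while_locked) is hypothetical; this is not FreeBSD's mutex or
witness code. The lock is taken once, as PROC_LOCK(p) does in sendsig(), and
then a handler running in the same context tries to take it again, as the
PHOLD in the fault path is described as doing.

```c
#include <assert.h>

/* Hypothetical toy lock for illustration; this is not FreeBSD's struct mtx. */
enum { TOY_UNOWNED = -1 };

enum toy_result { TOY_OK, TOY_RECURSED, TOY_BUSY };

struct toy_mtx {
    const char *name;  /* label, in the spirit of "process lock" */
    int owner;         /* id of the holder, or TOY_UNOWNED */
    int recursable;    /* nonzero if the holder may re-acquire */
    int depth;         /* nested acquisitions by the holder */
};

/* Acquire the lock, reporting recursion instead of panicking. */
enum toy_result toy_mtx_lock(struct toy_mtx *m, int self)
{
    if (m->owner == TOY_UNOWNED) {
        m->owner = self;
        m->depth = 0;
        return TOY_OK;
    }
    if (m->owner != self)
        return TOY_BUSY;      /* a real lock would spin or block here */
    if (!m->recursable)
        return TOY_RECURSED;  /* the kernel in the report panicked instead */
    m->depth++;
    return TOY_OK;
}

/* Release one level of ownership; only the holder may call this. */
void toy_mtx_unlock(struct toy_mtx *m, int self)
{
    assert(m->owner == self);
    if (m->depth > 0)
        m->depth--;
    else
        m->owner = TOY_UNOWNED;
}

/* Take the lock, then simulate a fault handler taking it again. */
enum toy_result toy_fault_while_locked(int recursable)
{
    struct toy_mtx proc_lock = { "process lock", TOY_UNOWNED, recursable, 0 };
    const int self = 1;       /* hypothetical id of the running context */
    enum toy_result nested;

    /* Outer acquisition, standing in for PROC_LOCK(p). */
    if (toy_mtx_lock(&proc_lock, self) != TOY_OK)
        return TOY_BUSY;

    /* The fault arrives while the lock is still held. */
    nested = toy_mtx_lock(&proc_lock, self);
    if (nested == TOY_OK)
        toy_mtx_unlock(&proc_lock, self);
    toy_mtx_unlock(&proc_lock, self);
    return nested;
}
```

Calling toy_fault_while_locked(0) reports TOY_RECURSED, which corresponds to
the "recursion on non-recursive mutex" condition; with a recursable lock the
nested acquisition succeeds. Either way the recursion is only the visible
symptom: the fault that triggered the second acquisition is the thing to chase.
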
> "Giant" (0xc0309ac0) locked at ../../i386/i386/trap.c:1169
> db> trace
> Debugger(c027d5e1) at Debugger+0x45
> panic(c027c420,c027a154,c02997d0,356,d3f14ee0) at panic+0x144
> witness_enter(d3f15000,0,c02997d0,356) at witness_enter+0x355
> trap_pfault(d7345d4c,0,0) at trap_pfault+0x143
> trap(18,10,10,d7345fa8,0) at trap+0x978
> calltrap() at calltrap+0x5
> --- trap 0xc, eip = 0, esp = 0xd7345d8c, ebp = 0xd7345ed8 ---
> (null)(805c3e0,e,d7345f10,0,4) at 0
> postsig(e) at postsig+0x40b

Hmmm.  An eip of 0 is bad.  This could be just another instance of the bzero
bug just in another place.  You probably want to change the code that actually
sets *bzero to i586_bzero (and same for any other ops that use floating point).
The code in question for this lies in i386/isa/npx.c.  It seems we use the fp
regs for copyin/copyout and bcopy as well.  I would just change line 458 of
npx.c to say '#ifdef I586_CPU_XXX' for now as your temporary patch (then you
don't need to patch pmap_zero_page() anymore.)

-- 

John Baldwin -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/
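
A rough sketch of why renaming the #ifdef symbol works as a temporary patch,
assuming the zeroing routine is reached through a function pointer as "*bzero"
suggests. The names toy_bzero_vector, toy_select_bzero and TOY_I586_CPU_XXX are
hypothetical stand-ins; the real code in i386/isa/npx.c is not reproduced here.
Because nothing ever defines the renamed symbol, the block that switches to the
floating-point variant is compiled out and the pointer keeps its default.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void toy_bzero_t(void *buf, size_t len);

/* Plain integer implementation; the default target of the pointer. */
void toy_generic_bzero(void *buf, size_t len)
{
    memset(buf, 0, len);
}

/* Stand-in for a variant that would use the FP registers. */
void toy_fp_bzero(void *buf, size_t len)
{
    memset(buf, 0, len);
}

/* Callers zero memory through this pointer, never directly. */
toy_bzero_t *toy_bzero_vector = toy_generic_bzero;

/* Hypothetical attach-time hook that may switch implementations. */
void toy_select_bzero(void)
{
#ifdef TOY_I586_CPU_XXX   /* never defined, so this block is compiled out */
    toy_bzero_vector = toy_fp_bzero;
#endif
}

/* Returns 1 if the buffer was zeroed by the generic routine. */
int toy_zero_and_check(void)
{
    unsigned char buf[8];
    size_t i;

    memset(buf, 0xff, sizeof(buf));
    toy_select_bzero();

    /* A call through a null pointer would jump to address 0. */
    assert(toy_bzero_vector != NULL);
    toy_bzero_vector(buf, sizeof(buf));

    for (i = 0; i < sizeof(buf); i++)
        if (buf[i] != 0)
            return 0;
    return toy_bzero_vector == toy_generic_bzero;
}
```

Built without -DTOY_I586_CPU_XXX, toy_zero_and_check() returns 1: the buffer is
cleared and the pointer still refers to the generic routine. The same guard
would have to cover every other pointer the attach code redirects, which in the
message above means the copyin/copyout and bcopy paths as well.
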