From owner-freebsd-smp Thu Dec 5 19:07:08 1996
Return-Path:
Received: (from root@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id TAA22837 for smp-outgoing; Thu, 5 Dec 1996 19:07:08 -0800 (PST)
Received: from clem.systemsix.com (clem.systemsix.com [198.99.86.131]) by freefall.freebsd.org (8.8.4/8.8.4) with SMTP id TAA22803 for ; Thu, 5 Dec 1996 19:07:03 -0800 (PST)
Received: from localhost (localhost [127.0.0.1]) by clem.systemsix.com (8.6.12/8.6.12) with SMTP id UAA16913; Thu, 5 Dec 1996 20:06:51 -0700
Message-Id: <199612060306.UAA16913@clem.systemsix.com>
X-Authentication-Warning: clem.systemsix.com: Host localhost didn't use HELO protocol
X-Mailer: exmh version 1.6.5 12/11/95
From: Steve Passe
To: smp@FreeBSD.ORG
cc: Tor.Egge@idt.ntnu.no (Tor Egge), "J.M. Chuang" , Janick.Taillandier@ratp.fr (Janick TAILLANDIER), Peter Wemm
Subject: last major problem
In-reply-to: Your message of "Thu, 05 Dec 1996 21:52:00 -0400." <199612060152.VAA23427@bluenose.na.tuns.ca>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Thu, 05 Dec 1996 20:06:51 -0700
Sender: owner-smp@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

Hi,

It appears we have one last serious problem to fix. 3 users report crashing
shortly after starting the 2nd CPU.

--
>I tried a few hours old kernel on an ASUS P/I-P65UP5, with APIC_IO
>enabled. When compiling a new kernel with two active CPUs, I got
>error messages from gcc, and the compile failed. Restarting the
>kernel compiling caused a trap 12, and a kernel dump.
>
>When looking at the kernel dump, I get
>
>#0 boot (howto=256) at ../../kern/kern_shutdown.c:267
>#1 0xe0112d29 in panic (fmt=0xe01bcbcf "page fault")
> at ../../kern/kern_shutdown.c:395
>#2 0xe01bd8b5 in trap_fatal (frame=0xdfbffe58) at ../../i386/i386/trap.c:747
>#3 0xe01bd2e8 in trap_pfault (frame=0xdfbffe58, usermode=0)
> at ../../i386/i386/trap.c:654
>#4 0xe01bcf1b in trap (frame={tf_es = -270335984, tf_ds = 16,
> tf_edi = -270395292, tf_esi = -541077504, tf_ebp = -541065552,
> tf_isp = -541065600, tf_ebx = 296243200, tf_edx = -4194304,
> tf_ecx = -528396, tf_eax = -528396, tf_trapno = 12, tf_err = 0,
> tf_eip = -535058833, tf_cs = 8, tf_eflags = 66178, tf_esp = -530083069,
> tf_ss = -270296448}) at ../../i386/i386/trap.c:313
>#5 0xe01ba66f in pmap_enter (pmap=0xefe21864, va=3753889792, pa=296243200,
> prot=7 '\a', wired=0) at ../../i386/i386/pmap.c:2014
>#6 0xe01a4153 in vm_fault (map=0xefe21800, vaddr=3753889792,
> fault_type=3 '\003', change_wiring=0) at ../../vm/vm_fault.c:773
>#7 0xe01bd240 in trap_pfault (frame=0xdfbfffbc, usermode=1)
> at ../../i386/i386/trap.c:634
>#8 0xe01bcdc3 in trap (frame={tf_es = 39, tf_ds = 39, tf_edi = 352256,
> tf_esi = 330220, tf_ebp = -541074428, tf_isp = -541065244, tf_ebx = 0,
> tf_edx = 1, tf_ecx = 330220, tf_eax = 0, tf_trapno = 12, tf_err = 7,
> tf_eip = 45296, tf_cs = 31, tf_eflags = 66050, tf_esp = -541074452,
> tf_ss = 39}) at ../../i386/i386/trap.c:241
>(kgdb) up 5
>#5 0xe01ba66f in pmap_enter (pmap=0xefe21864, va=3753889792, pa=296243200,
> prot=7 '\a', wired=0) at ../../i386/i386/pmap.c:2014
>2014 origpte = *(vm_offset_t *)pte;
>(kgdb) print/x pte
>$2 = 0xfff7eff4
>
>This indicates an attempt to dereference address 0xfff7eff4, which seems
>bogus :-(
>
>- Tor Egge
---
> I turn on DDB, here is the report from DDB when system reboots shortly after
> the second CPU is launched:
>
> cpunumber =0
> fault virtual address = 0xfffbeff4
> fault code = supervisor read, page not present
> instruction pointer = 0x8:0xf01be11b
> stack pointer = 0x8:0xefbffe94
> frame pointer = 0x10:0xefbffeb0
> code segment = base 0x0, limit 0xffff, type0x1b
> = DPL 0, pres 1, def32 1, gran 1
> processor eflags = interrupt enabled, resume, IOPL=0
> current process = 270 (sh)
> interrupt mask =
> kernel: type 12 trap, code=0
> stopped at _pmap_enter+0x8f: movl 0 (%ecx), %ecx
> db> trace
> _pmap_enter (f0cbc464,efbfd000,da5000,7,0) at _pmap_enter+0x8ff
> vm_fault(f0cbc400,efbfd000,3,0,0) at _vm_fault+0xd0b
> trap_pfault(efbfffbc,1) at _trap_pfault+0xd4
> _trap(27,27,56000,509ec,efbfdb34) at _trap+0x146
> calltrap() at calltrap+0x1a
> - trap 12, eip=0xb0f0, ebp=0xefbfdb34 ---
> - curporc=0xf0be3c00, pie=270 ---
>
>
> Jim
---
>I have installed the latest patches and I am still having trap 12
>a few seconds after starting the second CPU. Everything seems ok
>with only one CPU.
>The machine is a Dell Optiplex GXpro with 2 P6 200 MHz, 32MB,
>SCSI disk on Adaptec 2940UW.
> ...
>Stopped at _pmap_enter+0x8f: movl 0(%ecx),%ecx
>
>trace gives:
>
>_pmap_enter(f13ad064,efbfd000,17c8000,7,0) at _pmap_enter+0x8f
>_vm_fault(f13ad000,efbfd000,3,0,0) at _vm_fault+xxd`b
>_trap_pfault(efbfffbc,1) at _trap_pfault+0xd4
>_trap(27,27,50274,1,efbfd5F4) at _trap+0x14b
>calltrap() at calltrap+0x1a
>--- trap 12, eip = 0xe9fc, ebp =0xefbfd5f4 ---
>--- curproc = 0xf13e9a00, pid = 1561 ---
>
>I have similar crash without APIC_IO or SMP_INVLTLB.
>
>Janick
---

Tor's machine: P6, Asus P/I-P65UP5/C-P6ND, entry 5 in mptable database
Jim's machine: P6, Titan Pro, entry 17 in mptable database (someone else, same MB)
 (I'm assuming it's the Titan Pro, Tomcat II/EIDE previously reported working)
Janick's machine: P6, Dell Optiplex GXPro, entry 6 in mptable database.

They're all P6, ... any theories/clues???

--
Steve Passe | powered by
smp@csn.net | FreeBSD
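
One possible cross-check on the addresses in the reports above, offered as a sketch rather than a diagnosis. It assumes 4 KB pages and 4-byte page-table entries (the usual i386 layout) and that pmap_enter finds the PTE by indexing a linear array of PTEs with the virtual page number; the helper name `pte_window_base` is made up for illustration. Under those assumptions, subtracting the PTE offset of the faulting va from the bad pointer gives the same base, 0xffc00000, for both Tor's and Jim's crashes (Janick's report has no fault address to compare).

```python
# Assumptions (not stated in the reports): i386 with 4 KB pages, 4-byte PTEs.
PAGE_SHIFT = 12
PTE_SIZE = 4


def pte_window_base(fault_addr: int, va: int) -> int:
    """Hypothetical helper: fault address minus the byte offset that va's
    PTE would have inside a linear array of page-table entries."""
    pte_offset = (va >> PAGE_SHIFT) * PTE_SIZE
    return fault_addr - pte_offset


# (bad pointer, va passed to pmap_enter), copied from the reports above.
reports = {
    "Tor": (0xFFF7EFF4, 3753889792),  # kgdb "print/x pte", va from frame #5
    "Jim": (0xFFFBEFF4, 0xEFBFD000),  # DDB fault virtual address, 2nd argument
}

for name, (fault_addr, va) in reports.items():
    base = pte_window_base(fault_addr, va)
    print(f"{name}: va={va:#x} fault={fault_addr:#x} base={base:#x}")
```

If the assumptions hold, the pointers are not random: in both cases they look like a valid PTE index applied to a page-table window in the top 4 MB of the address space that is not mapped at the time of the fault. Whether that window is what pmap_enter actually intended to use is not something the reports establish.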