From owner-freebsd-smp Thu Sep 4 13:21:26 1997
Return-Path:
Received: (from root@localhost) by hub.freebsd.org (8.8.7/8.8.7) id NAA21040 for smp-outgoing; Thu, 4 Sep 1997 13:21:26 -0700 (PDT)
Received: from Ilsa.StevesCafe.com (Ilsa.StevesCafe.com [205.168.119.129]) by hub.freebsd.org (8.8.7/8.8.7) with ESMTP id NAA21028 for ; Thu, 4 Sep 1997 13:21:19 -0700 (PDT)
Received: from Ilsa.StevesCafe.com (localhost [127.0.0.1]) by Ilsa.StevesCafe.com (8.8.7/8.8.5) with ESMTP id OAA10238; Thu, 4 Sep 1997 14:21:05 -0600 (MDT)
Message-Id: <199709042021.OAA10238@Ilsa.StevesCafe.com>
X-Mailer: exmh version 2.0gamma 1/27/96
From: Steve Passe
To: "John S. Dyson"
cc: smp@FreeBSD.ORG
Subject: Re: 3.0/SMP panic
In-reply-to: Your message of "Thu, 04 Sep 1997 12:31:52 CDT." <199709041731.MAA01880@dyson.iquest.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Thu, 04 Sep 1997 14:21:05 -0600
Sender: owner-freebsd-smp@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

Hi,

We have gotten most SMP systems running now. One recent hurdle was lkms
that had gotten out of sync with the kernel proper. The symptom was a
panic during boot, or possibly when a screensaver lkm activated.
ipfw_mod was also shown to be a problem. The solution is to sup current
source for the lkms, then rebuild and install them.

We still have at least one fundamental bug affecting a small number of
systems: Fatal trap 12 during boot with -current. This bug has so far
only been seen under SMP (is this true?). It appears to be very
dependent on the specific system configuration.

The following is a roundup of reports from various users. Unless you're
working on this problem you probably don't want to read further.

---
Kenneth Merry:

> By any chance do you have more than 64MB in your machine and
> options MAXMEM=... in your kernel config file?
>
> I did, and I had panics very much like that (in pmap_enter)
> immediately on boot. When I took the MAXMEM line out (I've got 128MB),
> things worked just fine...
> I'm still not sure why, though.
> ...
> I found the problem. At first I suspected the sound driver, but
> the problem really turned out to be:
>
> options "MAXMEM=(128*1024)"

---
Jaye Mathisen:

I was using M$ Inetload 2.0 to simulate a bunch of mail users. It was
running fine for a few minutes, then died horribly with:

Fatal trap 12: page fault while in kernel mode
cpuid = 1
lapic.id = 33554432
current process = Idle
mp_lock = 01000003
interrupt mask = net tty bio <- SMP: XXX
Stopped at _pmap_enter+0xa7:

and some other stuff. The traceback is not too long, but I don't have
any good way to type it all in. It goes like:

_pmap_enter
_vm_fault
Trap_pfault
_trap
_zalloc
_pmap_insert_entry
_pmap_enter
_kmem_alloc
_in_pcballoc
_Tcp_attach
_tcp_usr_attack
_sonewconn
_tcp_inut
_ip_input
_ipintr
swi_net_next

It is trivially reproducible, at least on my hardware.

---
Akira Watanabe:

The kernel (suped yesterday) causes a panic.

Fatal trap 18: integer divide fault while in kernel mode
cpuid = 0
lapic.id = 16777216
instruction pointer = 0x8:0xf01bc794
stack pointer = 0x10:0xf4cabc84
frame pointer = 0x10:0xf4cabcd0
code segment = base 0x0, limit 0xfffff, type 0x1b
             = DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 236 (ftpd)
mp_lock = 00000003
interrupt mask = <- SMP: XXX
trap number = 18
panic: integer divide fault
 cpuid 0
boot() called on cpu#0

syncing disks... 11 11 8 2 done

Here is a stack trace.

# gdb -k kernel /var/crash/vmcore.0
GDB is free software and you are welcome to distribute copies of it
 under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.16 (i386-unknown-freebsd), Copyright 1996 Free Software Foundation, Inc...
IdlePTD 24d000
current pcb at 1f9608
panic: integer divide fault
#0  boot (howto=256) at ../../kern/kern_shutdown.c:289
289             dumppcb.pcb_cr3 = rcr3();
(kgdb) where
#0  boot (howto=256) at ../../kern/kern_shutdown.c:289
#1  0xf0118e36 in panic (fmt=0xf01ccbea "integer divide fault")
    at ../../kern/kern_shutdown.c:416
#2  0xf01cd86f in trap_fatal (frame=0xf4cabc48)
    at ../../i386/i386/trap.c:806
#3  0xf01cd072 in trap (frame={tf_es = -256049136, tf_ds = 131088,
      tf_edi = -256677376, tf_esi = 0, tf_ebp = -188039984,
      tf_isp = -188040080, tf_ebx = 0, tf_edx = 0, tf_ecx = 4096,
      tf_eax = 4096, tf_trapno = 18, tf_err = 0, tf_eip = -266614892,
      tf_cs = 8, tf_eflags = 66118, tf_esp = 0, tf_ss = 3})
    at ../../i386/i386/trap.c:487
#4  0xf01bc794 in vnode_pager_haspage (object=0xf0bdb800, pindex=0,
    before=0xf4cabd34, after=0xf4cabd30) at ../../vm/vnode_pager.c:231
#5  0xf01bbcff in vm_pager_has_page (object=0xf0bdb800, offset=0,
    before=0xf4cabd34, after=0xf4cabd30) at ../../vm/vm_pager.c:205
#6  0xf01b2e05 in vm_fault_additional_pages (m=0xf04ef7c4, rbehind=3,
    rahead=4, marray=0xf4cabdd0, reqpage=0xf4cabda4)
    at ../../vm/vm_fault.c:1100
#7  0xf01b21c0 in vm_fault (map=0xf0bd9300, vaddr=134385664,
    fault_type=1 '\001', fault_flags=0) at ../../vm/vm_fault.c:414
#8  0xf01cd23a in trap_pfault (frame=0xf4cabe50, usermode=0)
    at ../../i386/i386/trap.c:681
#9  0xf01ccf47 in trap (frame={tf_es = 134348816, tf_ds = 134348816,
      tf_edi = -259133440, tf_esi = 134385664, tf_ebp = -188039496,
      tf_isp = -188039560, tf_ebx = 2048, tf_edx = 134387712,
      tf_ecx = 512, tf_eax = -188047360, tf_trapno = 12, tf_err = 0,
      tf_eip = -266551371, tf_cs = 8, tf_eflags = 66054,
      tf_esp = -188039368, tf_ss = -188039376})
    at ../../i386/i386/trap.c:339
#10 0xf01cbfb5 in generic_copyin ()
#11 0xf012cf6f in sosend (so=0xf0bddc00, addr=0x0, uio=0xf4cabf38,
    top=0x0, control=0x0, flags=0, p=0xf0bb9600)
    at ../../kern/uipc_socket.c:449
#12 0xf0122ec8 in soo_write (fp=0xf0bdcd40, uio=0xf4cabf38,
    cred=0xf0bd8b00) at ../../kern/sys_socket.c:78
#13 0xf0120884 in write (p=0xf0bb9600, uap=0xf4cabf94, retval=0xf4cabf84)
    at ../../kern/sys_generic.c:268
#14 0xf01cdacb in syscall (frame={tf_es = 39, tf_ds = 39,
      tf_edi = 134385664, tf_esi = 2256, tf_ebp = -272642012,
      tf_isp = -188039196, tf_ebx = 6, tf_edx = 5, tf_ecx = 1,
      tf_eax = 4, tf_trapno = 22, tf_err = 7, tf_eip = 135028641,
      tf_cs = 31, tf_eflags = 531, tf_esp = -272642064, tf_ss = 39})
    at ../../i386/i386/trap.c:953
#15 0x80c5fa1 in ?? ()
#16 0x3658 in ?? ()
#17 0x86fb in ?? ()
#18 0x2045 in ?? ()
#19 0x1096 in ?? ()
(kgdb)

---
Hajimu UMEMOTO:

Sept 1:
> Yes, ipfw_mod and linux_mod were loaded. According to your
> suggestion, I disabled loading ipfw_mod and rebooted. Then, the kernel
> booted without any problem. :-)

Sept 4:
> I built lkms during `make world'. I wish to try that method, but...
> I've tried with the kernel cvsuped at Sep 3 and Sep 4. Although no
> lkm module is loaded, when accessing network, the kernel causes panic
> frequently. The UP kernel seems to have no problem. I'm using vx
> driver for 3C905.

---
Tom Bartol:

Over the last several days, starting from world/kernel of 8/28 and even
on world/kernel as of last night (9/2), I get crashes that haven't left
me with any useful info. All the crashes have occurred while composing
e-mail from within pine. My /var/mail is an NFSv3 mounted fs served by
an Auspex NS-7000 over 100/BT (nice!). The system in question is a Dell
XPS-P133c (i.e. P5/133) with 128MB, Adaptec 2940U, and 3Com 3C595
100/BT. I've been running the same world/kernel on my home system with
no trouble (but no NFS or network card either).

Curiously, I composed this e-mail on the unstable system with no
trouble. All the crashes consistently occurred while composing mail
within a few minutes after logging in.

---
From: randyd
To: smp@csn.net
Subject: SMP / LKM update

Greetings,

Just a quick update...

I cvsupped new code at about 7:30 CST yesterday and did
"make cleandepend && make world".
This AM I built a fresh SMP kernel and rebooted the machine. I didn't
start an X session though; I waited for the 'daemon' screen saver to
"kick in". When it did, I got a screen full of...

Oops I'm on cpu#1, I need to be on Cpu#0

--
Steve Passe	| powered by
smp@csn.net	| Symmetric MultiProcessor FreeBSD
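
For anyone matching the kgdb trap frames above against the panic
messages: kgdb prints the tf_* fields as signed 32-bit integers and
lapic.id in decimal, so they do not look like the hex addresses at
first glance. The following is only a rough sketch of the conversion.
The helper names to_u32 and apic_id are made up for illustration, and
the 4-bit APIC ID field in bits 24-27 is an assumption that should hold
for P5/P6-class local APICs.

```python
def to_u32(value: int) -> int:
    """Reinterpret a signed 32-bit integer as an unsigned 32-bit value."""
    return value & 0xFFFFFFFF


def apic_id(lapic_id_reg: int) -> int:
    """Extract the APIC ID, assumed to sit in bits 24-27 of the register."""
    return (lapic_id_reg >> 24) & 0x0F


# tf_eip from frame #3 (the integer divide fault)
print(hex(to_u32(-266614892)))   # 0xf01bc794

# tf_eip from frame #9 (the page fault taken in generic_copyin)
print(hex(to_u32(-266551371)))   # 0xf01cbfb5

# lapic.id values from the two panic reports
print(apic_id(33554432), apic_id(16777216))   # 2 1
```

The first value agrees with the "instruction pointer = 0x8:0xf01bc794"
line and with frame #4 (vnode_pager_haspage); the second agrees with
frame #10 (generic_copyin).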