From owner-freebsd-stable Wed Feb 12 4:50:43 2003
Message-Id: <5.2.0.9.0.20030212074706.07a847a0@192.168.0.12>
Date: Wed, 12 Feb 2003 07:48:29 -0500
To: stable@freebsd.org
From: Mike Tancsa
Subject: Re: SMP problems post Jan 28th
In-Reply-To: <5.2.0.9.0.20030211063236.06ef70a0@marble.sentex.ca>

Same panic as the night before. Are there any hints in the panic message as
to what the problem might be?

        ---Mike

Fatal trap 12: page fault while in kernel mode
mp_lock = 01000002; cpuid = 1; lapic.id = 00000000
fault virtual address   = 0x20004
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc0174830
stack pointer           = 0x10:0xddf08c4c
frame pointer           = 0x10:0xddf08c58
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 32437 (find)
interrupt mask          = none <- SMP: XXX
trap number             = 12
panic: page fault
mp_lock = 01000002; cpuid = 1; lapic.id = 00000000
boot() called on cpu#1

syncing disks...
4 1
done
Uptime: 23h58m34s
Automatic reboot in 15 seconds - press a key on the console to abort
Rebooting...
cpu_reset called on cpu#1
cpu_reset: Stopping other CPUs
cpu_reset: Restarting BSP
cpu_reset_proxy: Grabbed mp lock cfpu_ rBeSsPet: BSP did not grab mp lock
Console: serial port
BIOS drive A: is disk0
BIOS drive C: is disk1
BIOS 629kB/785396kB available memory

At 06:48 AM 2/11/2003 -0500, Mike Tancsa wrote:
>The previous kernel had been running fine for some time, and now at night
>around the running of periodic, the box will 'periodically' crash. It
>does not seem to do it each night, but almost every other night and always
>just after 3am when periodic runs. I can never get a crash dump, and I
>had to hook up a serial console to capture this at night. Also, I can't
>seem to force the issue by running periodic by hand. But, like I said, it
>always seems to happen a few minutes after 3am. (No, nothing else is
>scheduled to run then and no other boxes do anything to it at that time either)
>
>
>Fatal trap 12: page fault while in kernel mode
>mp_lock = 01000002; cpuid = 1; lapic.id = 00000000
>fault virtual address   = 0x65b046a5
>fault code              = supervisor read, page not present
>instruction pointer     = 0x8:0xc0174830
>stack pointer           = 0x10:0xde174c4c
>frame pointer           = 0x10:0xde174c58
>code segment            = base 0x0, limit 0xfffff, type 0x1b
>                        = DPL 0, pres 1, def32 1, gran 1
>processor eflags        = interrupt enabled, resume, IOPL = 0
>current process         = 4203 (find)
>interrupt mask          = none <- SMP: XXX
>trap number             = 12
>panic: page fault
>mp_lock = 01000002; cpuid = 1; lapic.id = 00000000
>boot() called on cpu#1
>
>syncing disks... 4 2
>done
>Uptime: 1d15h8m8s
>Automatic reboot in 15 seconds - press a key on the console to abort
>Rebooting...
>cpu_reset called on cpu#1
>cpu_reset: Stopping other CPUs
>cpu_reset: Restarting BSP
>tpu_reset_proxy: Grabbed mp lock cfpour_reSsPe
> : BSP did not grab mp lock
>Console: serial port
>BIOS drive A: is disk0
>BIOS drive C: is disk1
>BIOS 629kB/785396kB available memory
>
>
>Any ideas how best to track this down? There seem to have been some commits
>on the 30th that might have had an effect.
>
>4.7-STABLE FreeBSD 4.7-STABLE #0: Thu Feb 6 06:04:02 EST 2003
>
>ns4# dmesg
>Copyright (c) 1992-2003 The FreeBSD Project.
>Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
>        The Regents of the University of California. All rights reserved.
>FreeBSD 4.7-STABLE #0: Thu Feb 6 06:04:02 EST 2003
>     mdtancsa@ns4.recycle.net:/usr/obj/usr/src/sys/smp
>Timecounter "i8254" frequency 1193182 Hz
>CPU: Intel Pentium III (801.82-MHz 686-class CPU)
>   Origin = "GenuineIntel" Id = 0x683 Stepping = 3
>
>Features=0x383fbff
>real memory = 805294080 (786420K bytes)
>config> q
>avail memory = 779272192 (761008K bytes)
>Programming 24 pins in IOAPIC #0
>IOAPIC #0 intpin 2 -> irq 0
>IOAPIC #0 intpin 17 -> irq 11
>IOAPIC #0 intpin 18 -> irq 10
>IOAPIC #0 intpin 19 -> irq 12
>FreeBSD/SMP: Multiprocessor motherboard
>  cpu0 (BSP): apic id: 1, version: 0x00040011, at 0xfee00000
>  cpu1 (AP): apic id: 0, version: 0x00040011, at 0xfee00000
>  io0 (APIC): apic id: 2, version: 0x00170011, at 0xfec00000
>Preloaded elf kernel "kernel" at 0xc03b5000.
>Preloaded userconfig_script "/boot/kernel.conf" at 0xc03b509c.
>Pentium Pro MTRR support enabled
>md0: Malloc disk
>Using $PIR table, 6 entries at 0xc00f0d20
>npx0: on motherboard
>npx0: INT 16 interface
>pcib0: on motherboard
>pci0: on pcib0
>pcib1: at device 1.0 on pci0
>pci1: on pcib1
>pci1: at 0.0
>isab0: at device 4.0 on pci0
>isa0: on isab0
>atapci0: port 0xb800-0xb80f at device 4.1
>on pci0
>ata0: at 0x1f0 irq 14 on atapci0
>ata1: at 0x170 irq 15 on atapci0
>pci0: at 4.2
>Timecounter "PIIX" frequency 3579545 Hz
>chip1: port 0xe800-0xe80f at
>device 4.3 on pci0
>fxp0: port 0xb000-0xb03f mem
>0xe3800000-0xe38fffff,0xe4000000-0xe4000fff irq 12 at device 9.0 on pci0
>fxp0: Ethernet address 00:02:b3:07:fd:8d
>inphy0: on miibus0
>inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
>fxp1: port 0xa800-0xa81f mem
>0xe3000000-0xe30fffff,0xe6800000-0xe6800fff irq 10 at device 10.0 on pci0
>fxp1: Ethernet address 00:a0:c9:e7:a6:e6
>inphy1: on miibus1
>inphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
>twe0: <3ware Storage Controller> port 0xa400-0xa40f irq 11 at device 11.0
>on pci0
>twe0: 2 ports, Firmware FE6X 1.02.00.029, BIOS BEXX 1.07.00.009
>orm0: