Date: Wed, 21 Oct 2009 20:00:59 +0200
From: Alson van der Meulen
To: freebsd-stable@FreeBSD.org
Message-ID: <20091021180059.GB1847@tafi.alm.flutnet.org>
Subject: fatal trap 12 in em1 taskq after upgrade to 7.2-REL

Hello,

I recently upgraded a server from RELENG_6_4 to RELENG_7_2 (both i386).
Since then, the box has started crashing regularly. The longest uptime
since the upgrade is 1d5h, the shortest is 2m35s.
The backtrace is always similar:

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0x400
fault code = supervisor read, page not present
instruction pointer = 0x20:0xc0adc79e
stack pointer = 0x28:0xe585fc10
frame pointer = 0x28:0xe585fc24
code segment = base 0x0, limit 0xfffff, type 0x1b
             = DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 34 (em1 taskq)
trap number = 12
panic: page fault
cpuid = 0
KDB: stack backtrace:
db_trace_self_wrapper(c0ba8989,e585faac,c07f2ab9,c0bca1e0,0,...) at db_trace_self_wrapper+0x26
kdb_backtrace(c0bca1e0,0,c0b61aee,e585fab8,0,...) at kdb_backtrace+0x29
panic(c0b61aee,c0bcb510,c56294dc,1,1,...) at panic+0x119
trap_fatal(c0cd3ec0,0,1,0,14,...) at trap_fatal+0x333
trap_pfault(0,20111ac,0,0,c56292b8,...) at trap_pfault+0x270
trap(e585fbd0) at trap+0x3fa
calltrap() at calltrap+0x6
--- trap 0xc, eip = 0xc0adc79e, esp = 0xe585fc10, ebp = 0xe585fc24 ---
_bus_dmamap_sync(c5615400,400,2,c5612690,0,...) at _bus_dmamap_sync+0xe
em_rxeof(246,c5612690,e585fca4,c5612690,e585fc9c,...) at em_rxeof+0x14a
em_handle_rxtx(c5635000,1,c5615380,c561539c,c0ba076a,...) at em_handle_rxtx+0x27
taskqueue_run(c5615380,c561539c,c0ba076a,0,e585fcf4,...) at taskqueue_run+0x175
taskqueue_thread_loop(c563935c,e585fd38,36626334,64636435,39396362,...) at taskqueue_thread_loop+0xc8
fork_exit(c08285a0,c563935c,e585fd38) at fork_exit+0x99
fork_trampoline() at fork_trampoline+0x8
--- trap 0, eip = 0, esp = 0xe585fd70, ebp = 0 ---
Uptime: 28m55s
Physical memory: 2027 MB
Dumping 194 MB: 179 163 147 131 115 99 83 67 51 35 19 3
Dump complete
Automatic reboot in 15 seconds - press a key on the console to abort
em1: watchdog timeout -- resetting

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address = 0x400
fault code = supervisor read, page not present
instruction pointer = 0x20:0xc0adc79e
stack pointer = 0x28:0xc53e0c00
frame pointer = 0x28:0xc53e0c14
code segment = base 0x0, limit 0xfffff, type 0x1b
             = DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 14 (swi4: clock sio)
trap number = 12
Rebooting...

The second fatal trap is usually garbled on the serial console.

This box has been running 6.4 (and 6.3 and so on) without crashes for at
least a year, so a sudden hardware failure seems unlikely. It's an Intel
entry-level server mainboard with a P4 (with HT) and dual onboard em(4).
The server is currently mostly idle, especially em1.

Dmesg:

Copyright (c) 1992-2009 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
    The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 7.2-RELEASE-p4 #0: Tue Oct 20 04:15:46 CEST 2009
    root@eraser.waalsdorp.nl:/usr/obj/usr/src/sys/ERASER
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Pentium(R) 4 CPU 3.00GHz (2992.71-MHz 686-class CPU)
  Origin = "GenuineIntel" Id = 0xf43 Stepping = 3
  Features=0xbfebfbff
  Features2=0x649d
  AMD Features=0x20000000
  Logical CPUs per core: 2
real memory = 2138984448 (2039 MB)
avail memory = 2083450880 (1986 MB)
ACPI APIC Table:
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
 cpu0 (BSP): APIC ID: 0
 cpu1 (AP/HT): APIC ID: 1
ioapic0 irqs 0-23 on motherboard
ioapic1 irqs 24-47 on motherboard
ioapic2 irqs 48-71 on motherboard
kbd1 at kbdmux0
acpi0: on motherboard
acpi0: [ITHREAD]
acpi0: Power Button (fixed)
acpi0: reservation of 500, 10 (4) failed
acpi0: reservation of 560, 20 (4) failed
acpi0: reservation of 0, a0000 (3) failed
acpi0: reservation of 100000, 7f700000 (3) failed
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
pcib0: port 0xcf8-0xcff on acpi0
pci0: on pcib0
vgapci0: port 0xcb80-0xcb87 mem 0xdfd00000-0xdfd7ffff,0xc0000000-0xcfffffff,0xdfd80000-0xdfdbffff irq 16 at device 2.0 on pci0
agp0: on vgapci0
agp0: detected 7932k stolen memory
agp0: aperture size is 256M
pcib1: irq 16 at device 28.0 on pci0
pci2: on pcib1
pcib2: at device 0.0 on pci2
pci4: on pcib2
em0: port 0xef80-0xefbf mem 0xdffe0000-0xdfffffff irq 27 at device 3.0 on pci4
em0: [FILTER]
em0: Ethernet address: 00:0e:0c:4b:4b:89
pcib3: at device 0.2 on pci2
pci3: on pcib3
uhci0: port 0xcc00-0xcc1f irq 23 at device 29.0 on pci0
uhci0: [GIANT-LOCKED]
uhci0: [ITHREAD]
usb0: on uhci0
usb0: USB revision 1.0
uhub0: on usb0
uhub0: 2 ports with 2 removable, self powered
uhci1: port 0xcc80-0xcc9f irq 19 at device 29.1 on pci0
uhci1: [GIANT-LOCKED]
uhci1: [ITHREAD]
usb1: on uhci1
usb1: USB revision 1.0
uhub1: on usb1
uhub1: 2 ports with 2 removable, self powered
uhci2: port 0xcd00-0xcd1f irq 18 at device 29.2 on pci0
uhci2: [GIANT-LOCKED]
uhci2: [ITHREAD]
usb2: on uhci2
usb2: USB revision 1.0
uhub2: on usb2
uhub2: 2 ports with 2 removable, self powered
ehci0: mem 0xdfdff800-0xdfdffbff irq 23 at device 29.7 on pci0
ehci0: [GIANT-LOCKED]
ehci0: [ITHREAD]
usb3: EHCI version 1.0
usb3: companion controllers, 2 ports each: usb0 usb1 usb2
usb3: on ehci0
usb3: USB revision 2.0
uhub3: on usb3
uhub3: 6 ports with 6 removable, self powered
pcib4: at device 30.0 on pci0
pci1: on pcib4
puc0: port 0xdf00-0xdf1f,0xde80-0xde87,0xde00-0xde07 irq 21 at device 2.0 on pci1
puc0: [FILTER]
uart0: <16550 or compatible> on puc0
uart0: [FILTER]
uart1: <16550 or compatible> on puc0
uart1: [FILTER]
em1: port 0xdf80-0xdfbf mem 0xdfee0000-0xdfefffff irq 18 at device 3.0 on pci1
em1: [FILTER]
em1: Ethernet address: 00:0e:0c:4b:4b:88
isab0: at device 31.0 on pci0
isa0: on isab0
atapci0: port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376 at device 31.1 on pci0
ata0: on atapci0
ata0: [ITHREAD]
ata1: on atapci0
ata1: [ITHREAD]
atapci1: port 0xcf80-0xcf87,0xcf00-0xcf03,0xce80-0xce87,0xce00-0xce03,0xcd80-0xcd8f mem 0xdfdffc00-0xdfdfffff irq 19 at device 31.2 on pci0
atapci1: [ITHREAD]
atapci1: AHCI Version 01.00 controller with 4 ports detected
ata2: on atapci1
ata2: [ITHREAD]
ata3: on atapci1
ata3: [ITHREAD]
ata4: on atapci1
ata4: [ITHREAD]
ata5: on atapci1
ata5: [ITHREAD]
ichsmb0: port 0x400-0x41f irq 19 at device 31.3 on pci0
ichsmb0: [GIANT-LOCKED]
ichsmb0: [ITHREAD]
smbus0: on ichsmb0
smb0: on smbus0
ipmi0: on smbus0
ipmi0: SSIF mode found at address 0x42 on smbus
acpi_button0: on acpi0
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A, console
sio0: [FILTER]
cpu0: on acpi0
est0: on cpu0
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr f2b00000f2b
device_attach: est0 attach returned 6
p4tcc0: on cpu0
cpu1: on acpi0
est1: on cpu1
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr f2b00000f2b
device_attach: est1 attach returned 6
p4tcc1: on cpu1
pmtimer0 on isa0
orm0: at iomem 0xc9800-0xca7ff,0xca800-0xcb7ff,0xcb800-0xcc7ff,0xdc000-0xdffff pnpid ORM0000 on isa0
atkbdc0: at port 0x60,0x64 on isa0
atkbd0: irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
atkbd0: [ITHREAD]
ppc0: parallel port not found.
sc0: at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Timecounters tick every 1.000 msec
acd0: CDRW at ata0-master UDMA33
ad4: 953869MB at ata2-master SATA150
ad6: 953869MB at ata3-master SATA150
ad8: 35304MB at ata4-master SATA150
ad10: 35304MB at ata5-master SATA150
ipmi0: IPMI device rev. 1, firmware rev. 2.81, version 1.5
ipmi0: Number of channels 0
ipmi0: Attached watchdog
SMP: AP CPU #1 Launched!

kernel config is GENERIC plus:

device puc
options ALT_BREAK_TO_DEBUGGER
options DDB
options KDB_TRACE
options KDB_UNATTENDED
device smbus
device smb
device ichsmb
options ROUTETABLES=2

regards,
Alson