From: Andreas Longwitz
Date: Tue, 20 Nov 2012 21:31:56 +0100
To: freebsd-stable@freebsd.org
Subject: page fault on verbose boot
Message-ID: <50ABE8BC.1010904@incore.de>

One of my servers hits a page fault, but only on a verbose boot. The backtrace looks somewhat like the one given in lists.freebsd.org/pipermail/freebsd-stable/2010-December/060704.html, so I am appending the information requested there.

Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 8.3-STABLE #3: Mon Sep 24 11:29:54 CEST 2012
    root@dsspbx1.incore:/usr/obj/usr/src/sys/SERVER i386
Preloaded elf kernel "/boot/kernel/kernel" at 0xc0c41000.
Preloaded elf module "/boot/modules/i4b.ko" at 0xc0c41188.
Preloaded elf module "/boot/kernel/sppp.ko" at 0xc0c41234.
Timecounter "i8254" frequency 1193182 Hz quality 0
Calibrating TSC clock ... TSC clock: 999721588 Hz
CPU: Intel Pentium III (999.72-MHz 686-class CPU)
  Origin="GenuineIntel" Id=0x68a Family = 6 Model = 8 Stepping = 10
  Features=0x387fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,PN,MMX,FXSR,SSE>
Instruction TLB: 4 KB pages, 4-way set associative, 32 entries
Instruction TLB: 4 MB pages, fully associative, 2 entries
Data TLB: 4 KB pages, 4-way set associative, 64 entries
2nd-level cache: 256 KB, 8-way set associative, 32 byte line size
1st-level instruction cache: 16 KB, 4-way set associative, 32 byte line size
Data TLB: 4 MB Pages, 4-way set associative, 8 entries
1st-level data cache: 16 KB, 4-way set associative, 32 byte line size
real memory = 1074790400 (1025 MB)
Physical memory chunk(s):
0x0000000000001000 - 0x000000000009efff, 647168 bytes (158 pages)
0x0000000000100000 - 0x00000000003fffff, 3145728 bytes (768 pages)
0x0000000001026000 - 0x000000003eda5fff, 1037565952 bytes (253312 pages)
avail memory = 1036435456 (988 MB)
Table 'FACP' at 0x3ffffafa
Table 'APIC' at 0x3ffffb6e
APIC: Found table at 0x3ffffb6e
MP Configuration Table version 1.4 found at 0xc009f560
APIC: Using the MADT enumerator
MADT: Found CPU APIC ID 0 ACPI ID 0: enabled
SMP: Added CPU 0 (AP)
MADT: Found CPU APIC ID 3 ACPI ID 1: enabled
SMP: Added CPU 3 (AP)
ACPI APIC Table:
INTR: Adding local APIC 0 as a target
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
FreeBSD/SMP: 2 package(s) x 1
core(s)
 cpu0 (BSP): APIC ID: 3
 cpu1 (AP): APIC ID: 0
bios32: Found BIOS32 Service Directory header at 0xc00f6990
bios32: Entry = 0xfd85e (c00fd85e) Rev = 0 Len = 1
pcibios: PCI BIOS entry at 0xfd7c0+0x397
pnpbios: Found PnP BIOS data at 0xc00f69c0
pnpbios: Entry = f0000:a934 Rev = 1.0
Other BIOS signatures found:
x86bios: IVT 0x000000-0x0004ff at 0xc0000000
x86bios: SSEG 0x010000-0x01ffff at 0xc49c4000
x86bios: EBDA 0x09f000-0x09ffff at 0xc009f000
x86bios: ROM 0x0a0000-0x0effff at 0xc00a0000
APIC: CPU 0 has ACPI ID 1
APIC: CPU 1 has ACPI ID 0
ULE: setup cpu 0
ULE: setup cpu 1
ACPI: RSDP 0xf6910 00014 (v00 INTEL )
ACPI: RSDT 0x3fffa25c 00030 (v01 INTEL 024B 00000001 PTL 00000000)
ACPI: FACP 0x3ffffafa 00074 (v01 INTEL 024B 00000001 PTL 00000000)
ACPI: DSDT 0x3fffa28c 0586E (v01 INTEL 024B 00000001 MSFT 0100000A)
ACPI: FACS 0x3fffffc0 00040
ACPI: APIC 0x3ffffb6e 0006A (v01 INTEL 024B 00000001 PTL 00000000)
ACPI: BOOT 0x3ffffbd8 00028 (v01 INTEL 024B 00000001 PTL 00000000)
MADT: Found IO APIC ID 4, Interrupt 0 at 0xfec00000
ioapic0: Routing external 8259A's -> intpin 0
MADT: Found IO APIC ID 5, Interrupt 16 at 0xfec01000
MADT: Interrupt override: source 9, irq 31
ioapic0: intpin 9 disabled
lapic0: Routing NMI -> LINT1
lapic0: LINT1 trigger: edge
lapic0: LINT1 polarity: high
lapic3: Routing NMI -> LINT1
lapic3: LINT1 trigger: edge
lapic3: LINT1 polarity: high
ioapic0 irqs 0-15 on motherboard
ioapic1 irqs 16-31 on motherboard
cpu0 BSP:
     ID: 0x03000000 VER: 0x00040011 LDR: 0x00000000 DFR: 0xffffffff
  lint0: 0x00010700 lint1: 0x00000400 TPR: 0x00000000 SVR: 0x000001ff
  timer: 0x000100ef therm: 0x00000000 err: 0x000000f0 pmc: 0x00010400
fslock: pseudo-device
null:
random:
io:
mem:
Pentium Pro MTRR support enabled
netsmb_dev: loaded
CPU0: local APIC error 0x80
acpi0: on motherboard
acpi0: Overriding SCI Interrupt from IRQ 9 to IRQ 31
ioapic1: routing intpin 15 (PCI IRQ 31) to lapic 3 vector 48
acpi0: [MPSAFE]
acpi0: [ITHREAD]
acpi0: Power Button (fixed)
acpi0: wakeup code va
0xc49be000 pa 0x1000
pci_open(1): mode 1 addr port (0x0cf8) is 0x80015864
pci_open(1a): mode1res=0x80000000 (0x80000000)
pci_cfgcheck: device 0 [class=060000] [hdr=80] is there (id=00091166)
pcibios: BIOS version 2.10
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x404-0x407 on acpi0
cpu0: on acpi0
cpu0: switching to generic Cx mode
cpu1: on acpi0
acpi_ec0: port 0xca6,0xca7 on acpi0
pci_link0:       Index  IRQ  Rtd  Ref  IRQs
  Initial Probe  0      255  N    0    5 10
  Validation     0      255  N    0    5 10
  After Disable  0      255  N    0    5 10
pci_link1:       Index  IRQ  Rtd  Ref  IRQs
  Initial Probe  0      14   N    0    14
  Validation     0      14   N    0    14
  After Disable  0      255  N    0    14
...
ioapic1: routing intpin 2 (PCI IRQ 18) to lapic 3 vector 49
ioapic1: routing intpin 2 (PCI IRQ 18) to lapic 3 vector 49
ioapic1: routing intpin 7 (PCI IRQ 23) to lapic 3 vector 51
ioapic1: routing intpin 8 (PCI IRQ 24) to lapic 3 vector 52
ioapic1: routing intpin 9 (PCI IRQ 25) to lapic 3 vector 53
ioapic0: routing intpin 14 (ISA IRQ 14) to lapic 3 vector 54
ioapic1: routing intpin 4 (PCI IRQ 20) to lapic 3 vector 55
ioapic1: routing intpin 5 (PCI IRQ 21) to lapic 3 vector 56
ioapic0: Changing trigger for pin 8 to level
ioapic0: Changing polarity for pin 8 to low
ioapic0: routing intpin 4 (ISA IRQ 4) to lapic 3 vector 57
ioapic0: routing intpin 3 (ISA IRQ 3) to lapic 3 vector 58
ioapic0: routing intpin 6 (ISA IRQ 6) to lapic 3 vector 59
ioapic0: routing intpin 1 (ISA IRQ 1) to lapic 3 vector 60
ioapic0: routing intpin 12 (ISA IRQ 12) to lapic 3 vector 61
lapic: Divisor 2, Frequency 66648108 Hz
Timecounter "TSC" frequency 999721588 Hz quality -100
Timecounters tick every 1.000 msec
...
SMP: AP CPU #1 Launched!
cpu1 AP:
     ID: 0x00000000 VER: 0x00040011 LDR: 0x00000000 DFR: 0xffffffff
  lint0: 0x00010700 lint1: 0x00000400 TPR: 0x00000000 SVR: 0x000001ff
  timer: 0x000200ef therm: 0x00000000 err: 0x000000f0 pmc: 0x00010400
ioapic0: routing intpin 3 (ISA IRQ 3) to lapic 0 vector 48
CPU1: local APIC error 0x80
flowtable cleaner started
ioapic0: routing intpin 6 (ISA IRQ 6) to lapic 0 vector 49
ioapic0: routing intpin 14 (ISA IRQ 14) to lapic 0 vector 50
ioapic1: routing intpin 4 (PCI IRQ 20) to lapic 0 vector 51
ioapic1: routing intpin 7 (PCI IRQ 23) to lapic 0 vector 52
ioapic1: routing intpin 9 (PCI IRQ 25) to lapic 0 vector 53
ioapic1: routing intpin 15 (PCI IRQ 31) to lapic 0 vector 54
kernel trap 12 with interrupts disabled

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 03
fault virtual address = 0xf000e2c3
fault code            = supervisor write, page not present
instruction pointer   = 0x20:0xc08e8e15
stack pointer         = 0x28:0xc1020c78
frame pointer         = 0x28:0xc1020c90
code segment          = base 0x0, limit 0xfffff, type 0x1b
                      = DPL 0, pres 1, def32 1, gran 1
processor eflags      = resume, IOPL = 0
current process       = 0 (swapper)
[thread pid 0 tid 100000 ]
Stopped at intr_execute_handlers+0x15: addl $0x1,0(%eax)
db> call doadump
Cannot dump. Device not defined or unavailable.
db> panic
panic: from debugger
cpuid = 0
KDB: stack backtrace:
db_trace_self_wrapper(c0984233,c04e4943,1,c098203e,c1020980,...) at db_trace_self_wrapper+0x26
kdb_backtrace(c09a2e37,0,c0958ccd,c10209cc,0,...) at kdb_backtrace+0x2a
panic(c0958ccd,c1020a90,c04e3881,c08e8e15,0,...) at panic+0x15c
db_panic(c08e8e15,0,ffffffff,c1020a08,1,...) at db_panic+0x17
db_command(c0958d7c,c1020af0,c04e592d,c09a132b,c08f2ee3,...) at db_command+0x381
db_command_loop(c09a132b,c08f2ee3,fb,0,0,...) at db_command_loop+0x5a
db_trap(c,0,1,246,2,...) at db_trap+0xdd
kdb_trap(c,0,c1020c38,1,1,...) at kdb_trap+0xa8
trap_fatal(c17dc000,f000e000,2,0,c,...) at trap_fatal+0x2df
trap_pfault(c09a3805,c,c1020bb8,c08ee8e0,c0a350a0,...)
at trap_pfault+0x2de
trap(c1020c38) at trap+0x3f3
calltrap() at calltrap+0x6
--- trap 0xc, eip = 0xc08e8e15, esp = 0xc1020c78, ebp = 0xc1020c90 ---
intr_execute_handlers(0,c1020cb4,3,c1020cf8,c08e4625,...) at intr_execute_handlers+0x15
lapic_handle_intr(36,c1020cb4) at lapic_handle_intr+0x4c
Xapic_isr1() at Xapic_isr1+0x35
--- interrupt, eip = 0xc08ee8fb, esp = 0xc1020cf4, ebp = 0xc1020cf8 ---
spinlock_exit(c09a1e2e,0,36,3,c1020d38,...) at spinlock_exit+0x2b
ioapic_assign_cpu(c4d1565c,0,0,0,c08f3d29,...) at ioapic_assign_cpu+0x2b0
intr_shuffle_irqs(0,101ec00,101ec00,101e000,1025000,...) at intr_shuffle_irqs+0xba
mi_startup() at mi_startup+0xac
begin() at begin+0x2c
---------------------

From running kernel (normal boot) using kgdb:

(kgdb) l *intr_execute_handlers+0x15
0xc08e8e15 is in intr_execute_handlers (/usr/src/sys/i386/i386/intr_machdep.c:234).
229      * We count software interrupts when we process them. The
230      * code here follows previous practice, but there's an
231      * argument for counting hardware interrupts when they're
232      * processed too.
233      */
234     (*isrc->is_count)++;
235     PCPU_INC(cnt.v_intr);
236
237     ie = isrc->is_event;
238
(kgdb) l *ioapic_assign_cpu+0x2b0
0xc08ea3f0 is in ioapic_assign_cpu (/usr/src/sys/i386/i386/io_apic.c:385).
380
381     /*
382      * Free the old vector after the new one is established. This is done
383      * to prevent races where we could miss an interrupt.
384      */
385     if (old_vector) {
386             if (isrc->is_handlers > 0)
387                     apic_disable_vector(old_id, old_vector);
388             apic_free_vector(old_id, old_vector, intpin->io_irq);
389     }
(kgdb) quit

Maybe there is an interrupt problem with this server, because $PIR is broken (has size 0):

PIRTOOL (c) 2002-2006 Bruce M.
Simpson
---------------------------------------------
PCI Interrupt Routing Table at 0x000FDF10
-----------------------------------------
0x00: Signature: $PIR
0x04: Version: 1.0
0x06: Size: 0 bytes (268435454 entries)
0x08: Device: 255:31:7
0x0a: PCI Exclusive IRQs: none
0x0c: Compatible with: 0x00000000 unknown chipset
0x10: Miniport Data: 0x00000000
0x1f: Checksum: 0x00

Entry Location Bus Device Pin Link IRQs

Otherwise the (shortened) output of mptable looks good:

===============================================
MPTable
-----------------------------------------------
MP Floating Pointer Structure:
  location: BIOS
  physical address: 0x000f6900
  signature: '_MP_'
  length: 16 bytes
  version: 1.4
  checksum: 0x42
  mode: Virtual Wire
------------------------------------------------
MP Config Table Header:
  physical address: 0x0009f560
  signature: 'PCMP'
  base table length: 292
  version: 1.4
  checksum: 0xbf
  OEM ID: 'INTEL '
  Product ID: 'STL2 '
  OEM table pointer: 0x00000000
  OEM table size: 0
  entry count: 28
  local APIC address: 0xfee00000
  extended table length: 260
  extended table checksum: 251
--------------------------------------------------
MP Config Base Table Entries:
--
Processors:  APIC ID  Version  State        Family  Model  Step  Flags
             3        0x11     BSP, usable  6       8      10    0x387fbff
             0        0x11     AP, usable   6       8      10    0x387fbff
--
Bus:  Bus ID  Type
      0       PCI
      1       PCI
      2       ISA
--
I/O APICs:  APIC ID  Version  State   Address
            4        0x11     usable  0xfec00000
            5        0x11     usable  0xfec01000
--
I/O Ints:  Type    Polarity   Trigger  Bus ID  IRQ   APIC ID  PIN#
           ExtINT  active-hi  edge     2       0     4        0
           INT     active-hi  edge     2       1     4        1
           INT     active-hi  edge     2       3     4        3
           INT     active-hi  edge     2       4     4        4
           INT     active-lo  level    0       6:A   5        10
           INT     active-hi  edge     2       6     4        6
           INT     active-hi  edge     2       7     4        7
           INT     active-hi  edge     2       8     4        8
           INT     active-lo  level    0       7:A   5        7
           INT     active-lo  level    0       3:A   5        2
           INT     active-lo  level    0       2:A   5        3
           INT     active-hi  edge     2       12    4        12
           INT     active-hi  edge     2       13    4        13
           INT     active-hi  edge     2       14    4        14
           INT     active-hi  edge     2       15    4        15
           INT     active-lo  level    0       8:A   5        8
           INT     active-lo  level    0       9:A   5        9
           INT     active-lo
level    1       10:A  5        4
           INT     active-lo  level    1       11:A  5        5
--
Local Ints:  Type    Polarity   Trigger  Bus ID  IRQ   APIC ID  PIN#
             ExtINT  active-hi  edge     2       0     255      0
             NMI     active-hi  edge     0       0:A   255      1

I can easily reproduce this problem; hints on ddb commands suitable for debugging it are welcome.

--
Andreas Longwitz
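
A note on the pirtool figure quoted above ("Size: 0 bytes (268435454 entries)"): the entry count is what a 32-bit unsigned subtraction yields when the table size is smaller than the header. The sketch below only reproduces that arithmetic. It assumes the usual $PIR layout of a 32-byte header (the dump shows the checksum at offset 0x1f, the last header byte) followed by 16-byte slot entries, and it assumes the tool computes (size - 32) / 16 in unsigned 32-bit arithmetic. That is a guess consistent with the printed number, not something checked against the pirtool source. The function names are made up for illustration.

```python
# Hypothetical helper names; the constants follow the common $PIR layout
# (32-byte header, 16-byte slot entries), which is an assumption here.
PIR_HEADER_SIZE = 32
PIR_ENTRY_SIZE = 16


def pir_entry_count_u32(table_size: int) -> int:
    """Entry count as unsigned 32-bit arithmetic would compute it.

    When table_size is smaller than the header, the subtraction wraps
    around modulo 2**32 instead of going negative.
    """
    wrapped = (table_size - PIR_HEADER_SIZE) & 0xFFFFFFFF
    return wrapped // PIR_ENTRY_SIZE


def pir_size_is_sane(table_size: int) -> bool:
    """True if the size covers the header plus a whole number of entries."""
    if table_size < PIR_HEADER_SIZE:
        return False
    return (table_size - PIR_HEADER_SIZE) % PIR_ENTRY_SIZE == 0


if __name__ == "__main__":
    # Size field as reported for this server.
    print(pir_entry_count_u32(0))
    print(pir_size_is_sane(0))
```

With a size of 0 the wrapped count matches the 268435454 entries printed above, which supports reading the number as a side effect of the zero size field rather than as real table content.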