From: Andriy Gapon
Date: Fri, 30 Nov 2012 17:14:33 +0200
To: Andreas Longwitz
Cc: freebsd-stable@FreeBSD.org
Subject: Re: page fault on verbose boot
Message-ID: <50B8CD59.1050308@FreeBSD.org>
In-Reply-To: <50ABE8BC.1010904@incore.de>

on 20/11/2012 22:31 Andreas Longwitz said the following:
> One of my servers goes to page fault (only) on verbose boot. The
> backtrace looks a little like the one given in
>
> lists.freebsd.org/pipermail/freebsd-stable/2010-December/060704.html,
>
> therefore I append the information requested there.
>
>
> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
> The Regents of the University of California. All rights reserved.
> FreeBSD is a registered trademark of The FreeBSD Foundation.
> FreeBSD 8.3-STABLE #3: Mon Sep 24 11:29:54 CEST 2012
> root@dsspbx1.incore:/usr/obj/usr/src/sys/SERVER i386
> Preloaded elf kernel "/boot/kernel/kernel" at 0xc0c41000.
> Preloaded elf module "/boot/modules/i4b.ko" at 0xc0c41188.
> Preloaded elf module "/boot/kernel/sppp.ko" at 0xc0c41234.
> Timecounter "i8254" frequency 1193182 Hz quality 0
> Calibrating TSC clock ... TSC clock: 999721588 Hz
> CPU: Intel Pentium III (999.72-MHz 686-class CPU)
> Origin="GenuineIntel" Id=0x68a Family = 6 Model = 8 Stepping = 10
> Features=0x387fbff FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR
> ,PGE,MCA,CMOV,PAT,PSE36,PN,MMX,FXSR,SSE
> Instruction TLB: 4 KB pages, 4-way set associative, 32 entries
> Instruction TLB: 4 MB pages, fully associative, 2 entries
> Data TLB: 4 KB pages, 4-way set associative, 64 entries
> 2nd-level cache: 256 KB, 8-way set associative, 32 byte line size
> 1st-level instruction cache: 16 KB, 4-way set associative, 32 byte line size
> Data TLB: 4 MB Pages, 4-way set associative, 8 entries
> 1st-level data cache: 16 KB, 4-way set associative, 32 byte line size
> real memory = 1074790400 (1025 MB)
> Physical memory chunk(s):
> 0x0000000000001000 - 0x000000000009efff, 647168 bytes (158 pages)
> 0x0000000000100000 - 0x00000000003fffff, 3145728 bytes (768 pages)
> 0x0000000001026000 - 0x000000003eda5fff, 1037565952 bytes (253312 pages)
> avail memory = 1036435456 (988 MB)
> Table 'FACP' at 0x3ffffafa
> Table 'APIC' at 0x3ffffb6e
> APIC: Found table at 0x3ffffb6e
> MP Configuration Table version 1.4 found at 0xc009f560
> APIC: Using the MADT enumerator
> MADT: Found CPU APIC ID 0 ACPI ID 0: enabled
> SMP: Added CPU 0 (AP)
> MADT: Found CPU APIC ID 3 ACPI ID 1: enabled
> SMP: Added CPU 3 (AP)
> ACPI APIC Table:
> INTR: Adding local APIC 0 as a target
> FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
> FreeBSD/SMP: 2 package(s) x 1 core(s)
> cpu0 (BSP): APIC ID: 3
> cpu1 (AP): APIC ID: 0
> bios32: Found BIOS32 Service Directory header at 0xc00f6990
> bios32: Entry = 0xfd85e (c00fd85e) Rev = 0 Len = 1
> pcibios: PCI BIOS entry at 0xfd7c0+0x397
> pnpbios: Found PnP BIOS data at 0xc00f69c0
> pnpbios: Entry = f0000:a934 Rev = 1.0
> Other BIOS signatures found:
> x86bios: IVT 0x000000-0x0004ff at 0xc0000000
> x86bios: SSEG 0x010000-0x01ffff at 0xc49c4000
> x86bios: EBDA 0x09f000-0x09ffff at 0xc009f000
> x86bios: ROM 0x0a0000-0x0effff at 0xc00a0000
> APIC: CPU 0 has ACPI ID 1
> APIC: CPU 1 has ACPI ID 0.
> ULE: setup cpu 0
> ULE: setup cpu 1
> ACPI: RSDP 0xf6910 00014 (v00 INTEL )
> ACPI: RSDT 0x3fffa25c 00030 (v01 INTEL 024B 00000001 PTL 00000000)
> ACPI: FACP 0x3ffffafa 00074 (v01 INTEL 024B 00000001 PTL 00000000)
> ACPI: DSDT 0x3fffa28c 0586E (v01 INTEL 024B 00000001 MSFT 0100000A)
> ACPI: FACS 0x3fffffc0 00040
> ACPI: APIC 0x3ffffb6e 0006A (v01 INTEL 024B 00000001 PTL 00000000)
> ACPI: BOOT 0x3ffffbd8 00028 (v01 INTEL 024B 00000001 PTL 00000000)
> MADT: Found IO APIC ID 4, Interrupt 0 at 0xfec00000
> ioapic0: Routing external 8259A's -> intpin 0
> MADT: Found IO APIC ID 5, Interrupt 16 at 0xfec01000
> MADT: Interrupt override: source 9, irq 31
> ioapic0: intpin 9 disabled
> lapic0: Routing NMI -> LINT1
> lapic0: LINT1 trigger: edge
> lapic0: LINT1 polarity: high
> lapic3: Routing NMI -> LINT1
> lapic3: LINT1 trigger: edge
> lapic3: LINT1 polarity: high
> ioapic0 irqs 0-15 on motherboard
> ioapic1 irqs 16-31 on motherboard
> cpu0 BSP:
> ID: 0x03000000 VER: 0x00040011 LDR: 0x00000000 DFR: 0xffffffff
> lint0: 0x00010700 lint1: 0x00000400 TPR: 0x00000000 SVR: 0x000001ff
> timer: 0x000100ef therm: 0x00000000 err: 0x000000f0 pmc: 0x00010400
> fslock: pseudo-device
> null:
> random:
> io:
> mem:
> Pentium Pro MTRR support enabled
> netsmb_dev: loaded
> CPU0: local APIC error 0x80
> acpi0: on motherboard
> acpi0: Overriding SCI Interrupt from IRQ 9 to IRQ 31
> ioapic1: routing intpin 15 (PCI IRQ 31) to lapic 3 vector 48
> acpi0: [MPSAFE]
> acpi0: [ITHREAD]
> acpi0: Power Button (fixed)
> acpi0: wakeup code va 0xc49be000 pa 0x1000
> pci_open(1): mode 1 addr port (0x0cf8) is 0x80015864
> pci_open(1a): mode1res=0x80000000 (0x80000000)
> pci_cfgcheck: device 0 [class=060000] [hdr=80] is there (id=00091166)
> pcibios: BIOS version 2.10
> acpi_timer0: <24-bit timer at 3.579545MHz> port 0x404-0x407 on acpi0
> cpu0: on acpi0
> cpu0: switching to generic Cx mode
> cpu1: on acpi0
> acpi_ec0: port 0xca6,0xca7 on acpi0
> pci_link0: Index IRQ Rtd Ref IRQs
> Initial Probe 0 255 N 0 5 10
> Validation 0 255 N 0 5 10
> After Disable 0 255 N 0 5 10
> pci_link1: Index IRQ Rtd Ref IRQs
> Initial Probe 0 14 N 0 14
> Validation 0 14 N 0 14
> After Disable 0 255 N 0 14
> ...
> ioapic1: routing intpin 2 (PCI IRQ 18) to lapic 3 vector 49
> ioapic1: routing intpin 2 (PCI IRQ 18) to lapic 3 vector 49
> ioapic1: routing intpin 7 (PCI IRQ 23) to lapic 3 vector 51
> ioapic1: routing intpin 8 (PCI IRQ 24) to lapic 3 vector 52
> ioapic1: routing intpin 9 (PCI IRQ 25) to lapic 3 vector 53
> ioapic0: routing intpin 14 (ISA IRQ 14) to lapic 3 vector 54
> ioapic1: routing intpin 4 (PCI IRQ 20) to lapic 3 vector 55
> ioapic1: routing intpin 5 (PCI IRQ 21) to lapic 3 vector 56
> ioapic0: Changing trigger for pin 8 to level
> ioapic0: Changing polarity for pin 8 to low
> ioapic0: routing intpin 4 (ISA IRQ 4) to lapic 3 vector 57
> ioapic0: routing intpin 3 (ISA IRQ 3) to lapic 3 vector 58
> ioapic0: routing intpin 6 (ISA IRQ 6) to lapic 3 vector 59
> ioapic0: routing intpin 1 (ISA IRQ 1) to lapic 3 vector 60
> ioapic0: routing intpin 12 (ISA IRQ 12) to lapic 3 vector 61
> lapic: Divisor 2, Frequency 66648108 Hz
> Timecounter "TSC" frequency 999721588 Hz quality -100
> Timecounters tick every 1.000 msec
> ...
> SMP: AP CPU #1 Launched!
> cpu1 AP:
> ID: 0x00000000 VER: 0x00040011 LDR: 0x00000000 DFR: 0xffffffff
> lint0: 0x00010700 lint1: 0x00000400 TPR: 0x00000000 SVR: 0x000001ff
> timer: 0x000200ef therm: 0x00000000 err: 0x000000f0 pmc: 0x00010400
> ioapic0: routing intpin 3 ( ISA IRQ 3) to lapic 0 vector 48
> CPU1: local APIC error 0x80
> flowtable cleaner started
> ioapic0: routing intpin 6 (ISA IRQ 6) to lapic 0 vector 49
> ioapic0: routing intpin 14 (ISA IRQ 14) to lapic 0 vector 50
> ioapic1: routing intpin 4 (PCI IRQ 20) to lapic 0 vector 51
> ioapic1: routing intpin 7 (PCI IRQ 23) to lapic 0 vector 52
> ioapic1: routing intpin 9 (PCI IRQ 25) to lapic 0 vector 53
> ioapic1: routing intpin 15 (PCI IRQ 31) to lapic 0 vector 54
> kernel trap 12 with interrupts disabled
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 0; apic id = 03
> fault virtual address = 0xf000e2c3
> fault code = supervisor write, page not present
> instruction pointer = 0x20:0xc08e8e15
> stack pointer = 0x28:0xc1020c78
> frame pointer = 0x28:0xc1020c90
> code segment = base 0x0, limit 0xfffff, type 0x1b
> = DPL 0, pres 1, def32 1, gran 1
> processor eflags = resume, IOPL = 0
> current process = 0 (swapper)
> [thread pid 0 tid 100000 ]
> Stopped at intr_execute_handlers+0x15: addl $0x1,0(%eax)
> db> call doadump
> Cannot dump. Device not defined or unavailable.
> db> panic
> panic: from debugger
> cpuid = 0
> KDB: stack backtrace:
> db_trace_self_wrapper(c0984233,c04e4943,1,c098203e,c1020980,...) at
> db_trace_self_wrapper+0x26
> kdb_backtrace(c09a2e37,0,c0958ccd,c10209cc,0,...) at kdb_backtrace+0x2a
> panic(c0958ccd,c1020a90,c04e3881,c08e8e15,0,...) at panic+0x15c
> db_panic(c08e8e15,0,ffffffff,c1020a08,1,...) at db_panic+0x17
> db_command(c0958d7c,c1020af0,c04e592d,c09a132b,c08f2ee3,...) at
> db_command+0x381
> db_command_loop(c09a132b,c08f2ee3,fb,0,0,...) at db_command_loop+0x5a
> db_trap(c,0,1,246,2,...) at db_trap+0xdd
> kdb_trap(c,0,c1020c38,1,1,...) at kdb_trap+0xa8
> trap_fatal(c17dc000,f000e000,2,0,c,...) at trap_fatal+0x2df
> trap_pfault(c09a3805,c,c1020bb8,c08ee8e0,c0a350a0,...) at trap_pfault+0x2de
> trap(c1020c38) at trap+0x3f3
> calltrap() at calltrap+0x6
> --- trap 0xc, eip = 0xc08e8e15, esp = 0xc1020c78, ebp = 0xc1020c90 ---
> intr_execute_handlers(0,c1020cb4,3,c1020cf8,c08e4625,...) at
> intr_execute_handlers+0x15
> lapic_handle_intr(36,c1020cb4) at lapic_handle_intr+0x4c
> Xapic_isr1() at Xapic_isr1+0x35
> --- interrupt, eip = 0xc08ee8fb, esp = 0xc1020cf4, ebp = 0xc1020cf8 ---
> spinlock_exit(c09a1e2e,0,36,3,c1020d38,...) at spinlock_exit+0x2b
> ioapic_assign_cpu(c4d1565c,0,0,0,c08f3d29,...) at ioapic_assign_cpu+0x2b0
> intr_shuffle_irqs(0,101ec00,101ec00,101e000,1025000,...) at
> intr_shuffle_irqs+0xba
> mi_startup() at mi_startup+0xac
> begin() at begin+0x2c
>
> ---------------------
>
> From running kernel (normal boot) using kgdb:
>
> (kgdb) l *intr_execute_handlers+0x15
> 0xc08e8e15 is in intr_execute_handlers
> (/usr/src/sys/i386/i386/intr_machdep.c:234).
> 229 * We count software interrupts when we process them. The
> 230 * code here follows previous practice, but there's an
> 231 * argument for counting hardware interrupts when they're
> 232 * processed too.
> 233 */
> 234 (*isrc->is_count)++;
> 235 PCPU_INC(cnt.v_intr);
> 236
> 237 ie = isrc->is_event;
> 238
> (kgdb) l *ioapic_assign_cpu+0x2b0
> 0xc08ea3f0 is in ioapic_assign_cpu (/usr/src/sys/i386/i386/io_apic.c:385).
> 380
> 381 /*
> 382 * Free the old vector after the new one is established.
> This is done
> 383 * to prevent races where we could miss an interrupt.
> 384 */
> 385 if (old_vector) {
> 386 if (isrc->is_handlers > 0)
> 387 apic_disable_vector(old_id, old_vector);
> 388 apic_free_vector(old_id, old_vector, intpin->io_irq);
> 389 }
> (kgdb) quit

[snip]

> I can easy reproduce this problem, hints for ddb commands suitable for
> debugging are welcome.
>

Could you please execute the following commands?
In kgdb (if you have exactly the same kernel, or otherwise with a new offset
from a new panic):
disassemble intr_execute_handlers+0x15

In ddb:
bt
show apic
show idt
show intrcnt
show lapic
x/ax interrupt_sources,32

Thank you.
-- 
Andriy Gapon
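
P.S. On the "new offset from a new panic" remark: ddb normally prints the symbolic form itself (the "Stopped at intr_execute_handlers+0x15" line), but if only the raw instruction pointer is at hand, the offset is plain address arithmetic. The following is a minimal sketch of that arithmetic, not an existing tool. The helper names are hypothetical, and the function start address is an assumption derived from this report (0xc08e8e15 minus 0x15); for a rebuilt kernel it would have to be looked up again.

```python
import re

# Matches the trap line "instruction pointer = 0x20:0xc08e8e15":
# a segment selector, a colon, then the address of the faulting instruction.
IP_RE = re.compile(r"instruction pointer\s*=\s*0x[0-9a-fA-F]+:(0x[0-9a-fA-F]+)")


def trap_instruction_pointer(panic_text):
    """Return the faulting instruction address from a 'Fatal trap' block."""
    match = IP_RE.search(panic_text)
    if match is None:
        raise ValueError("no 'instruction pointer' line found")
    return int(match.group(1), 16)


def symbol_plus_offset(name, func_start, ip):
    """Format ip as 'name+0xNN' relative to the function's start address."""
    if ip < func_start:
        raise ValueError("instruction pointer lies before the function start")
    return f"{name}+{ip - func_start:#x}"


# Two lines copied from the panic report quoted above.
panic_text = """
fault virtual address = 0xf000e2c3
instruction pointer = 0x20:0xc08e8e15
"""

# Assumption: start of intr_execute_handlers in this particular kernel,
# i.e. 0xc08e8e15 - 0x15. A different build needs a fresh lookup.
INTR_EXECUTE_HANDLERS_START = 0xC08E8E00

ip = trap_instruction_pointer(panic_text)
print(symbol_plus_offset("intr_execute_handlers", INTR_EXECUTE_HANDLERS_START, ip))
```

The printed expression is what would be passed to "disassemble" or "l *" in kgdb.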