Date:      Fri, 30 Nov 2012 17:14:33 +0200
From:      Andriy Gapon <avg@FreeBSD.org>
To:        Andreas Longwitz <longwitz@incore.de>
Cc:        freebsd-stable@FreeBSD.org
Subject:   Re: page fault on verbose boot
Message-ID:  <50B8CD59.1050308@FreeBSD.org>
In-Reply-To: <50ABE8BC.1010904@incore.de>
References:  <50ABE8BC.1010904@incore.de>

on 20/11/2012 22:31 Andreas Longwitz said the following:
> One of my servers hits a page fault (only) on a verbose boot. The
> backtrace looks a little like the one given in
> 
>  lists.freebsd.org/pipermail/freebsd-stable/2010-December/060704.html,
> 
> therefore I append the information requested there.
> 
> 
> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
>         The Regents of the University of California. All rights reserved.
> FreeBSD is a registered trademark of The FreeBSD Foundation.
> FreeBSD 8.3-STABLE #3: Mon Sep 24 11:29:54 CEST 2012
>     root@dsspbx1.incore:/usr/obj/usr/src/sys/SERVER i386
> Preloaded elf kernel "/boot/kernel/kernel" at 0xc0c41000.
> Preloaded elf module "/boot/modules/i4b.ko" at 0xc0c41188.
> Preloaded elf module "/boot/kernel/sppp.ko" at 0xc0c41234.
> Timecounter "i8254" frequency 1193182 Hz quality 0
> Calibrating TSC clock ... TSC clock: 999721588 Hz
> CPU: Intel Pentium III (999.72-MHz 686-class CPU)
>   Origin="GenuineIntel"  Id=0x68a  Family = 6  Model = 8  Stepping = 10
>   Features=0x387fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,PN,MMX,FXSR,SSE>
> Instruction TLB: 4 KB pages, 4-way set associative, 32 entries
> Instruction TLB: 4 MB pages, fully associative, 2 entries
> Data TLB: 4 KB pages, 4-way set associative, 64 entries
> 2nd-level cache: 256 KB, 8-way set associative, 32 byte line size
> 1st-level instruction cache: 16 KB, 4-way set associative, 32 byte line size
> Data TLB: 4 MB Pages, 4-way set associative, 8 entries
> 1st-level data cache: 16 KB, 4-way set associative, 32 byte line size
> real memory  = 1074790400 (1025 MB)
> Physical memory chunk(s):
> 0x0000000000001000 - 0x000000000009efff, 647168 bytes (158 pages)
> 0x0000000000100000 - 0x00000000003fffff, 3145728 bytes (768 pages)
> 0x0000000001026000 - 0x000000003eda5fff, 1037565952 bytes (253312 pages)
> avail memory = 1036435456 (988 MB)
> Table 'FACP' at 0x3ffffafa
> Table 'APIC' at 0x3ffffb6e
> APIC: Found table at 0x3ffffb6e
> MP Configuration Table version 1.4 found at 0xc009f560
> APIC: Using the MADT enumerator
> MADT: Found CPU APIC ID 0 ACPI ID 0: enabled
> SMP: Added CPU 0 (AP)
> MADT: Found CPU APIC ID 3 ACPI ID 1: enabled
> SMP: Added CPU 3 (AP)
> ACPI APIC Table: <INTEL  024B    >
> INTR: Adding local APIC 0 as a target
> FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
> FreeBSD/SMP: 2 package(s) x 1 core(s)
>  cpu0 (BSP): APIC ID:  3
>  cpu1 (AP): APIC ID:  0
> bios32: Found BIOS32 Service Directory header at 0xc00f6990
> bios32: Entry = 0xfd85e (c00fd85e)  Rev = 0  Len = 1
> pcibios: PCI BIOS entry at 0xfd7c0+0x397
> pnpbios: Found PnP BIOS data at 0xc00f69c0
> pnpbios: Entry = f0000:a934  Rev = 1.0
> Other BIOS signatures found:
> x86bios:   IVT 0x000000-0x0004ff at 0xc0000000
> x86bios:  SSEG 0x010000-0x01ffff at 0xc49c4000
> x86bios:  EBDA 0x09f000-0x09ffff at 0xc009f000
> x86bios:   ROM 0x0a0000-0x0effff at 0xc00a0000
> APIC: CPU 0 has ACPI ID 1
> APIC: CPU 1 has ACPI ID 0
> ULE: setup cpu 0
> ULE: setup cpu 1
> ACPI: RSDP 0xf6910 00014 (v00 INTEL )
> ACPI: RSDT 0x3fffa25c 00030 (v01 INTEL  024B     00000001 PTL  00000000)
> ACPI: FACP 0x3ffffafa 00074 (v01 INTEL  024B     00000001 PTL  00000000)
> ACPI: DSDT 0x3fffa28c 0586E (v01 INTEL  024B     00000001 MSFT 0100000A)
> ACPI: FACS 0x3fffffc0 00040
> ACPI: APIC 0x3ffffb6e 0006A (v01 INTEL  024B     00000001 PTL  00000000)
> ACPI: BOOT 0x3ffffbd8 00028 (v01 INTEL  024B     00000001 PTL  00000000)
> MADT: Found IO APIC ID 4, Interrupt 0 at 0xfec00000
> ioapic0: Routing external 8259A's -> intpin 0
> MADT: Found IO APIC ID 5, Interrupt 16 at 0xfec01000
> MADT: Interrupt override: source 9, irq 31
> ioapic0: intpin 9 disabled
> lapic0: Routing NMI -> LINT1
> lapic0: LINT1 trigger: edge
> lapic0: LINT1 polarity: high
> lapic3: Routing NMI -> LINT1
> lapic3: LINT1 trigger: edge
> lapic3: LINT1 polarity: high
> ioapic0 <Version 1.1> irqs 0-15 on motherboard
> ioapic1 <Version 1.1> irqs 16-31 on motherboard
> cpu0 BSP:
>      ID: 0x03000000   VER: 0x00040011 LDR: 0x00000000 DFR: 0xffffffff
>   lint0: 0x00010700 lint1: 0x00000400 TPR: 0x00000000 SVR: 0x000001ff
>   timer: 0x000100ef therm: 0x00000000 err: 0x000000f0 pmc: 0x00010400
> fslock: pseudo-device
> null: <null device, zero device>
> random: <entropy source, Software, Yarrow>
> io: <I/O>
> mem: <memory>
> Pentium Pro MTRR support enabled
> netsmb_dev: loaded
> CPU0: local APIC error 0x80
> acpi0: <INTEL 024B> on motherboard
> acpi0: Overriding SCI Interrupt from IRQ 9 to IRQ 31
> ioapic1: routing intpin 15 (PCI IRQ 31) to lapic 3 vector 48
> acpi0: [MPSAFE]
> acpi0: [ITHREAD]
> acpi0: Power Button (fixed)
> acpi0: wakeup code va 0xc49be000 pa 0x1000
> pci_open(1):    mode 1 addr port (0x0cf8) is 0x80015864
> pci_open(1a):   mode1res=0x80000000 (0x80000000)
> pci_cfgcheck:   device 0 [class=060000] [hdr=80] is there (id=00091166)
> pcibios: BIOS version 2.10
> acpi_timer0: <24-bit timer at 3.579545MHz> port 0x404-0x407 on acpi0
> cpu0: <ACPI CPU> on acpi0
> cpu0: switching to generic Cx mode
> cpu1: <ACPI CPU> on acpi0
> acpi_ec0: <Embedded Controller: GPE 0x4> port 0xca6,0xca7 on acpi0
> pci_link0:        Index  IRQ  Rtd  Ref  IRQs
>   Initial Probe       0  255   N     0  5 10
>   Validation          0  255   N     0  5 10
>   After Disable       0  255   N     0  5 10
> pci_link1:        Index  IRQ  Rtd  Ref  IRQs
>   Initial Probe       0   14   N     0  14
>   Validation          0   14   N     0  14
>   After Disable       0  255   N     0  14
> ...
> ioapic1: routing intpin 2 (PCI IRQ 18) to lapic 3 vector 49
> ioapic1: routing intpin 2 (PCI IRQ 18) to lapic 3 vector 49
> ioapic1: routing intpin 7 (PCI IRQ 23) to lapic 3 vector 51
> ioapic1: routing intpin 8 (PCI IRQ 24) to lapic 3 vector 52
> ioapic1: routing intpin 9 (PCI IRQ 25) to lapic 3 vector 53
> ioapic0: routing intpin 14 (ISA IRQ 14) to lapic 3 vector 54
> ioapic1: routing intpin 4 (PCI IRQ 20) to lapic 3 vector 55
> ioapic1: routing intpin 5 (PCI IRQ 21) to lapic 3 vector 56
> ioapic0: Changing trigger for pin 8 to level
> ioapic0: Changing polarity for pin 8 to low
> ioapic0: routing intpin 4 (ISA IRQ 4) to lapic 3 vector 57
> ioapic0: routing intpin 3 (ISA IRQ 3) to lapic 3 vector 58
> ioapic0: routing intpin 6 (ISA IRQ 6) to lapic 3 vector 59
> ioapic0: routing intpin 1 (ISA IRQ 1) to lapic 3 vector 60
> ioapic0: routing intpin 12 (ISA IRQ 12) to lapic 3 vector 61
> lapic: Divisor 2, Frequency 66648108 Hz
> Timecounter "TSC" frequency 999721588 Hz quality -100
> Timecounters tick every 1.000 msec
> ...
> SMP: AP CPU #1 Launched!
> cpu1 AP:
>      ID: 0x00000000   VER: 0x00040011 LDR: 0x00000000 DFR: 0xffffffff
>   lint0: 0x00010700 lint1: 0x00000400 TPR: 0x00000000 SVR: 0x000001ff
>   timer: 0x000200ef therm: 0x00000000 err: 0x000000f0 pmc: 0x00010400
> ioapic0: routing intpin 3 (ISA IRQ 3) to lapic 0 vector 48
> CPU1: local APIC error 0x80
> flowtable cleaner started
> ioapic0: routing intpin 6 (ISA IRQ 6) to lapic 0 vector 49
> ioapic0: routing intpin 14 (ISA IRQ 14) to lapic 0 vector 50
> ioapic1: routing intpin 4 (PCI IRQ 20) to lapic 0 vector 51
> ioapic1: routing intpin 7 (PCI IRQ 23) to lapic 0 vector 52
> ioapic1: routing intpin 9 (PCI IRQ 25) to lapic 0 vector 53
> ioapic1: routing intpin 15 (PCI IRQ 31) to lapic 0 vector 54
> kernel trap 12 with interrupts disabled
> 
> Fatal trap 12: page fault while in kernel mode
> cpuid = 0; apic id = 03
> fault virtual address   = 0xf000e2c3
> fault code              = supervisor write, page not present
> instruction pointer     = 0x20:0xc08e8e15
> stack pointer           = 0x28:0xc1020c78
> frame pointer           = 0x28:0xc1020c90
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, def32 1, gran 1
> processor eflags        = resume, IOPL = 0
> current process         = 0 (swapper)
> [thread pid 0 tid 100000 ]
> Stopped at      intr_execute_handlers+0x15:     addl    $0x1,0(%eax)
> db> call doadump
> Cannot dump. Device not defined or unavailable.
> db> panic
> panic: from debugger
> cpuid = 0
> KDB: stack backtrace:
> db_trace_self_wrapper(c0984233,c04e4943,1,c098203e,c1020980,...) at db_trace_self_wrapper+0x26
> kdb_backtrace(c09a2e37,0,c0958ccd,c10209cc,0,...) at kdb_backtrace+0x2a
> panic(c0958ccd,c1020a90,c04e3881,c08e8e15,0,...) at panic+0x15c
> db_panic(c08e8e15,0,ffffffff,c1020a08,1,...) at db_panic+0x17
> db_command(c0958d7c,c1020af0,c04e592d,c09a132b,c08f2ee3,...) at db_command+0x381
> db_command_loop(c09a132b,c08f2ee3,fb,0,0,...) at db_command_loop+0x5a
> db_trap(c,0,1,246,2,...) at db_trap+0xdd
> kdb_trap(c,0,c1020c38,1,1,...) at kdb_trap+0xa8
> trap_fatal(c17dc000,f000e000,2,0,c,...) at trap_fatal+0x2df
> trap_pfault(c09a3805,c,c1020bb8,c08ee8e0,c0a350a0,...) at trap_pfault+0x2de
> trap(c1020c38) at trap+0x3f3
> calltrap() at calltrap+0x6
> --- trap 0xc, eip = 0xc08e8e15, esp = 0xc1020c78, ebp = 0xc1020c90 ---
> intr_execute_handlers(0,c1020cb4,3,c1020cf8,c08e4625,...) at intr_execute_handlers+0x15
> lapic_handle_intr(36,c1020cb4) at lapic_handle_intr+0x4c
> Xapic_isr1() at Xapic_isr1+0x35
> --- interrupt, eip = 0xc08ee8fb, esp = 0xc1020cf4, ebp = 0xc1020cf8 ---
> spinlock_exit(c09a1e2e,0,36,3,c1020d38,...) at spinlock_exit+0x2b
> ioapic_assign_cpu(c4d1565c,0,0,0,c08f3d29,...) at ioapic_assign_cpu+0x2b0
> intr_shuffle_irqs(0,101ec00,101ec00,101e000,1025000,...) at intr_shuffle_irqs+0xba
> mi_startup() at mi_startup+0xac
> begin() at begin+0x2c
> 
> ---------------------
> 
> From running kernel (normal boot) using kgdb:
> 
> (kgdb) l *intr_execute_handlers+0x15
> 0xc08e8e15 is in intr_execute_handlers (/usr/src/sys/i386/i386/intr_machdep.c:234).
> 229            * We count software interrupts when we process them.  The
> 230            * code here follows previous practice, but there's an
> 231            * argument for counting hardware interrupts when they're
> 232            * processed too.
> 233            */
> 234           (*isrc->is_count)++;
> 235           PCPU_INC(cnt.v_intr);
> 236
> 237           ie = isrc->is_event;
> 238
> (kgdb) l *ioapic_assign_cpu+0x2b0
> 0xc08ea3f0 is in ioapic_assign_cpu (/usr/src/sys/i386/i386/io_apic.c:385).
> 380
> 381           /*
> 382            * Free the old vector after the new one is established.  This is done
> 383            * to prevent races where we could miss an interrupt.
> 384            */
> 385            if (old_vector) {
> 386                if (isrc->is_handlers > 0)
> 387                        apic_disable_vector(old_id, old_vector);
> 388                apic_free_vector(old_id, old_vector, intpin->io_irq);
> 389            }
> (kgdb) quit
[snip]
> I can easily reproduce this problem; hints for ddb commands suitable for
> debugging are welcome.
> 

Could you please execute the following commands?

In kgdb (if you have exactly the same kernel; otherwise, use the new offset
from a new panic):
disassemble intr_execute_handlers+0x15

In ddb:
bt
show apic
show idt
show intrcnt
show lapic
x/ax interrupt_sources,32
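
When comparing the ddb output with the panic message, a couple of numbers
can be cross-checked by hand. The following is only a minimal Python sketch
of that arithmetic, not a diagnosis. All values are copied from the quoted
trap output and routing messages; the helper names (page_base,
sources_for_vector) are made up for illustration; and it assumes that ddb
prints backtrace arguments in hexadecimal (as the "c" passed to db_trap for
trap 12 suggests).

```python
# Values copied from the quoted panic output; nothing here talks to a kernel.
PAGE_SIZE = 4096         # i386 base page size (4 KB pages, per the TLB lines)

fault_va = 0xF000E2C3    # "fault virtual address" from the trap message
vector = 0x36            # first argument of lapic_handle_intr(36, ...),
                         # assumed to be printed in hex by ddb

# Routing messages from the verbose boot log, keyed by (lapic id, vector).
routing = {
    (3, 54): "ioapic0 intpin 14 (ISA IRQ 14)",
    (0, 50): "ioapic0 intpin 14 (ISA IRQ 14)",
    (0, 54): "ioapic1 intpin 15 (PCI IRQ 31)",
}


def page_base(va, page_size=PAGE_SIZE):
    """Round a virtual address down to the start of its page."""
    return va & ~(page_size - 1)


def sources_for_vector(vec):
    """Return {lapic id: source} for every logged route using this vector."""
    return {lapic: src for (lapic, v), src in routing.items() if v == vec}


if __name__ == "__main__":
    print("page base:", hex(page_base(fault_va)))
    print("vector (decimal):", vector)
    for lapic, src in sorted(sources_for_vector(vector).items()):
        print("lapic", lapic, "vector", vector, "<-", src)
```

The page base equals the second argument shown for trap_fatal (f000e000),
and vector 54 appears in the routing messages for both lapic 3 and lapic 0,
which is why the "show idt" and "show apic" output for both CPUs is of
interest.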

Thank you.

-- 
Andriy Gapon


