Date: Fri, 7 Jan 2011 17:22:05 -0500
From: Mark Saad <nonesuch@longcount.org>
To: Garrett Cooper <gcooper@freebsd.org>
Cc: freebsd-hackers@freebsd.org
Subject: Re: With out ddb and kdb set 7.3-RELEASE amd64 does not boot.
Message-ID: <AANLkTimHOBRvhpXCKZo-ddYHwaZ1M%2B2T9AAXNJ81LdR0@mail.gmail.com>
In-Reply-To: <AANLkTinKew-RjN_026TpO%2BsXjXHt%2BAGNxqjAPyhfOsf8@mail.gmail.com>
References: <AANLkTikEmdDMsxRp8fUPOw=mXnL4TMNJ8zCkVcdvk7m0@mail.gmail.com> <AANLkTinKew-RjN_026TpO%2BsXjXHt%2BAGNxqjAPyhfOsf8@mail.gmail.com>

On Fri, Jan 7, 2011 at 4:56 PM, Garrett Cooper <gcooper@freebsd.org> wrote:
> On Fri, Jan 7, 2011 at 1:20 PM, Mark Saad <nonesuch@longcount.org> wrote:
>> Hello hackers@,
>>  I have a good question that I can't find an answer for. I believe I
>> found a kernel bug in 7.3-RELEASE that prevents me from booting 64-bit
>> kernels on HP's DL360 G4p. The kernel dies with "Fatal trap 12: page
>> fault while in kernel mode". The hardware works fine in 7.2-RELEASE
>> amd64, 7.1-RELEASE amd64, and 6.4-RELEASE amd64.
>>
>> In 7.3-RELEASE amd64 I cannot boot from CD or PXE correctly using the
>> stock 7.3-RELEASE amd64 kernel; however, i386 works fine. To see if this
>> issue was somehow fixed in 7.3-RELEASE-p4 amd64, I rebuilt a GENERIC
>> kernel using patched sources and tried to boot, and I got the same
>> crash.
>>
>>  Next I rebuilt the kernel with KDB and DDB to see if I could get a
>> core dump of the system. I also set loader.conf to
>>
>> kernel="kernel.DEBUG"
>> kern.dumpdev="/dev/da0s1b"
>>
>> Next I PXE-booted the box and the system does not crash on boot up; it
>> will easily load an NFS root and work fine. So I copied my debug
>> kernel and loader.conf to the local disk and rebooted, and it boots
>> fine from the local disk.
>>
>> Rebooting the server and running off the local disks and debug kernel,
>> I can't find any issues.
>>
>> Reboot the box into a GENERIC 7.3-RELEASE-p4 kernel and it crashes
>>
>> With this error
>>
>> Fatal trap 12: page fault while in kernel mode
>> cpuid = 0; apic id = 00
>> fault virtual address   = 0x0
>> fault code              = supervisor write data, page not present
>> instruction pointer     = 0x8:0xffffffff800070fa
>> stack pointer           = 0x10:0xffffffff8153cbe0
>> frame pointer           = 0x10:0xffffffff8153cc50
>> code segment            = base 0x0, limit 0xfffff, type 0x1b
>>                         = DPL 0, pres 1, long 1, def32 0, gran 1
>> processor eflags        = interrupt enabled, resume, IOPL = 0
>> current process         = 0 (swapper)
>> [thread pid 0 tid 100000 ]
>> Stopped at      bzero+0xa:     repe stosq      %es:(%rdi)
>>
>>
>> What do I do, has anyone else seen anything like this?
>
>    What are the messages before that on the kernel console and what
> are your drivers loaded on a stable system?
> Thanks,
> -Garrett
>

Garrett
 The last 4 lines of the verbose boot up of the generic kernel are all
from sio1

sio1: port may not be enabled
sio1: irq maps: 0 0 0 0
sio1: prob failed tests(s): 4
sio1 at port 0x2f8-0x2ff irq 3 on isa0

Then the crash.

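For reference, the trap quoted above reports a supervisor write to
virtual address 0x0 while stopped at bzero+0xa, which is consistent with
bzero being handed a NULL destination pointer. Below is a minimal
sketch, in Python, of pulling those fields out of the panic text. The
helper names parse_trap and looks_like_null_write are made up for
illustration, and the "fault inside the first page" check is only a
heuristic, not something the kernel itself reports.

```python
import re

# Trap text copied from the panic above (quote markers removed).
TRAP_TEXT = """\
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x0
fault code              = supervisor write data, page not present
instruction pointer     = 0x8:0xffffffff800070fa
stack pointer           = 0x10:0xffffffff8153cbe0
frame pointer           = 0x10:0xffffffff8153cc50
current process         = 0 (swapper)
Stopped at      bzero+0xa:     repe stosq      %es:(%rdi)
"""


def parse_trap(text):
    """Collect 'name = value' fields and the 'Stopped at' symbol (hypothetical helper)."""
    fields = {}
    for line in text.splitlines():
        stopped = re.match(r"Stopped at\s+(\S+):", line)
        if stopped:
            fields["stopped at"] = stopped.group(1)
            continue
        # Skip the 'cpuid = 0; apic id = 00' line, which packs two fields.
        if " = " in line and ";" not in line:
            name, _, value = line.partition("=")
            fields[name.strip()] = value.strip()
    return fields


def looks_like_null_write(fields, page_size=4096):
    """Heuristic (assumption): a write fault in the first page suggests a NULL pointer."""
    addr = int(fields["fault virtual address"], 16)
    return addr < page_size and "write" in fields["fault code"]


if __name__ == "__main__":
    trap = parse_trap(TRAP_TEXT)
    print(trap["stopped at"], trap["instruction pointer"],
          looks_like_null_write(trap))
```

The instruction pointer it extracts (0xffffffff800070fa) is the address
one would look up in a kernel built with debug symbols to find the
caller that passed the bad pointer.
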
No extra kernel modules are loaded

Here is my pciconf

hostb0@pci0:0:0:0:	class=0x060000 card=0x32000e11 chip=0x35908086 rev=0x0c hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'E7520 Server Memory Controller Hub'
    class      = bridge
    subclass   = HOST-PCI
pcib1@pci0:0:2:0:	class=0x060400 card=0x00000000 chip=0x35958086 rev=0x0c hdr=0x01
    vendor     = 'Intel Corporation'
    device     = 'E752x Memory Controller Hub PCIe Port A0'
    class      = bridge
    subclass   = PCI-PCI
pcib2@pci0:0:4:0:	class=0x060400 card=0x00000000 chip=0x35978086 rev=0x0c hdr=0x01
    vendor     = 'Intel Corporation'
    device     = 'E752x Memory Controller Hub PCIe Port B0'
    class      = bridge
    subclass   = PCI-PCI
pcib5@pci0:0:6:0:	class=0x060400 card=0x00000000 chip=0x35998086 rev=0x0c hdr=0x01
    vendor     = 'Intel Corporation'
    device     = 'E752x Memory Controller Hub PCIe Port C0'
    class      = bridge
    subclass   = PCI-PCI
pcib6@pci0:0:28:0:	class=0x060400 card=0x00000000 chip=0x25ae8086 rev=0x02 hdr=0x01
    vendor     = 'Intel Corporation'
    device     = 'Hub Interface to PCI-X Bridge (6300ESB)'
    class      = bridge
    subclass   = PCI-PCI
uhci0@pci0:0:29:0:	class=0x0c0300 card=0x32010e11 chip=0x25a98086 rev=0x02 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'USB 1.1 UHCI Controller *1 (6300ESB)'
    class      = serial bus
    subclass   = USB
uhci1@pci0:0:29:1:	class=0x0c0300 card=0x32010e11 chip=0x25aa8086 rev=0x02 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'USB 1.1 UHCI Controller *2 (6300ESB)'
    class      = serial bus
    subclass   = USB
none0@pci0:0:29:4:	class=0x088000 card=0x32010e11 chip=0x25ab8086 rev=0x02 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Watchdog Timer (6300ESB)'
    class      = base peripheral
ioapic0@pci0:0:29:5:	class=0x080020 card=0x32010e11 chip=0x25ac8086 rev=0x02 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '6300ESB I/O Advanced Programmable Interrupt Controller'
    class      = base peripheral
    subclass   = interrupt controller
ehci0@pci0:0:29:7:	class=0x0c0320 card=0x32010e11 chip=0x25ad8086 rev=0x02 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'USB 2.0 EHCI Controller (6300ESB)'
    class      = serial bus
    subclass   = USB
pcib7@pci0:0:30:0:	class=0x060400 card=0x00000000 chip=0x244e8086 rev=0x0a hdr=0x01
    vendor     = 'Intel Corporation'
    device     = '82801 Family (ICH2/3/4/5/6/7/8/9,63xxESB) Hub Interface to PCI Bridge'
    class      = bridge
    subclass   = PCI-PCI
isab0@pci0:0:31:0:	class=0x060100 card=0x00000000 chip=0x25a18086 rev=0x02 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '6300ESB LPC Inteface Controller'
    class      = bridge
    subclass   = PCI-ISA
atapci0@pci0:0:31:1:	class=0x01018a card=0x32010e11 chip=0x25a28086 rev=0x02 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'IDE Controller (6300ESB)'
    class      = mass storage
    subclass   = ATA
pcib3@pci0:6:0:0:	class=0x060400 card=0x00000000 chip=0x03298086 rev=0x09 hdr=0x01
    vendor     = 'Intel Corporation'
    device     = 'PCI Express-to-PCI Express Bridge A (6700PXH)'
    class      = bridge
    subclass   = PCI-PCI
pcib4@pci0:6:0:2:	class=0x060400 card=0x00000000 chip=0x032a8086 rev=0x09 hdr=0x01
    vendor     = 'Intel Corporation'
    device     = 'PCI Express-to-PCI Express Bridge B (6700PXH)'
    class      = bridge
    subclass   = PCI-PCI
ciss0@pci0:2:1:0:	class=0x010400 card=0x40910e11 chip=0x00460e11 rev=0x01 hdr=0x00
    vendor     = 'Compaq Computer Corp (Now owned by Hewlett-Packard)'
    device     = 'Smart Array 64xx/6i Controller'
    class      = mass storage
    subclass   = RAID
bge0@pci0:2:2:0:	class=0x020000 card=0x00d00e11 chip=0x164814e4 rev=0x10 hdr=0x00
    vendor     = 'Broadcom Corporation'
    device     = 'NetXtreme Dual Gigabit Adapter (BCM5704)'
    class      = network
    subclass   = ethernet
bge1@pci0:2:2:1:	class=0x020000 card=0x00d00e11 chip=0x164814e4 rev=0x10 hdr=0x00
    vendor     = 'Broadcom Corporation'
    device     = 'NetXtreme Dual Gigabit Adapter (BCM5704)'
    class      = network
    subclass   = ethernet
vgapci0@pci0:1:3:0:	class=0x030000 card=0x001e0e11 chip=0x47521002 rev=0x27 hdr=0x00
    vendor     = 'ATI Technologies Inc. / Advanced Micro Devices, Inc.'
    device     = 'ATI On-Board VGA for HP Proliant 350 G3 (Rage XL PCI)'
    class      = display
    subclass   = VGA
none1@pci0:1:4:0:	class=0x088000 card=0xb2060e11 chip=0xb2030e11 rev=0x01 hdr=0x00
    vendor     = 'Compaq Computer Corp (Now owned by Hewlett-Packard)'
    device     = 'Integrated Lights Out Processor (iLo)'
    class      = base peripheral
none2@pci0:1:4:2:	class=0x088000 card=0xb2060e11 chip=0xb2040e11 rev=0x01 hdr=0x00
    vendor     = 'Compaq Computer Corp (Now owned by Hewlett-Packard)'
    device     = 'Integrated Lights Out Processor (iLo)'
    class      = base peripheral

and here is my /var/run/dmesg.boot

Copyright (c) 1992-2010 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
	The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 7.3-RELEASE-p4 #1: Fri Jan 7 18:24:07 UTC 2011
    root@about-bsd:/usr/obj/usr/src/sys/DEBUG amd64
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Xeon(TM) CPU 3.60GHz (3600.15-MHz K8-class CPU)
  Origin = "GenuineIntel"  Id = 0xf43  Stepping = 3
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x659d<SSE3,DTES64,MON,DS_CPL,EST,TM2,CNXT-ID,CX16,xTPR>
  AMD Features=0x20000800<SYSCALL,LM>
  TSC: P-state invariant
  Logical CPUs per core: 2
usable memory = 4214386688 (4019 MB)
avail memory  = 4051181568 (3863 MB)
ACPI APIC Table: <HP 00000083>
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
 cpu0 (BSP): APIC ID: 0
 cpu1 (AP/HT): APIC ID: 1
 cpu2 (AP): APIC ID: 6
 cpu3 (AP/HT): APIC ID: 7
ioapic1: Changing APIC ID to 9
ioapic0 <Version 2.0> irqs 0-23 on motherboard
ioapic1 <Version 2.0> irqs 24-47 on motherboard
ioapic2 <Version 2.0> irqs 48-71 on motherboard
ioapic3 <Version 2.0> irqs 72-95 on motherboard
kbd1 at kbdmux0
acpi0: <HP P54> on motherboard
acpi0: [ITHREAD]
acpi0: Power Button (fixed)
Timecounter "ACPI-safe" frequency 3579545 Hz quality 850
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x908-0x90b on acpi0
pcib0: <ACPI Host-PCI bridge> on acpi0
pci0: <ACPI PCI bus> on pcib0
pcib1: <ACPI PCI-PCI bridge> at device 2.0 on pci0
pci13: <ACPI PCI bus> on pcib1
pcib2: <ACPI PCI-PCI bridge> at device 4.0 on pci0
pci6: <ACPI PCI bus> on pcib2
pcib3: <ACPI PCI-PCI bridge> at device 0.0 on pci6
pci7: <ACPI PCI bus> on pcib3
pcib4: <ACPI PCI-PCI bridge> at device 0.2 on pci6
pci10: <ACPI PCI bus> on pcib4
pcib5: <ACPI PCI-PCI bridge> at device 6.0 on pci0
pci3: <ACPI PCI bus> on pcib5
pcib6: <ACPI PCI-PCI bridge> at device 28.0 on pci0
pci2: <ACPI PCI bus> on pcib6
ciss0: <HP Smart Array 6i> port 0x4000-0x40ff mem 0xfdff0000-0xfdff1fff,0xfdf80000-0xfdfbffff irq 24 at device 1.0 on pci2
ciss0: [ITHREAD]
bge0: <HP NC7782 Gigabit Server Adapter, ASIC rev. 0x002100> mem 0xfdf70000-0xfdf7ffff irq 25 at device 2.0 on pci2
miibus0: <MII bus> on bge0
brgphy0: <BCM5704 10/100/1000baseTX PHY> PHY 1 on miibus0
brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
bge0: Ethernet address: 00:17:a4:a7:a3:fc
bge0: [ITHREAD]
bge1: <HP NC7782 Gigabit Server Adapter, ASIC rev. 0x002100> mem 0xfdf60000-0xfdf6ffff irq 26 at device 2.1 on pci2
miibus1: <MII bus> on bge1
brgphy1: <BCM5704 10/100/1000baseTX PHY> PHY 1 on miibus1
brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
bge1: Ethernet address: 00:17:a4:a7:a3:fb
bge1: [ITHREAD]
uhci0: <UHCI (generic) USB controller> port 0x2000-0x201f irq 16 at device 29.0 on pci0
uhci0: [GIANT-LOCKED]
...skipping...
device_attach: est2 attach returned 6
p4tcc2: <CPU Frequency Thermal Control> on cpu2
cpu3: <ACPI CPU> on acpi0
est3: <Enhanced SpeedStep Frequency Control> on cpu3
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 122900001229
device_attach: est3 attach returned 6
p4tcc3: <CPU Frequency Thermal Control> on cpu3
orm0: <ISA Option ROMs> at iomem 0xc0000-0xc7fff,0xc8000-0xcbfff,0xee000-0xeffff on isa0
ppc0: cannot reserve I/O port range
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
sio1 at port 0x2f8-0x2ff irq 3 on isa0
sio1: type 16550A
sio1: [FILTER]
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Timecounters tick every 1.000 msec
acd0: CDROM <HL-DT-ST GCR-8240N/2.03> at ata0-master PIO4
SMPd:a 0A Pa tC PcUi s#s01 bLuasu nc0h etda!rg et 0 lun 0
da0: <COMPAQ RAID 1 VOLUME OK> Fixed Direct Access SCSI-4 device
da0: 135.168MB/s transfers
da0: Command Queueing EnabledS MdPa:0 :A P6 9C4P5U9 M#B 2( 1L4a2u2n5c3h2e8d0! 512 byte sectors: 255H 32S/T 17433C)
SMP: AP CPU #3 Launched!
Trying to mount root from ufs:/dev/da0s1a
WARNING: /var was not properly dismounted

-- 
mark saad | nonesuch@longcount.org