Date: Fri, 6 Nov 2009 17:28:40 GMT
From: Kai Gallasch <gallasch@free.de>
To: freebsd-gnats-submit@FreeBSD.org
Subject: kern/140338: FreeBSD 8.0 RC2 with vm.pmap.pg_ps_enabled=1 kernel panic with makeworld
Message-ID: <200911061728.nA6HSeV4044890@www.freebsd.org>
Resent-Message-ID: <200911061730.nA6HU1Ts070227@freefall.freebsd.org>
>Number: 140338 >Category: kern >Synopsis: FreeBSD 8.0 RC2 with vm.pmap.pg_ps_enabled=1 kernel panic with makeworld >Confidential: no >Severity: serious >Priority: high >Responsible: freebsd-bugs >State: open >Quarter: >Keywords: >Date-Required: >Class: sw-bug >Submitter-Id: current-users >Arrival-Date: Fri Nov 06 17:30:01 UTC 2009 >Closed-Date: >Last-Modified: >Originator: Kai Gallasch >Release: 8.0 RC2 amd64 >Organization: >Environment: Copyright (c) 1992-2009 The FreeBSD Project. Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 The Regents of the University of California. All rights reserved. FreeBSD is a registered trademark of The FreeBSD Foundation. FreeBSD 8.0-RC2 #0: Tue Nov 3 20:24:06 CET 2009 root@sonnenkraft.free.de:/usr/obj/usr/src/sys/GENERIC amd64 WARNING: WITNESS option enabled, expect reduced performance. Timecounter "i8254" frequency 1193182 Hz quality 0 CPU: Quad-Core AMD Opteron(tm) Processor 2352 (2100.09-MHz K8-class CPU) Origin = "AuthenticAMD" Id = 0x100f23 Stepping = 3 Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT> Features2=0x802009<SSE3,MON,CX16,POPCNT> AMD Features=0xee400800<SYSCALL,MMX+,FFXSR,Page1GB,RDTSCP,LM,3DNow!+,3DNow!> AMD Features2=0x7ff<LAHF,CMP,SVM,ExtAPIC,CR8,ABM,SSE4A,MAS,Prefetch,OSVW,IBS> TSC: P-state invariant real memory = 21474836480 (20480 MB) avail memory = 20701110272 (19742 MB) ACPI APIC Table: <HP ProLiant> FreeBSD/SMP: Multiprocessor System Detected: 8 CPUs FreeBSD/SMP: 2 package(s) x 4 core(s) cpu0 (BSP): APIC ID: 0 cpu1 (AP): APIC ID: 1 cpu2 (AP): APIC ID: 2 cpu3 (AP): APIC ID: 3 cpu4 (AP): APIC ID: 4 cpu5 (AP): APIC ID: 5 cpu6 (AP): APIC ID: 6 cpu7 (AP): APIC ID: 7 ioapic0 <Version 1.1> irqs 0-15 on motherboard ioapic1 <Version 1.1> irqs 16-31 on motherboard ioapic2 <Version 1.1> irqs 32-47 on motherboard kbd1 at kbdmux0 acpi0: <HP ProLiant> on motherboard acpi0: [ITHREAD] acpi0: Power Button (fixed) Timecounter 
"ACPI-safe" frequency 3579545 Hz quality 850 acpi_timer0: <32-bit timer at 3.579545MHz> port 0x920-0x923 on acpi0 acpi_hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0 Timecounter "HPET" frequency 14318180 Hz quality 900 pcib0: <ACPI Host-PCI bridge> on acpi0 pci0: <ACPI PCI bus> on pcib0 vgapci0: <VGA-compatible display> port 0x1000-0x10ff mem 0xe8000000-0xefffffff,0xf7ff0000-0xf7ffffff irq 44 at device 3.0 on pci0 pci0: <base peripheral> at device 4.0 (no driver attached) pci0: <base peripheral> at device 4.2 (no driver attached) uhci0: <UHCI (generic) USB controller> port 0x1800-0x181f irq 45 at device 4.4 on pci0 uhci0: [ITHREAD] usbus0: <UHCI (generic) USB controller> on uhci0 pci0: <serial bus> at device 4.6 (no driver attached) pcib1: <ACPI PCI-PCI bridge> at device 5.0 on pci0 pci1: <ACPI PCI bus> on pcib1 pcib2: <ACPI PCI-PCI bridge> at device 13.0 on pci1 pci2: <ACPI PCI bus> on pcib2 atapci0: <ServerWorks HT1000 UDMA100 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x500-0x50f at device 6.1 on pci0 ata0: <ATA channel 0> on atapci0 ata0: [ITHREAD] ata1: <ATA channel 1> on atapci0 ata1: [ITHREAD] isab0: <PCI-ISA bridge> at device 6.2 on pci0 isa0: <ISA bus> on isab0 ohci0: <OHCI (generic) USB controller> port 0x1c00-0x1cff mem 0xf7ee0000-0xf7ee0fff irq 5 at device 7.0 on pci0 ohci0: [ITHREAD] usbus1: <OHCI (generic) USB controller> on ohci0 ohci1: <OHCI (generic) USB controller> port 0x3000-0x30ff mem 0xf7ed0000-0xf7ed0fff irq 5 at device 7.1 on pci0 ohci1: [ITHREAD] usbus2: <OHCI (generic) USB controller> on ohci1 ehci0: <EHCI (generic) USB 2.0 controller> port 0x3400-0x34ff mem 0xf7ec0000-0xf7ec0fff irq 5 at device 7.2 on pci0 ehci0: [ITHREAD] usbus3: EHCI version 1.0 usbus3: <EHCI (generic) USB 2.0 controller> on ehci0 pcib3: <ACPI PCI-PCI bridge> irq 42 at device 15.0 on pci0 pci5: <ACPI PCI bus> on pcib3 pcib4: <ACPI PCI-PCI bridge> irq 38 at device 16.0 on pci0 pci8: <ACPI PCI bus> on pcib4 pcib5: <PCI-PCI bridge> irq 
39 at device 17.0 on pci0 pci14: <PCI bus> on pcib5 pcib6: <ACPI PCI-PCI bridge> irq 40 at device 18.0 on pci0 pci11: <ACPI PCI bus> on pcib6 pcib7: <ACPI PCI-PCI bridge> irq 41 at device 19.0 on pci0 pci3: <ACPI PCI bus> on pcib7 pcib8: <PCI-PCI bridge> at device 0.0 on pci3 pci4: <PCI bus> on pcib8 bce0: <HP NC373i Multifunction Gigabit Server Adapter (B2)> mem 0xf8000000-0xf9ffffff irq 41 at device 0.0 on pci4 miibus0: <MII bus> on bce0 brgphy0: <BCM5708C 10/100/1000baseTX PHY> PHY 1 on miibus0 brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto bce0: Ethernet address: 00:1b:78:38:dd:02 bce0: [ITHREAD] bce0: ASIC (0x57081020); Rev (B2); Bus (PCI-X, 64-bit, 133MHz); B/C (1.9.6); Flags (MSI|MFW); MFW () pcib9: <ACPI Host-PCI bridge> on acpi0 pci64: <ACPI PCI bus> on pcib9 pcib10: <ACPI PCI-PCI bridge> irq 36 at device 15.0 on pci64 pci67: <ACPI PCI bus> on pcib10 pcib11: <ACPI PCI-PCI bridge> irq 32 at device 16.0 on pci64 pci70: <ACPI PCI bus> on pcib11 ciss0: <HP Smart Array P400> port 0x4000-0x40ff mem 0xfdf00000-0xfdffffff,0xfdef0000-0xfdef0fff irq 32 at device 0.0 on pci70 ciss0: PERFORMANT Transport ciss0: [ITHREAD] pcib12: <PCI-PCI bridge> irq 33 at device 17.0 on pci64 pci73: <PCI bus> on pcib12 pcib13: <ACPI PCI-PCI bridge> irq 34 at device 18.0 on pci64 pci65: <ACPI PCI bus> on pcib13 pcib14: <PCI-PCI bridge> at device 0.0 on pci65 pci66: <PCI bus> on pcib14 bce1: <HP NC373i Multifunction Gigabit Server Adapter (B2)> mem 0xfa000000-0xfbffffff irq 34 at device 0.0 on pci66 miibus1: <MII bus> on bce1 brgphy1: <BCM5708C 10/100/1000baseTX PHY> PHY 1 on miibus1 brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto bce1: Ethernet address: 00:1b:78:38:dd:00 bce1: [ITHREAD] bce1: ASIC (0x57081020); Rev (B2); Bus (PCI-X, 64-bit, 133MHz); B/C (1.9.6); Flags (MSI|MFW); MFW () pcib15: <PCI-PCI bridge> irq 35 at device 19.0 on pci64 pci74: <PCI bus> on pcib15 atkbdc0: <Keyboard controller (i8042)> 
port 0x60,0x64 irq 1 on acpi0 atkbd0: <AT Keyboard> irq 1 on atkbdc0 kbd0 at atkbd0 atkbd0: [GIANT-LOCKED] atkbd0: [ITHREAD] psm0: <PS/2 Mouse> irq 12 on atkbdc0 psm0: [GIANT-LOCKED] psm0: [ITHREAD] psm0: model IntelliMouse, device ID 3 atrtc0: <AT realtime clock> port 0x70-0x71 on acpi0 uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0 uart0: [FILTER] uart0: console (9600,n,8,1) cpu0: <ACPI CPU> on acpi0 hwpstate0: <Cool`n'Quiet 2.0> on cpu0 cpu1: <ACPI CPU> on acpi0 cpu2: <ACPI CPU> on acpi0 cpu3: <ACPI CPU> on acpi0 cpu4: <ACPI CPU> on acpi0 cpu5: <ACPI CPU> on acpi0 cpu6: <ACPI CPU> on acpi0 cpu7: <ACPI CPU> on acpi0 orm0: <ISA Option ROMs> at iomem 0xc0000-0xcafff,0xcb000-0xcefff,0xe5000-0xe6fff on isa0 sc0: <System console> at flags 0x100 on isa0 sc0: VGA <16 virtual consoles, flags=0x300> vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0 ppc0: cannot reserve I/O port range uart1: <Non-standard ns8250 class UART with FIFOs> at port 0x2f8-0x2ff irq 3 on isa0 uart1: [FILTER] Timecounters tick every 1.000 msec usbus0: 12Mbps Full Speed USB v1.0 usbus1: 12Mbps Full Speed USB v1.0 usbus2: 12Mbps Full Speed USB v1.0 usbus3: 480Mbps High Speed USB v2.0 acd0: CDRW <TSSTcorpCDW/DVD TS-L462D/HG01> at ata0-master UDMA33 ugen0.1: <(0x103c)> at usbus0 uhub0: <(0x103c) UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus0 ugen1.1: <(0x1166)> at usbus1 uhub1: <(0x1166) OHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus1 ugen2.1: <(0x1166)> at usbus2 uhub2: <(0x1166) OHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus2 ugen3.1: <(0x1166)> at usbus3 uhub3: <(0x1166) EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus3 uhub1: 2 ports with 2 removable, self powered uhub2: 2 ports with 2 removable, self powered uhub0: 2 ports with 2 removable, self powered ugen0.2: <HP> at usbus0 ukbd0: <Virtual Keyboard> on usbus0 kbd2 at ukbd0 ums0: <Virtual Mouse> on usbus0 ums0: 3 buttons and [XY] coordinates ID=0 uhub3: 4 
ports with 4 removable, self powered ugen0.3: <HP> at usbus0 uhub4: <Virtual Hub> on usbus0 ugen3.2: <vendor 0x04b4> at usbus3 uhub5: <vendor 0x04b4 product 0x6560, class 9/0, rev 2.00/0.0b, addr 2> on usbus3 uhub5: 4 ports with 4 removable, self powered uhub4: 7 ports with 7 removable, self powered da0 at ciss0 bus 0 target 0 lun 0 da0: <COMPAQ RAID 5 VOLUME OK> Fixed Direct Access SCSI-5 device da0: 135.168MB/s transfers da0: Command Queueing enabled da0: 36863MB (75496320 512 byte sectors: 255H 32S/T 9252C) da1 at ciss0 bus 0 target 1 lun 0 da1: <COMPAQ RAID 5 VOLUME OK> Fixed Direct Access SCSI-5 device da1: 135.168MB/s transfers da1: Command Queueing enabled da1: 243098MB (497866080 512 byte sectors: 255H 32S/T 61013C) da2 at ciss0 bus 0 target 2 lun 0 da2: <COMPAQ RAID 0 VOLUME OK> Fixed Direct Access SCSI-5 device da2: 135.168MB/s transfers da2: Command Queueing enabled da2: 139979MB (286677120 512 byte sectors: 255H 32S/T 35132C) da3 at ciss0 bus 0 target 3 lun 0 da3: <COMPAQ RAID 0 VOLUME OK> Fixed Direct Access SCSI-5 device da3: 135.168MB/s transfers da3: Command Queueing enabled da3: 139979MB (286677120 512 byte sectors: 255H 32S/T 35132C) da4 at ciss0 bus 0 target 4 lun 0 da4: <COMPAQ RAID 0 VOLUME OK> Fixed Direct Access SCSI-5 device da4: 135.168MB/s transfers da4: Command Queueing enabled da4: 139979MB (286677120 512 byte sectors: 255H 32S/T 35132C) da5 at ciss0 bus 0 target 5 lun 0 da5: <COMPAQ RAID 0 VOLUME OK> Fixed Direct Access SCSI-5 device da5: 135.168MB/s transfers da5: Command Queueing enabled da5: 139979MB (286677120 512 byte sectors: 255H 32S/T 35132C) SMP: AP CPU #1 Launched! SMP: AP CPU #7 Launched! SMP: AP CPU #6 Launched! SMP: AP CPU #3 Launched! SMP: AP CPU #4 Launched! SMP: AP CPU #5 Launched! SMP: AP CPU #2 Launched! WARNING: WITNESS option enabled, expect reduced performance.GEOM: da0: partition 3 does not start on a track boundary. GEOM: da0: partition 3 does not end on a track boundary. 
GEOM: da0: partition 2 does not start on a track boundary. GEOM: da0: partition 2 does not end on a track boundary. GEOM: da0: partition 1 does not start on a track boundary. GEOM: da0: partition 1 does not end on a track boundary. GEOM: da0s1: geometry does not match label (255h,63s != 255h,32s). GEOM: da0s2: geometry does not match label (255h,63s != 255h,32s). GEOM: da0s3: geometry does not match label (255h,63s != 255h,32s). Trying to mount root from ufs:/dev/da0s1a ZFS filesystem version 13 ZFS storage pool version 13 bce0: link state changed to UP
>Description:
I installed 8.0-RC2 amd64 on an 8-core Opteron server a few days ago, when 8.0-RC2 came out. When I ran make buildworld or make buildkernel, the server rebooted without leaving any message in the logs. The same happened when building larger ports (for example ruby18 or perl58).

I then installed 7.2-STABLE on the same server and ran "make buildworld" and "make buildkernel", both of which completed without any problem. Next I installed 8.0-BETA4, which also crashes during makeworld. Finally I reinstalled 8.0-RC2 amd64 on the server and built an 8.0-RC2 debug kernel for it on another AMD server.

I also:
- ran several passes with diagnostic software from the server manufacturer
- reset the BIOS settings to default
- upgraded the BIOS to the newest release
- booted the server from the 2-year-old backup BIOS
- took out the only pair of RAM modules that differed from the rest of the modules
- ran memtest86 on the server (no problems found)

The server kept crashing under load when running buildworld. Although dumpdev and dumpdir were correctly defined, the server just rebooted without writing a crashdump!

- In about 80% of the runs, makeworld crashes the server without a crashdump being written to dumpdir. The server just reboots.
- In about 20% of the runs, makeworld gets stuck in a non-terminating process that uses 100% CPU. This process cannot be killed. When makeworld is restarted, the server reboots again.
- It makes no difference whether makeworld is run with -j1 or -j8; the result is the same.

Finally, I followed a hint I got on the freebsd-current list, set vm.pmap.pg_ps_enabled=0 in /boot/loader.conf and rebooted. The problem was gone! After a successful buildworld and buildkernel I rebooted the server with the vm.pmap.pg_ps_enabled=0 line commented out, and the problem was back. I then set vm.pmap.pg_ps_enabled=0 in loader.conf again, rebooted and ran make buildworld: no problem.

The behavior seems to be deterministic. With vm.pmap.pg_ps_enabled=1 the server crashes without being able to write crashdumps to dumpdev (at least on this specific HP ProLiant DL385G2 server with 20G RAM).
>How-To-Repeat:
Install FreeBSD 8.0 RC2 amd64 + sources, do a makeworld.
>Fix:
Workaround: Set vm.pmap.pg_ps_enabled=0 in loader.conf and reboot.
>Release-Note:
>Audit-Trail:
>Unformatted:
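For reference, the workaround from the Fix section could look like the fragment below. This is only a sketch: the tunable name and the file /boot/loader.conf come from the report itself, while the quoting of the value and the comment line are assumptions based on common loader.conf style.

```
# /boot/loader.conf
# Workaround for kern/140338: turn off the tunable the report identifies.
vm.pmap.pg_ps_enabled="0"
```

The setting is read at boot, so it needs a reboot to take effect. Afterwards, running `sysctl vm.pmap.pg_ps_enabled` should report 0 if the tunable was picked up.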