Date: Thu, 30 Jun 2005 21:51:22 -0400 (EDT)
From: Chris Gabe <chris@borderware.com>
To: FreeBSD-gnats-submit@FreeBSD.org
Subject: kern/82846: Kernel crash in 5.4 with SMP,PAE
Message-ID: <20050701015122.D587BA9B6@santana.borderware.com>
Resent-Message-ID: <200507010200.j6120Z6X077891@freefall.freebsd.org>
>Number:         82846
>Category:       kern
>Synopsis:       Kernel crash in 5.4 with SMP,PAE
>Confidential:   no
>Severity:       critical
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Fri Jul 01 02:00:34 GMT 2005
>Closed-Date:
>Last-Modified:
>Originator:     Chris Gabe
>Release:        FreeBSD 5.4 i386
>Organization:
Borderware
>Environment:
System: FreeBSD santana.borderware.com 4.7-RELEASE-p20 FreeBSD 4.7-RELEASE-p20 #1: Fri Sep 26 13:30:29 EDT 2003 root@santana.borderware.com:/usr/obj/usr/src/sys/SANTANA i386
>Description:
Hello,

I've got a kernel crash on a Sun V40Z quad CPU, with FreeBSD 5.4 SMP, PAE
(and kernel debugging), 8GB ram. It happens every few hours. System is
not using a lot of memory at that time, but it's usually after accessing
over 4GB of files, in separate chunks to a total of only a few MB of user
memory.

I've hand transcribed the kernel trace below, and I haven't got the dmesg
right now but a kernel log file from a 4.10 build we previously ran on the
same hardware shows the basic idea. An LSI RAID controller,
mirrored/striped SCSI hard drives.

We're just wondering what direction to head with this. Any advice? Add
more debugging, get a full crash dump, submit to something/someone, change
kernel config option, sync to driver that has a fix for this (that would
be a good one).

hand transcribed kernel trace:

kdb_enter
panic
lockmgr(ca71ce14,6,ca71cd68,0,f0147a1c) + 0x421
vop_stdunlock(<5 addresses>) + 1f
vop_defaultop(<4 addresses>,1000) + 13
spec_vnoperate(didn't transcribe any more) + 13
spec_write 64
spec_vnoperate 13
vnode_pager_generic_putpages 224
vop_stdputpages 1a
vop_defaultop 13
spec_vnoperate 13
vnode_pager_putpages 8a
vm_pageout_flush cb
vm_pageout_clean 2a1
vm_pageout_scan 706
vm_pageout 312
fork_exit 75
fork_trampoline 8
trap 0x1 eip=0, esp = 0xf0147d7c, ebp = 0

The kernel boot log file from 4.10 (sorry, I could get 5.4 dmesg but not
until end of next week):
devices amr, mpt perhaps of extra relevance(?)

Jun 22 13:00:00 fifty newsyslog[11157]: logfile turned over due to size>1K
Jun 22 13:09:53 fifty /kernel: Copyright 1998-2004 BorderWare Technologies Inc. All rights reserved.
Jun 22 13:09:53 fifty /kernel: Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
Jun 22 13:09:53 fifty /kernel: The Regents of the University of California. All rights reserved.
Jun 22 13:09:53 fifty /kernel: S-CORE 8.00 #14: Mon Jun 13 09:27:22 EDT 2005
Jun 22 13:09:53 fifty /kernel: support@borderware.com:/sys/compile/S-CORE_SMP
Jun 22 13:09:53 fifty /kernel: Timecounter "i8254" frequency 1193182 Hz
Jun 22 13:09:53 fifty /kernel: CPU: AMD Opteron(tm) Processor 850 (2391.27-MHz 686-class CPU)
Jun 22 13:09:53 fifty /kernel: Origin = "AuthenticAMD" Id = 0xf5a Stepping = 10
Jun 22 13:09:53 fifty /kernel: Features=0x78bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2>
Jun 22 13:09:53 fifty /kernel: AMD Features=0xe0500000<<b20>,AMIE,<b29>,DSP,3DNow!>
Jun 22 13:09:53 fifty /kernel: real memory = 3824615424 (3734976K bytes)
Jun 22 13:09:53 fifty /kernel: avail memory = 3724136448 (3636852K bytes)
Jun 22 13:09:53 fifty /kernel: Programming 24 pins in IOAPIC #0
Jun 22 13:09:53 fifty /kernel: IOAPIC #0 intpin 2 -> irq 0
Jun 22 13:09:53 fifty /kernel: Programming 4 pins in IOAPIC #1
Jun 22 13:09:53 fifty /kernel: Programming 4 pins in IOAPIC #2
Jun 22 13:09:53 fifty /kernel: Programming 4 pins in IOAPIC #3
Jun 22 13:09:53 fifty /kernel: Programming 4 pins in IOAPIC #4
Jun 22 13:09:53 fifty /kernel: Programming 4 pins in IOAPIC #5
Jun 22 13:09:53 fifty /kernel: Programming 4 pins in IOAPIC #6
Jun 22 13:09:53 fifty /kernel: FreeBSD/SMP: Multiprocessor motherboard: 4 CPUs
Jun 22 13:09:53 fifty /kernel: cpu0 (BSP): apic id: 0, version: 0x00040010, at 0xfee00000
Jun 22 13:09:53 fifty /kernel: cpu1 (AP): apic id: 1, version: 0x00040010, at 0xfee00000
Jun 22 13:09:53 fifty /kernel: cpu2 (AP): apic id: 2, version: 0x00040010, at 0xfee00000
Jun 22 13:09:53 fifty /kernel: cpu3 (AP): apic id: 3, version: 0x00040010, at 0xfee00000
Jun 22 13:09:53 fifty /kernel: io0 (APIC): apic id: 4, version: 0x00170011, at 0xfec00000
Jun 22 13:09:53 fifty /kernel: io1 (APIC): apic id: 5, version: 0x00030011, at 0xe4000000
Jun 22 13:09:53 fifty /kernel: io2 (APIC): apic id: 6, version: 0x00030011, at 0xe4001000
Jun 22 13:09:53 fifty /kernel: io3 (APIC): apic id: 7, version: 0x00030011, at 0xe5d01000
Jun 22 13:09:53 fifty /kernel: io4 (APIC): apic id: 8, version: 0x00030011, at 0xe5d03000
Jun 22 13:09:53 fifty /kernel: io5 (APIC): apic id: 9, version: 0x00030011, at 0xe5d05000
Jun 22 13:09:53 fifty /kernel: io6 (APIC): apic id: 10, version: 0x00030011, at 0xe5d07000
Jun 22 13:09:53 fifty /kernel: Preloaded elf kernel "kernel" at 0xc0455000.
Jun 22 13:09:53 fifty /kernel: Preloaded elf module "splash_bmp.ko" at 0xc045509c.
Jun 22 13:09:53 fifty /kernel: Preloaded splash_image_data "/boot/splash.bmp" at 0xc0455140.
Jun 22 13:09:53 fifty /kernel: Pentium Pro MTRR support enabled
Jun 22 13:09:53 fifty /kernel: md0: Malloc disk
Jun 22 13:09:53 fifty /kernel: Using $PIR table, 24 entries at 0xc00fde40
Jun 22 13:09:53 fifty /kernel: npx0: <math processor> on motherboard
Jun 22 13:09:53 fifty /kernel: npx0: INT 16 interface
Jun 22 13:09:53 fifty /kernel: pcib0: <Host to PCI bridge> on motherboard
Jun 22 13:09:53 fifty /kernel: pci0: <PCI bus> on pcib0
Jun 22 13:09:53 fifty /kernel: pcib16: <PCI to PCI bridge (vendor=1022 device=7460)> at device 6.0 on pci0
Jun 22 13:09:53 fifty /kernel: IOAPIC #0 intpin 19 -> irq 2
Jun 22 13:09:53 fifty /kernel: IOAPIC #0 intpin 17 -> irq 16
Jun 22 13:09:53 fifty /kernel: pci1: <PCI bus> on pcib16
Jun 22 13:09:53 fifty /kernel: pci1: <OHCI USB controller> at 0.0 irq 2
Jun 22 13:09:53 fifty /kernel: pci1: <OHCI USB controller> at 0.1 irq 2
Jun 22 13:09:53 fifty /kernel: pci1: <Trident model 9880 VGA-compatible display device> at 5.0 irq 16
Jun 22 13:09:53 fifty /kernel: isab0: <PCI to ISA bridge (vendor=1022 device=7468)> at device 7.0 on pci0
Jun 22 13:09:53 fifty /kernel: isa0: <ISA bus> on isab0
Jun 22 13:09:54 fifty /kernel: atapci0: <AMD 8111 ATA133 controller> port 0x1000-0x100f at device 7.1 on pci0
Jun 22 13:09:54 fifty /kernel: ata0: at 0x1f0 irq 14 on atapci0
Jun 22 13:09:54 fifty /kernel: ata1: at 0x170 irq 15 on atapci0
Jun 22 13:09:54 fifty /kernel: chip0: <PCI to Other bridge (vendor=1022 device=746b)> at device 7.3 on pci0
Jun 22 13:09:54 fifty /kernel: pcib17: <PCI to PCI bridge (vendor=1022 device=7450)> at device 10.0 on pci0
Jun 22 13:09:54 fifty /kernel: IOAPIC #1 intpin 1 -> irq 17
Jun 22 13:09:54 fifty /kernel: IOAPIC #1 intpin 2 -> irq 18
Jun 22 13:09:54 fifty /kernel: IOAPIC #1 intpin 3 -> irq 19
Jun 22 13:09:54 fifty /kernel: pci2: <PCI bus> on pcib17
Jun 22 13:09:54 fifty /kernel: bge0: <Broadcom BCM5703 Gigabit Ethernet, ASIC rev. 0x1002> mem 0xe5800000-0xe580ffff irq 17 at device 2.0 on pci2
Jun 22 13:09:54 fifty /kernel: bge0: Ethernet address: 00:09:3d:00:d4:e1
Jun 22 13:09:54 fifty /kernel: miibus0: <MII bus> on bge0
Jun 22 13:09:54 fifty /kernel: brgphy0: <BCM5703 10/100/1000baseTX PHY> on miibus0
Jun 22 13:09:54 fifty /kernel: brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
Jun 22 13:09:54 fifty /kernel: bge1: <Broadcom BCM5703 Gigabit Ethernet, ASIC rev. 0x1002> mem 0xe5810000-0xe581ffff irq 18 at device 3.0 on pci2
Jun 22 13:09:54 fifty /kernel: bge1: Ethernet address: 00:09:3d:00:d4:e2
Jun 22 13:09:54 fifty /kernel: miibus1: <MII bus> on bge1
Jun 22 13:09:54 fifty /kernel: brgphy1: <BCM5703 10/100/1000baseTX PHY> on miibus1
Jun 22 13:09:54 fifty /kernel: brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
Jun 22 13:09:54 fifty /kernel: mpt0: <LSILogic 1030 Ultra4 Adapter> port 0x2000-0x20ff mem 0xe5820000-0xe582ffff,0xe5830000-0xe583ffff irq 19 at device 4.0 on pci2
Jun 22 13:09:54 fifty /kernel: pcib18: <PCI to PCI bridge (vendor=1014 device=01a7)> at device 5.0 on pci2
Jun 22 13:09:54 fifty /kernel: IOAPIC #1 intpin 0 -> irq 20
Jun 22 13:09:54 fifty /kernel: pci3: <PCI bus> on pcib18
Jun 22 13:09:54 fifty /kernel: amr0: <LSILogic MegaRAID> mem 0xe5900000-0xe597ffff,0xe5c00000-0xe5c0ffff irq 20 at device 0.0 on pci3
Jun 22 13:09:54 fifty /kernel: amr0: <LSILogic MegaRAID SCSI 320-2X> Firmware 413G, BIOS H414, 128MB RAM
Jun 22 13:09:54 fifty /kernel: pci0: <unknown card> (vendor=0x1022, dev=0x7451) at 10.1
Jun 22 13:09:54 fifty /kernel: pcib19: <PCI to PCI bridge (vendor=1022 device=7450)> at device 11.0 on pci0
Jun 22 13:09:54 fifty /kernel: pci4: <PCI bus> on pcib19
Jun 22 13:09:54 fifty /kernel: pci0: <unknown card> (vendor=0x1022, dev=0x7451) at 11.1
Jun 22 13:09:54 fifty /kernel: pcib1: <Host to PCI bridge> on motherboard
Jun 22 13:09:54 fifty /kernel: pci5: <PCI bus> on pcib1
Jun 22 13:09:54 fifty /kernel: pcib2: <Host to PCI bridge> on motherboard
Jun 22 13:09:54 fifty /kernel: pci6: <PCI bus> on pcib2
Jun 22 13:09:54 fifty /kernel: pcib3: <Host to PCI bridge> on motherboard
Jun 22 13:09:54 fifty /kernel: pci7: <PCI bus> on pcib3
Jun 22 13:09:54 fifty /kernel: pcib4: <Host to PCI bridge> on motherboard
Jun 22 13:09:54 fifty /kernel: pci8: <PCI bus> on pcib4
Jun 22 13:09:54 fifty /kernel: pcib5: <Host to PCI bridge> on motherboard
Jun 22 13:09:54 fifty /kernel: pci9: <PCI bus> on pcib5
Jun 22 13:09:54 fifty /kernel: pcib6: <Host to PCI bridge> on motherboard
Jun 22 13:09:54 fifty /kernel: pci10: <PCI bus> on pcib6
Jun 22 13:09:54 fifty /kernel: pcib7: <Host to PCI bridge> on motherboard
Jun 22 13:09:54 fifty /kernel: pci11: <PCI bus> on pcib7
Jun 22 13:09:54 fifty /kernel: pcib8: <Host to PCI bridge> on motherboard
Jun 22 13:09:54 fifty /kernel: pci12: <PCI bus> on pcib8
Jun 22 13:09:54 fifty /kernel: pcib9: <Host to PCI bridge> on motherboard
Jun 22 13:09:54 fifty /kernel: pci13: <PCI bus> on pcib9
Jun 22 13:09:54 fifty /kernel: pcib10: <Host to PCI bridge> on motherboard
Jun 22 13:09:54 fifty /kernel: pci14: <PCI bus> on pcib10
Jun 22 13:09:54 fifty /kernel: pcib11: <Host to PCI bridge> on motherboard
Jun 22 13:09:54 fifty /kernel: pci15: <PCI bus> on pcib11
Jun 22 13:09:54 fifty /kernel: pcib12: <Host to PCI bridge> on motherboard
Jun 22 13:09:54 fifty /kernel: pci16: <PCI bus> on pcib12
Jun 22 13:09:54 fifty /kernel: pcib13: <Host to PCI bridge> on motherboard
Jun 22 13:09:54 fifty /kernel: pci17: <PCI bus> on pcib13
Jun 22 13:09:54 fifty /kernel: pcib14: <Host to PCI bridge> on motherboard
Jun 22 13:09:54 fifty /kernel: pci18: <PCI bus> on pcib14
Jun 22 13:09:54 fifty /kernel: pcib15: <Host to PCI bridge> on motherboard
Jun 22 13:09:54 fifty /kernel: pci19: <PCI bus> on pcib15
Jun 22 13:09:54 fifty /kernel: orm0: <Option ROMs> at iomem 0xc0000-0xc7fff,0xc8000-0xc97ff,0xc9800-0xcafff,0xcb000-0xcb7ff on isa0
Jun 22 13:09:54 fifty /kernel: pmtimer0 on isa0
Jun 22 13:09:54 fifty /kernel: atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
Jun 22 13:09:54 fifty /kernel: atkbd0: <AT Keyboard> flags 0x1 irq 1 on atkbdc0
Jun 22 13:09:54 fifty /kernel: kbd0 at atkbd0
Jun 22 13:09:54 fifty /kernel: vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Jun 22 13:09:54 fifty /kernel: sc0: <System console> at flags 0x100 on isa0
Jun 22 13:09:54 fifty /kernel: sc0: VGA <16 virtual consoles, flags=0x300>
Jun 22 13:09:54 fifty /kernel: sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
Jun 22 13:09:54 fifty /kernel: sio0: type 16550A
Jun 22 13:09:54 fifty /kernel: sio1 at port 0x2f8-0x2ff irq 3 on isa0
Jun 22 13:09:54 fifty /kernel: sio1: type 16550A
Jun 22 13:09:54 fifty /kernel: ppc0: parallel port not found.
Jun 22 13:09:54 fifty /kernel: APIC_IO: Testing 8254 interrupt delivery
Jun 22 13:09:54 fifty /kernel: APIC_IO: routing 8254 via IOAPIC #0 intpin 2
Jun 22 13:09:54 fifty /kernel: ipfw2 initialized, divert disabled, rule-based forwarding enabled, default to deny, logging unlimited
Jun 22 13:09:54 fifty /kernel: SMP: AP CPU #1 Launched!
Jun 22 13:09:54 fifty /kernel: SMP: AP CPU #3 Launched!
Jun 22 13:09:54 fifty /kernel: SMP: AP CPU #2 Launched!
Jun 22 13:09:54 fifty /kernel: acd0: DVD-ROM <DV-28E-C> at ata1-master PIO4
Jun 22 13:09:54 fifty /kernel: Waiting 15 seconds for SCSI devices to settle
Jun 22 13:09:54 fifty /kernel: amrd0: <LSILogic MegaRAID logical drive> on amr0
Jun 22 13:09:54 fifty /kernel: amrd0: 140006MB (286732288 sectors) RAID 1 (optimal)
Jun 22 13:09:54 fifty /kernel: pass0 at amr0 bus 0 target 6 lun 0
Jun 22 13:09:54 fifty /kernel: pass0: <SDR GEM318P 1> Fixed Processor SCSI-2 device
Jun 22 13:09:54 fifty /kernel: Mounting root from ufs:/dev/amrd0s2a
Jun 22 13:09:54 fifty /kernel: WARNING: / was not properly dismounted
Jun 22 13:11:17 fifty /kernel: bge0: gigabit link up
Jun 22 14:00:01 fifty newsyslog[62771]: logfile turned over due to size>1K
Jun 22 14:00:12 fifty mxsyslog[60040] logfile enrolled as /server/ftp/log/kernel.4
>How-To-Repeat:
complex. see above
>Fix:
>Release-Note:
>Audit-Trail:
>Unformatted:
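
The hand-transcribed trace in the description mixes two notations for frames ("lockmgr(...) + 0x421" and "spec_write 64"). As a minimal sketch, the Python below normalizes those frames into a uniform "function+0xoffset" form. It assumes every bare number after a function name is a hexadecimal offset, which the report does not state explicitly; the names TRACE, FRAME_RE, parse_frames and format_frames are hypothetical helpers introduced only for this illustration. The kdb_enter, panic and trap lines carry no offset and are left out.

```python
import re

# Frames copied from the hand-transcribed trace in the report.
TRACE = """\
lockmgr(ca71ce14,6,ca71cd68,0,f0147a1c) + 0x421
vop_stdunlock(<5 addresses>) + 1f
vop_defaultop(<4 addresses>,1000) + 13
spec_vnoperate(didn't transcribe any more) + 13
spec_write 64
spec_vnoperate 13
vnode_pager_generic_putpages 224
vop_stdputpages 1a
vop_defaultop 13
spec_vnoperate 13
vnode_pager_putpages 8a
vm_pageout_flush cb
vm_pageout_clean 2a1
vm_pageout_scan 706
vm_pageout 312
fork_exit 75
fork_trampoline 8
"""

# Function name, optional "(...)" argument list, optional "+" and "0x",
# then the offset. ASSUMPTION: the offset is always hexadecimal.
FRAME_RE = re.compile(
    r"^(?P<func>\w+)(?:\(.*\))?\s*\+?\s*(?:0x)?(?P<off>[0-9a-fA-F]+)$"
)


def parse_frames(text):
    """Return a list of (function, offset) tuples, skipping unparsable lines."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append((match.group("func"), int(match.group("off"), 16)))
    return frames


def format_frames(frames):
    """Render frames as 'function+0xoffset' strings."""
    return [f"{func}+0x{off:x}" for func, off in frames]


if __name__ == "__main__":
    for entry in format_frames(parse_frames(TRACE)):
        print(entry)
```

With a debug kernel that matches the crashed one, strings in this form can be passed to gdb, for example as list *(vm_pageout_scan+0x706), to find the corresponding source line.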