Date:      Thu, 30 Jun 2005 21:51:22 -0400 (EDT)
From:      Chris Gabe <chris@borderware.com>
To:        FreeBSD-gnats-submit@FreeBSD.org
Subject:   kern/82846: Kernel crash in 5.4 with SMP,PAE
Message-ID:  <20050701015122.D587BA9B6@santana.borderware.com>
Resent-Message-ID: <200507010200.j6120Z6X077891@freefall.freebsd.org>

>Number:         82846
>Category:       kern
>Synopsis:       Kernel crash in 5.4 with SMP,PAE
>Confidential:   no
>Severity:       critical
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Fri Jul 01 02:00:34 GMT 2005
>Closed-Date:
>Last-Modified:
>Originator:     Chris Gabe
>Release:        FreeBSD 5.4 i386
>Organization:
Borderware
>Environment:
System: FreeBSD santana.borderware.com 4.7-RELEASE-p20 FreeBSD 4.7-RELEASE-p20 #1: Fri Sep 26 13:30:29 EDT 2003 root@santana.borderware.com:/usr/obj/usr/src/sys/SANTANA i386


	
>Description:
Hello,

I'm seeing a kernel crash on a quad-CPU Sun V40Z with 8 GB of RAM, running FreeBSD 5.4 with SMP and PAE (and kernel debugging enabled).  It happens every few hours.  The system is not using much memory at the time, but the crash usually comes after more than 4 GB of files have been accessed, in separate chunks that add up to only a few MB of user memory.

I've hand-transcribed the kernel trace below.  I don't have the 5.4 dmesg right now, but a kernel log file from a 4.10 build we previously ran on the same hardware shows the basic configuration: an LSI RAID controller with mirrored/striped SCSI hard drives.

We're wondering which direction to take with this.  Any advice?  Should we add more debugging, get a full crash dump, submit it to something/someone, change a kernel config option, or sync to a driver that has a fix for this (that would be a good one)?

Hand-transcribed kernel trace:
kdb_enter
panic
lockmgr(ca71ce14,6,ca71cd68,0,f0147a1c) + 0x421
vop_stdunlock(<5 addresses>) + 1f
vop_defaultop(<4 addresses>,1000) + 13
spec_vnoperate(didn't transcribe any more) + 13
spec_write 64
spec_vnoperate 13
vnode_pager_generic_putpages 224
vop_stdputpages 1a
vop_defaultop 13
spec_vnoperate 13
vnode_pager_putpages 8a
vm_pageout_flush cb
vm_pageout_clean 2a1
vm_pageout_scan 706
vm_pageout 312
fork_exit 75
fork_trampoline 8
trap 0x1 eip=0, esp = 0xf0147d7c, ebp = 0


The kernel boot log file from 4.10 follows (sorry, I can get the 5.4 dmesg, but not until the end of next week).
The amr and mpt devices are perhaps of extra relevance(?):

Jun 22 13:00:00 fifty newsyslog[11157]: logfile turned over due to size>1K
Jun 22 13:09:53 fifty /kernel: Copyright 1998-2004 BorderWare Technologies Inc.  All rights reserved.
Jun 22 13:09:53 fifty /kernel: Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
Jun 22 13:09:53 fifty /kernel: The Regents of the University of California. All rights reserved.
Jun 22 13:09:53 fifty /kernel: S-CORE 8.00 #14: Mon Jun 13 09:27:22 EDT 2005
Jun 22 13:09:53 fifty /kernel: support@borderware.com:/sys/compile/S-CORE_SMP
Jun 22 13:09:53 fifty /kernel: Timecounter "i8254"  frequency 1193182 Hz
Jun 22 13:09:53 fifty /kernel: CPU: AMD Opteron(tm) Processor 850 (2391.27-MHz 686-class CPU)
Jun 22 13:09:53 fifty /kernel: Origin = "AuthenticAMD"  Id = 0xf5a  Stepping = 10
Jun 22 13:09:53 fifty /kernel: Features=0x78bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2>
Jun 22 13:09:53 fifty /kernel: AMD Features=0xe0500000<<b20>,AMIE,<b29>,DSP,3DNow!>
Jun 22 13:09:53 fifty /kernel: real memory  = 3824615424 (3734976K bytes)
Jun 22 13:09:53 fifty /kernel: avail memory = 3724136448 (3636852K bytes)
Jun 22 13:09:53 fifty /kernel: Programming 24 pins in IOAPIC #0
Jun 22 13:09:53 fifty /kernel: IOAPIC #0 intpin 2 -> irq 0
Jun 22 13:09:53 fifty /kernel: Programming 4 pins in IOAPIC #1
Jun 22 13:09:53 fifty /kernel: Programming 4 pins in IOAPIC #2
Jun 22 13:09:53 fifty /kernel: Programming 4 pins in IOAPIC #3
Jun 22 13:09:53 fifty /kernel: Programming 4 pins in IOAPIC #4
Jun 22 13:09:53 fifty /kernel: Programming 4 pins in IOAPIC #5
Jun 22 13:09:53 fifty /kernel: Programming 4 pins in IOAPIC #6
Jun 22 13:09:53 fifty /kernel: FreeBSD/SMP: Multiprocessor motherboard: 4 CPUs
Jun 22 13:09:53 fifty /kernel: cpu0 (BSP): apic id:  0, version: 0x00040010, at 0xfee00000
Jun 22 13:09:53 fifty /kernel: cpu1 (AP):  apic id:  1, version: 0x00040010, at 0xfee00000
Jun 22 13:09:53 fifty /kernel: cpu2 (AP):  apic id:  2, version: 0x00040010, at 0xfee00000
Jun 22 13:09:53 fifty /kernel: cpu3 (AP):  apic id:  3, version: 0x00040010, at 0xfee00000
Jun 22 13:09:53 fifty /kernel: io0 (APIC): apic id:  4, version: 0x00170011, at 0xfec00000
Jun 22 13:09:53 fifty /kernel: io1 (APIC): apic id:  5, version: 0x00030011, at 0xe4000000
Jun 22 13:09:53 fifty /kernel: io2 (APIC): apic id:  6, version: 0x00030011, at 0xe4001000
Jun 22 13:09:53 fifty /kernel: io3 (APIC): apic id:  7, version: 0x00030011, at 0xe5d01000
Jun 22 13:09:53 fifty /kernel: io4 (APIC): apic id:  8, version: 0x00030011, at 0xe5d03000
Jun 22 13:09:53 fifty /kernel: io5 (APIC): apic id:  9, version: 0x00030011, at 0xe5d05000
Jun 22 13:09:53 fifty /kernel: io6 (APIC): apic id: 10, version: 0x00030011, at 0xe5d07000
Jun 22 13:09:53 fifty /kernel: Preloaded elf kernel "kernel" at 0xc0455000.
Jun 22 13:09:53 fifty /kernel: Preloaded elf module "splash_bmp.ko" at 0xc045509c.
Jun 22 13:09:53 fifty /kernel: Preloaded splash_image_data "/boot/splash.bmp" at 0xc0455140.
Jun 22 13:09:53 fifty /kernel: Pentium Pro MTRR support enabled
Jun 22 13:09:53 fifty /kernel: md0: Malloc disk
Jun 22 13:09:53 fifty /kernel: Using $PIR table, 24 entries at 0xc00fde40
Jun 22 13:09:53 fifty /kernel: npx0: <math processor> on motherboard
Jun 22 13:09:53 fifty /kernel: npx0: INT 16 interface
Jun 22 13:09:53 fifty /kernel: pcib0: <Host to PCI bridge> on motherboard
Jun 22 13:09:53 fifty /kernel: pci0: <PCI bus> on pcib0
Jun 22 13:09:53 fifty /kernel: pcib16: <PCI to PCI bridge (vendor=1022 device=7460)> at device 6.0 on pci0
Jun 22 13:09:53 fifty /kernel: IOAPIC #0 intpin 19 -> irq 2
Jun 22 13:09:53 fifty /kernel: IOAPIC #0 intpin 17 -> irq 16
Jun 22 13:09:53 fifty /kernel: pci1: <PCI bus> on pcib16
Jun 22 13:09:53 fifty /kernel: pci1: <OHCI USB controller> at 0.0 irq 2
Jun 22 13:09:53 fifty /kernel: pci1: <OHCI USB controller> at 0.1 irq 2
Jun 22 13:09:53 fifty /kernel: pci1: <Trident model 9880 VGA-compatible display device> at 5.0 irq 16
Jun 22 13:09:53 fifty /kernel: isab0: <PCI to ISA bridge (vendor=1022 device=7468)> at device 7.0 on pci0
Jun 22 13:09:53 fifty /kernel: isa0: <ISA bus> on isab0
Jun 22 13:09:54 fifty /kernel: atapci0: <AMD 8111 ATA133 controller> port 0x1000-0x100f at device 7.1 on pci0
Jun 22 13:09:54 fifty /kernel: ata0: at 0x1f0 irq 14 on atapci0
Jun 22 13:09:54 fifty /kernel: ata1: at 0x170 irq 15 on atapci0
Jun 22 13:09:54 fifty /kernel: chip0: <PCI to Other bridge (vendor=1022 device=746b)> at device 7.3 on pci0
Jun 22 13:09:54 fifty /kernel: pcib17: <PCI to PCI bridge (vendor=1022 device=7450)> at device 10.0 on pci0
Jun 22 13:09:54 fifty /kernel: IOAPIC #1 intpin 1 -> irq 17
Jun 22 13:09:54 fifty /kernel: IOAPIC #1 intpin 2 -> irq 18
Jun 22 13:09:54 fifty /kernel: IOAPIC #1 intpin 3 -> irq 19
Jun 22 13:09:54 fifty /kernel: pci2: <PCI bus> on pcib17
Jun 22 13:09:54 fifty /kernel: bge0: <Broadcom BCM5703 Gigabit Ethernet, ASIC rev. 0x1002> mem 0xe5800000-0xe580ffff irq 17 at device 2.0 on pci2
Jun 22 13:09:54 fifty /kernel: bge0: Ethernet address: 00:09:3d:00:d4:e1
Jun 22 13:09:54 fifty /kernel: miibus0: <MII bus> on bge0
Jun 22 13:09:54 fifty /kernel: brgphy0: <BCM5703 10/100/1000baseTX PHY> on miibus0
Jun 22 13:09:54 fifty /kernel: brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
Jun 22 13:09:54 fifty /kernel: bge1: <Broadcom BCM5703 Gigabit Ethernet, ASIC rev. 0x1002> mem 0xe5810000-0xe581ffff irq 18 at device 3.0 on pci2
Jun 22 13:09:54 fifty /kernel: bge1: Ethernet address: 00:09:3d:00:d4:e2
Jun 22 13:09:54 fifty /kernel: miibus1: <MII bus> on bge1
Jun 22 13:09:54 fifty /kernel: brgphy1: <BCM5703 10/100/1000baseTX PHY> on miibus1
Jun 22 13:09:54 fifty /kernel: brgphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
Jun 22 13:09:54 fifty /kernel: mpt0: <LSILogic 1030 Ultra4 Adapter> port 0x2000-0x20ff mem 0xe5820000-0xe582ffff,0xe5830000-0xe583ffff irq 19 at device 4.0 on pci2
Jun 22 13:09:54 fifty /kernel: pcib18: <PCI to PCI bridge (vendor=1014 device=01a7)> at device 5.0 on pci2
Jun 22 13:09:54 fifty /kernel: IOAPIC #1 intpin 0 -> irq 20
Jun 22 13:09:54 fifty /kernel: pci3: <PCI bus> on pcib18
Jun 22 13:09:54 fifty /kernel: amr0: <LSILogic MegaRAID> mem 0xe5900000-0xe597ffff,0xe5c00000-0xe5c0ffff irq 20 at device 0.0 on pci3
Jun 22 13:09:54 fifty /kernel: amr0: <LSILogic MegaRAID SCSI 320-2X> Firmware 413G, BIOS H414, 128MB RAM
Jun 22 13:09:54 fifty /kernel: pci0: <unknown card> (vendor=0x1022, dev=0x7451) at 10.1
Jun 22 13:09:54 fifty /kernel: pcib19: <PCI to PCI bridge (vendor=1022 device=7450)> at device 11.0 on pci0
Jun 22 13:09:54 fifty /kernel: pci4: <PCI bus> on pcib19
Jun 22 13:09:54 fifty /kernel: pci0: <unknown card> (vendor=0x1022, dev=0x7451) at 11.1
Jun 22 13:09:54 fifty /kernel: pcib1: <Host to PCI bridge> on motherboard
Jun 22 13:09:54 fifty /kernel: pci5: <PCI bus> on pcib1
Jun 22 13:09:54 fifty /kernel: pcib2: <Host to PCI bridge> on motherboard
Jun 22 13:09:54 fifty /kernel: pci6: <PCI bus> on pcib2
Jun 22 13:09:54 fifty /kernel: pcib3: <Host to PCI bridge> on motherboard
Jun 22 13:09:54 fifty /kernel: pci7: <PCI bus> on pcib3
Jun 22 13:09:54 fifty /kernel: pcib4: <Host to PCI bridge> on motherboard
Jun 22 13:09:54 fifty /kernel: pci8: <PCI bus> on pcib4
Jun 22 13:09:54 fifty /kernel: pcib5: <Host to PCI bridge> on motherboard
Jun 22 13:09:54 fifty /kernel: pci9: <PCI bus> on pcib5
Jun 22 13:09:54 fifty /kernel: pcib6: <Host to PCI bridge> on motherboard
Jun 22 13:09:54 fifty /kernel: pci10: <PCI bus> on pcib6
Jun 22 13:09:54 fifty /kernel: pcib7: <Host to PCI bridge> on motherboard
Jun 22 13:09:54 fifty /kernel: pci11: <PCI bus> on pcib7
Jun 22 13:09:54 fifty /kernel: pcib8: <Host to PCI bridge> on motherboard
Jun 22 13:09:54 fifty /kernel: pci12: <PCI bus> on pcib8
Jun 22 13:09:54 fifty /kernel: pcib9: <Host to PCI bridge> on motherboard
Jun 22 13:09:54 fifty /kernel: pci13: <PCI bus> on pcib9
Jun 22 13:09:54 fifty /kernel: pcib10: <Host to PCI bridge> on motherboard
Jun 22 13:09:54 fifty /kernel: pci14: <PCI bus> on pcib10
Jun 22 13:09:54 fifty /kernel: pcib11: <Host to PCI bridge> on motherboard
Jun 22 13:09:54 fifty /kernel: pci15: <PCI bus> on pcib11
Jun 22 13:09:54 fifty /kernel: pcib12: <Host to PCI bridge> on motherboard
Jun 22 13:09:54 fifty /kernel: pci16: <PCI bus> on pcib12
Jun 22 13:09:54 fifty /kernel: pcib13: <Host to PCI bridge> on motherboard
Jun 22 13:09:54 fifty /kernel: pci17: <PCI bus> on pcib13
Jun 22 13:09:54 fifty /kernel: pcib14: <Host to PCI bridge> on motherboard
Jun 22 13:09:54 fifty /kernel: pci18: <PCI bus> on pcib14
Jun 22 13:09:54 fifty /kernel: pcib15: <Host to PCI bridge> on motherboard
Jun 22 13:09:54 fifty /kernel: pci19: <PCI bus> on pcib15
Jun 22 13:09:54 fifty /kernel: orm0: <Option ROMs> at iomem 0xc0000-0xc7fff,0xc8000-0xc97ff,0xc9800-0xcafff,0xcb000-0xcb7ff on isa0
Jun 22 13:09:54 fifty /kernel: pmtimer0 on isa0
Jun 22 13:09:54 fifty /kernel: atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
Jun 22 13:09:54 fifty /kernel: atkbd0: <AT Keyboard> flags 0x1 irq 1 on atkbdc0
Jun 22 13:09:54 fifty /kernel: kbd0 at atkbd0
Jun 22 13:09:54 fifty /kernel: vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Jun 22 13:09:54 fifty /kernel: sc0: <System console> at flags 0x100 on isa0
Jun 22 13:09:54 fifty /kernel: sc0: VGA <16 virtual consoles, flags=0x300>
Jun 22 13:09:54 fifty /kernel: sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
Jun 22 13:09:54 fifty /kernel: sio0: type 16550A
Jun 22 13:09:54 fifty /kernel: sio1 at port 0x2f8-0x2ff irq 3 on isa0
Jun 22 13:09:54 fifty /kernel: sio1: type 16550A
Jun 22 13:09:54 fifty /kernel: ppc0: parallel port not found.
Jun 22 13:09:54 fifty /kernel: APIC_IO: Testing 8254 interrupt delivery
Jun 22 13:09:54 fifty /kernel: APIC_IO: routing 8254 via IOAPIC #0 intpin 2
Jun 22 13:09:54 fifty /kernel: ipfw2 initialized, divert disabled, rule-based forwarding enabled, default to deny, logging unlimited
Jun 22 13:09:54 fifty /kernel: SMP: AP CPU #1 Launched!
Jun 22 13:09:54 fifty /kernel: SMP: AP CPU #3 Launched!
Jun 22 13:09:54 fifty /kernel: SMP: AP CPU #2 Launched!
Jun 22 13:09:54 fifty /kernel: acd0: DVD-ROM <DV-28E-C> at ata1-master PIO4
Jun 22 13:09:54 fifty /kernel: Waiting 15 seconds for SCSI devices to settle
Jun 22 13:09:54 fifty /kernel: amrd0: <LSILogic MegaRAID logical drive> on amr0
Jun 22 13:09:54 fifty /kernel: amrd0: 140006MB (286732288 sectors) RAID 1 (optimal)
Jun 22 13:09:54 fifty /kernel: pass0 at amr0 bus 0 target 6 lun 0
Jun 22 13:09:54 fifty /kernel: pass0: <SDR GEM318P 1> Fixed Processor SCSI-2 device
Jun 22 13:09:54 fifty /kernel: Mounting root from ufs:/dev/amrd0s2a
Jun 22 13:09:54 fifty /kernel: WARNING: / was not properly dismounted
Jun 22 13:11:17 fifty /kernel: bge0: gigabit link up
Jun 22 14:00:01 fifty newsyslog[62771]: logfile turned over due to size>1K
Jun 22 14:00:12 fifty mxsyslog[60040] logfile enrolled as /server/ftp/log/kernel.4


>How-To-Repeat:
	Complex; see the description above.
>Fix:

	


>Release-Note:
>Audit-Trail:
>Unformatted:


