From owner-freebsd-bugs@FreeBSD.ORG Fri Jul 1 02:00:35 2005
Return-Path:
X-Original-To: freebsd-bugs@hub.freebsd.org
Delivered-To: freebsd-bugs@hub.freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
        by hub.freebsd.org (Postfix) with ESMTP id 7797C16A41C
        for ; Fri, 1 Jul 2005 02:00:35 +0000 (GMT)
        (envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org [216.136.204.21])
        by mx1.FreeBSD.org (Postfix) with ESMTP id 3F15A43D49
        for ; Fri, 1 Jul 2005 02:00:35 +0000 (GMT)
        (envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1])
        by freefall.freebsd.org (8.13.3/8.13.3) with ESMTP id j6120ZTw077892
        for ; Fri, 1 Jul 2005 02:00:35 GMT
        (envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
        by freefall.freebsd.org (8.13.3/8.13.1/Submit) id j6120Z6X077891;
        Fri, 1 Jul 2005 02:00:35 GMT
        (envelope-from gnats)
Resent-Date: Fri, 1 Jul 2005 02:00:35 GMT
Resent-Message-Id: <200507010200.j6120Z6X077891@freefall.freebsd.org>
Resent-From: FreeBSD-gnats-submit@FreeBSD.org (GNATS Filer)
Resent-To: freebsd-bugs@FreeBSD.org
Resent-Reply-To: FreeBSD-gnats-submit@FreeBSD.org, Chris Gabe
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
        by hub.freebsd.org (Postfix) with ESMTP id 9C2E316A41C
        for ; Fri, 1 Jul 2005 01:51:24 +0000 (GMT)
        (envelope-from chris@borderware.com)
Received: from mail.borderware.com (mail.borderware.com [207.236.65.231])
        by mx1.FreeBSD.org (Postfix) with ESMTP id 0390043D49
        for ; Fri, 1 Jul 2005 01:51:23 +0000 (GMT)
        (envelope-from chris@borderware.com)
Message-Id: <20050701015122.D587BA9B6@santana.borderware.com>
Date: Thu, 30 Jun 2005 21:51:22 -0400 (EDT)
From: Chris Gabe
To: FreeBSD-gnats-submit@FreeBSD.org
X-Send-Pr-Version: 3.113
Cc:
Subject: kern/82846: Kernel crash in 5.4 with SMP,PAE
X-BeenThere: freebsd-bugs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: Chris Gabe
List-Id: Bug reports
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 01 Jul 2005 02:00:35 -0000

>Number:         82846
>Category:       kern
>Synopsis:       Kernel crash in 5.4 with SMP,PAE
>Confidential:   no
>Severity:       critical
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Fri Jul 01 02:00:34 GMT 2005
>Closed-Date:
>Last-Modified:
>Originator:     Chris Gabe
>Release:        FreeBSD 5.4 i386
>Organization:
Borderware
>Environment:
System: FreeBSD santana.borderware.com 4.7-RELEASE-p20 FreeBSD 4.7-RELEASE-p20 #1: Fri Sep 26 13:30:29 EDT 2003 root@santana.borderware.com:/usr/obj/usr/src/sys/SANTANA i386
>Description:
Hello,

I've got a kernel crash on a Sun V40Z quad-CPU machine running FreeBSD 5.4 with SMP, PAE (and kernel debugging) and 8 GB of RAM. It happens every few hours. The system is not using a lot of memory at that time, but the crash usually comes after accessing over 4 GB of files, in separate chunks that total only a few MB of user memory.

I've hand-transcribed the kernel trace below. I haven't got the dmesg right now, but a kernel log file from a 4.10 build we previously ran on the same hardware shows the basic idea. The machine has an LSI RAID controller with mirrored/striped SCSI hard drives.

We're just wondering what direction to head with this. Any advice? For example: add more debugging, get a full crash dump, submit it to something or someone, change a kernel config option, or sync to a driver that has a fix for this (that would be a good one).
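When working from the hand-transcribed trace in this report, it may help to turn the frames into structured data so the call chain can be read outermost caller first. What follows is only a minimal sketch: the names FRAME_RE, parse_frame and TRACE are hypothetical helpers made up for illustration, the frame text is copied from this report, and it assumes the trailing numbers are hexadecimal offsets, as a DDB trace would normally print them.

```python
import re

# One frame per line: a function name, an optional argument list in
# parentheses, and an optional offset written as "+ 0x421", "+ 1f" or "64".
# Assumption: offsets are hexadecimal, whether or not they carry "0x".
FRAME_RE = re.compile(
    r"^(\w+)(?:\([^)]*\))?(?:\s+\+?\s*(?:0x)?([0-9a-fA-F]+))?$"
)


def parse_frame(line):
    """Return (function_name, offset or None) for one transcribed frame."""
    match = FRAME_RE.match(line.strip())
    if match is None:
        raise ValueError("unrecognised frame: %r" % line)
    name, offset = match.groups()
    return name, (int(offset, 16) if offset is not None else None)


# Frames as transcribed in this report, innermost first.
TRACE = """\
kdb_enter
panic
lockmgr(ca71ce14,6,ca71cd68,0,f0147a1c) + 0x421
vop_stdunlock(<5 addresses>) + 1f
vop_defaultop(<4 addresses>,1000) + 13
spec_vnoperate(didn't transcribe any more) + 13
spec_write 64
spec_vnoperate 13
vnode_pager_generic_putpages 224
vop_stdputpages 1a
vop_defaultop 13
spec_vnoperate 13
vnode_pager_putpages 8a
vm_pageout_flush cb
vm_pageout_clean 2a1
vm_pageout_scan 706
vm_pageout 312
fork_exit 75
fork_trampoline 8
"""

frames = [parse_frame(line) for line in TRACE.splitlines() if line.strip()]


def main():
    # Print the chain from the outermost caller down to the panic.
    for name, offset in reversed(frames):
        if offset is None:
            print(name)
        else:
            print("%s+0x%x" % (name, offset))


if __name__ == "__main__":
    main()
```

Read in that order, the chain starts in vm_pageout, goes through the vnode pager's putpages path into spec_write, and panics in lockmgr during vop_stdunlock.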
hand transcribed kernel trace:

kdb_enter
panic
lockmgr(ca71ce14,6,ca71cd68,0,f0147a1c) + 0x421
vop_stdunlock(<5 addresses>) + 1f
vop_defaultop(<4 addresses>,1000) + 13
spec_vnoperate(didn't transcribe any more) + 13
spec_write 64
spec_vnoperate 13
vnode_pager_generic_putpages 224
vop_stdputpages 1a
vop_defaultop 13
spec_vnoperate 13
vnode_pager_putpages 8a
vm_pageout_flush cb
vm_pageout_clean 2a1
vm_pageout_scan 706
vm_pageout 312
fork_exit 75
fork_trampoline 8
trap 0x1 eip=0, esp = 0xf0147d7c, ebp = 0

The kernel boot log file from 4.10 (sorry, I could get 5.4 dmesg but not until end of next week):
devices amr, mpt perhaps of extra relevance(?)

Jun 22 13:00:00 fifty newsyslog[11157]: logfile turned over due to size>1K
Jun 22 13:09:53 fifty /kernel: Copyright 1998-2004 BorderWare Technologies Inc. All rights reserved.
Jun 22 13:09:53 fifty /kernel: Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
Jun 22 13:09:53 fifty /kernel: The Regents of the University of California. All rights reserved.
Jun 22 13:09:53 fifty /kernel: S-CORE 8.00 #14: Mon Jun 13 09:27:22 EDT 2005
Jun 22 13:09:53 fifty /kernel: support@borderware.com:/sys/compile/S-CORE_SMP
Jun 22 13:09:53 fifty /kernel: Timecounter "i8254" frequency 1193182 Hz
Jun 22 13:09:53 fifty /kernel: CPU: AMD Opteron(tm) Processor 850 (2391.27-MHz 686-class CPU)
Jun 22 13:09:53 fifty /kernel: Origin = "AuthenticAMD" Id = 0xf5a Stepping = 10
Jun 22 13:09:53 fifty /kernel: Features=0x78bfbff
Jun 22 13:09:53 fifty /kernel: AMD Features=0xe0500000<,AMIE,,DSP,3DNow!>
Jun 22 13:09:53 fifty /kernel: real memory = 3824615424 (3734976K bytes)
Jun 22 13:09:53 fifty /kernel: avail memory = 3724136448 (3636852K bytes)
Jun 22 13:09:53 fifty /kernel: Programming 24 pins in IOAPIC #0
Jun 22 13:09:53 fifty /kernel: IOAPIC #0 intpin 2 -> irq 0
Jun 22 13:09:53 fifty /kernel: Programming 4 pins in IOAPIC #1
Jun 22 13:09:53 fifty /kernel: Programming 4 pins in IOAPIC #2
Jun 22 13:09:53 fifty /kernel: Programming 4 pins in IOAPIC #3
Jun 22 13:09:53 fifty /kernel: Programming 4 pins in IOAPIC #4
Jun 22 13:09:53 fifty /kernel: Programming 4 pins in IOAPIC #5
Jun 22 13:09:53 fifty /kernel: Programming 4 pins in IOAPIC #6
Jun 22 13:09:53 fifty /kernel: FreeBSD/SMP: Multiprocessor motherboard: 4 CPUs
Jun 22 13:09:53 fifty /kernel: cpu0 (BSP): apic id: 0, version: 0x00040010, at 0xfee00000
Jun 22 13:09:53 fifty /kernel: cpu1 (AP): apic id: 1, version: 0x00040010, at 0xfee00000
Jun 22 13:09:53 fifty /kernel: cpu2 (AP): apic id: 2, version: 0x00040010, at 0xfee00000
Jun 22 13:09:53 fifty /kernel: cpu3 (AP): apic id: 3, version: 0x00040010, at 0xfee00000
Jun 22 13:09:53 fifty /kernel: io0 (APIC): apic id: 4, version: 0x00170011, at 0xfec00000
Jun 22 13:09:53 fifty /kernel: io1 (APIC): apic id: 5, version: 0x00030011, at 0xe4000000
Jun 22 13:09:53 fifty /kernel: io2 (APIC): apic id: 6, version: 0x00030011, at 0xe4001000
Jun 22 13:09:53 fifty /kernel: io3 (APIC): apic id: 7, version: 0x00030011, at 0xe5d01000
Jun 22 13:09:53 fifty /kernel: io4 (APIC): apic id: 8, version: 0x00030011, at 0xe5d03000
Jun 22 13:09:53 fifty /kernel: io5 (APIC): apic id: 9, version: 0x00030011, at 0xe5d05000
Jun 22 13:09:53 fifty /kernel: io6 (APIC): apic id: 10, version: 0x00030011, at 0xe5d07000
Jun 22 13:09:53 fifty /kernel: Preloaded elf kernel "kernel" at 0xc0455000.
Jun 22 13:09:53 fifty /kernel: Preloaded elf module "splash_bmp.ko" at 0xc045509c.
Jun 22 13:09:53 fifty /kernel: Preloaded splash_image_data "/boot/splash.bmp" at 0xc0455140.
Jun 22 13:09:53 fifty /kernel: Pentium Pro MTRR support enabled
Jun 22 13:09:53 fifty /kernel: md0: Malloc disk
Jun 22 13:09:53 fifty /kernel: Using $PIR table, 24 entries at 0xc00fde40
Jun 22 13:09:53 fifty /kernel: npx0: on motherboard
Jun 22 13:09:53 fifty /kernel: npx0: INT 16 interface
Jun 22 13:09:53 fifty /kernel: pcib0: on motherboard
Jun 22 13:09:53 fifty /kernel: pci0: on pcib0
Jun 22 13:09:53 fifty /kernel: pcib16: at device 6.0 on pci0
Jun 22 13:09:53 fifty /kernel: IOAPIC #0 intpin 19 -> irq 2
Jun 22 13:09:53 fifty /kernel: IOAPIC #0 intpin 17 -> irq 16
Jun 22 13:09:53 fifty /kernel: pci1: on pcib16
Jun 22 13:09:53 fifty /kernel: pci1: at 0.0 irq 2
Jun 22 13:09:53 fifty /kernel: pci1: at 0.1 irq 2
Jun 22 13:09:53 fifty /kernel: pci1: at 5.0 irq 16
Jun 22 13:09:53 fifty /kernel: isab0: at device 7.0 on pci0
Jun 22 13:09:53 fifty /kernel: isa0: on isab0
Jun 22 13:09:54 fifty /kernel: atapci0: port 0x1000-0x100f at device 7.1 on pci0
Jun 22 13:09:54 fifty /kernel: ata0: at 0x1f0 irq 14 on atapci0
Jun 22 13:09:54 fifty /kernel: ata1: at 0x170 irq 15 on atapci0
Jun 22 13:09:54 fifty /kernel: chip0: at device 7.3 on pci0
Jun 22 13:09:54 fifty /kernel: pcib17: at device 10.0 on pci0
Jun 22 13:09:54 fifty /kernel: IOAPIC #1 intpin 1 -> irq 17
Jun 22 13:09:54 fifty /kernel: IOAPIC #1 intpin 2 -> irq 18
Jun 22 13:09:54 fifty /kernel: IOAPIC #1 intpin 3 -> irq 19
Jun 22 13:09:54 fifty /kernel: pci2: on pcib17
Jun 22 13:09:54 fifty /kernel: bge0: mem 0xe5800000-0xe580ffff irq 17 at device 2.0 on pci2
Jun 22 13:09:54 fifty /kernel: bge0: Ethernet address: 00:09:3d:00:d4:e1
Jun 22 13:09:54 fifty /kernel: miibus0: on bge0
Jun 22 13:09:54 fifty /kernel: brgphy0: on miibus0
Jun 22 13:09:54 fifty /kernel: brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
Jun 22 13:09:54 fifty /kernel: bge1: mem 0xe5810000-0xe581ffff irq 18 at device 3.0 on pci2
Jun 22 13:09:54 fifty /kernel: bge1: Ethernet address: 00:09:3d:00:d4:e2
Jun 22 13:09:54 fifty /kernel: miibus1: on bge1
Jun 22 13:09:54 fifty /kernel: brgphy1: on miibus1
Jun 22 13:09:54 fifty /kernel: brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
Jun 22 13:09:54 fifty /kernel: mpt0: port 0x2000-0x20ff mem 0xe5820000-0xe582ffff,0xe5830000-0xe583ffff irq 19 at device 4.0 on pci2
Jun 22 13:09:54 fifty /kernel: pcib18: at device 5.0 on pci2
Jun 22 13:09:54 fifty /kernel: IOAPIC #1 intpin 0 -> irq 20
Jun 22 13:09:54 fifty /kernel: pci3: on pcib18
Jun 22 13:09:54 fifty /kernel: amr0: mem 0xe5900000-0xe597ffff,0xe5c00000-0xe5c0ffff irq 20 at device 0.0 on pci3
Jun 22 13:09:54 fifty /kernel: amr0: Firmware 413G, BIOS H414, 128MB RAM
Jun 22 13:09:54 fifty /kernel: pci0: (vendor=0x1022, dev=0x7451) at 10.1
Jun 22 13:09:54 fifty /kernel: pcib19: at device 11.0 on pci0
Jun 22 13:09:54 fifty /kernel: pci4: on pcib19
Jun 22 13:09:54 fifty /kernel: pci0: (vendor=0x1022, dev=0x7451) at 11.1
Jun 22 13:09:54 fifty /kernel: pcib1: on motherboard
Jun 22 13:09:54 fifty /kernel: pci5: on pcib1
Jun 22 13:09:54 fifty /kernel: pcib2: on motherboard
Jun 22 13:09:54 fifty /kernel: pci6: on pcib2
Jun 22 13:09:54 fifty /kernel: pcib3: on motherboard
Jun 22 13:09:54 fifty /kernel: pci7: on pcib3
Jun 22 13:09:54 fifty /kernel: pcib4: on motherboard
Jun 22 13:09:54 fifty /kernel: pci8: on pcib4
Jun 22 13:09:54 fifty /kernel: pcib5: on motherboard
Jun 22 13:09:54 fifty /kernel: pci9: on pcib5
Jun 22 13:09:54 fifty /kernel: pcib6: on motherboard
Jun 22 13:09:54 fifty /kernel: pci10: on pcib6
Jun 22 13:09:54 fifty /kernel: pcib7: on motherboard
Jun 22 13:09:54 fifty /kernel: pci11: on pcib7
Jun 22 13:09:54 fifty /kernel: pcib8: on motherboard
Jun 22 13:09:54 fifty /kernel: pci12: on pcib8
Jun 22 13:09:54 fifty /kernel: pcib9: on motherboard
Jun 22 13:09:54 fifty /kernel: pci13: on pcib9
Jun 22 13:09:54 fifty /kernel: pcib10: on motherboard
Jun 22 13:09:54 fifty /kernel: pci14: on pcib10
Jun 22 13:09:54 fifty /kernel: pcib11: on motherboard
Jun 22 13:09:54 fifty /kernel: pci15: on pcib11
Jun 22 13:09:54 fifty /kernel: pcib12: on motherboard
Jun 22 13:09:54 fifty /kernel: pci16: on pcib12
Jun 22 13:09:54 fifty /kernel: pcib13: on motherboard
Jun 22 13:09:54 fifty /kernel: pci17: on pcib13
Jun 22 13:09:54 fifty /kernel: pcib14: on motherboard
Jun 22 13:09:54 fifty /kernel: pci18: on pcib14
Jun 22 13:09:54 fifty /kernel: pcib15: on motherboard
Jun 22 13:09:54 fifty /kernel: pci19: on pcib15
Jun 22 13:09:54 fifty /kernel: orm0: