From: Brooks Davis
To: current@freebsd.org
Date: Thu, 16 Oct 2003 15:58:52 -0700
Subject: reliable panics in arstrategy
Message-ID: <20031016225852.GA5253@Odin.AC.HMC.Edu>
List-Id: Discussions about the use of FreeBSD-current

I've got four dual Xeon servers that I can reliably panic under disk
load. All of them have Promise ATA RAID controllers running in RAID1
mode. They consistently panic in arstrategy if I run something like a
CVS checkout of ports. The panic message and ddb backtrace are below,
as is the dmesg. The kernel is the SMP kernel.

I've tried to obtain a crash dump, but "call dumpsys" just dumps me
right back into the same panic so I'm hoping this is something you can
reproduce. Please let me know if you need more information or if I need
to try and figure out some way to run gdb on these boxes.

Thanks,
Brooks

Fatal trap 12: page fault while in kernel mode
cpuid = 0; lapic.id = 00000000
fault virtual address = 0xa6ea70f4
fault code = supervisor write, page not present
instruction pointer = 0x8:0xc04dc3b9
stack pointer = 0x10:0xe0469bc4
frame pointer = 0x10:0xe0469c54
code segment = base 0x0, limit 0xfffff, type 0x1b
 = DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 4 (g_down)
kernel: type 12 trap, code=0
Stopped at arstrategy+0x939: movl %eax,0x24(%ebx,%ecx,8)
db> where
arstrategy(c7480750,0,c084365f,5c,0) at arstrategy+0x939
g_disk_start(c7470a20,0,c0843c0f,164,a) at g_disk_start+0x1a6
g_io_schedule_down(c29b1000,2,c0843e31,6e,c06036b0) at g_io_schedule_down+0x1ac
g_down_procbody(0,e0469d48,c0845bd4,314,ffffffff) at g_down_procbody+0x48
fork_exit(c06036b0,0,e0469d48) at fork_exit+0xcf
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xe0469d7c, ebp = 0 ---
db>

Copyright (c) 1992-2003 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
 The Regents of the University of California. All rights reserved.
FreeBSD 5.1-CURRENT #2: Wed Oct 15 05:44:42 PDT 2003
 root@nbboard.aero.org:/usr/obj/usr/src/sys/SMP
Preloaded elf kernel "/boot/kernel/kernel" at 0xc0a77000.
Preloaded elf module "/boot/kernel/acpi.ko" at 0xc0a770a8.
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Xeon(TM) CPU 2.40GHz (2392.95-MHz 686-class CPU)
 Origin = "GenuineIntel" Id = 0xf27 Stepping = 7
 Features=0xbfebfbff
 Hyperthreading: 2 logical CPUs
real memory = 1073676288 (1023 MB)
avail memory = 1033580544 (985 MB)
Programming 24 pins in IOAPIC #0
IOAPIC #0 intpin 2 -> irq 0
Programming 24 pins in IOAPIC #1
Programming 24 pins in IOAPIC #2
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
 cpu0 (BSP): apic id: 0, version: 0x00050014, at 0xfee00000
 cpu1 (AP): apic id: 1, version: 0x00050014, at 0xfee00000
 cpu2 (AP): apic id: 6, version: 0x00050014, at 0xfee00000
 cpu3 (AP): apic id: 7, version: 0x00050014, at 0xfee00000
 io0 (APIC): apic id: 8, version: 0x00178020, at 0xfec00000
 io1 (APIC): apic id: 9, version: 0x00178020, at 0xfec81000
 io2 (APIC): apic id: 10, version: 0x00178020, at 0xfec81400
Pentium Pro MTRR support enabled
ACPI-0660: *** Warning: Type override - [DEB_] had invalid type (Integer) for Scope operator, changed to (Scope)
ACPI-0660: *** Warning: Type override - [MLIB] had invalid type (Integer) for Scope operator, changed to (Scope)
ACPI-0660: *** Warning: Type override - [DATA] had invalid type (String) for Scope operator, changed to (Scope)
ACPI-0660: *** Warning: Type override - [SIO_] had invalid type (String) for Scope operator, changed to (Scope)
ACPI-0660: *** Warning: Type override - [LEDP] had invalid type (String) for Scope operator, changed to (Scope)
ACPI-0660: *** Warning: Type override - [GPEN] had invalid type (String) for Scope operator, changed to (Scope)
ACPI-0660: *** Warning: Type override - [GPST] had invalid type (String) for Scope operator, changed to (Scope)
ACPI-0660: *** Warning: Type override - [WUES] had invalid type (String) for Scope operator, changed to (Scope)
ACPI-0660: *** Warning: Type override - [WUSE] had invalid type (String) for Scope operator, changed to (Scope)
ACPI-0660: *** Warning: Type override - [SBID] had invalid type (String) for Scope operator, changed to (Scope)
ACPI-0660: *** Warning: Type override - [SWCE] had invalid type (String) for Scope operator, changed to (Scope)
npx0: on motherboard
npx0: INT 16 interface
acpi0: on motherboard
ACPI-1287: *** Error: Method execution failed [\\_SB_.PCI0.SBRG.EC0_._REG] (Node 0xc6993c20), AE_NOT_EXIST
acpi0: Could not initialise SystemIO handler: AE_NOT_EXIST
device_probe_and_attach: acpi0 attach returned 6
pcibios: BIOS version 2.10
Using $PIR table, 19 entries at 0xc00f3060
pcib0: at pcibus 0 on motherboard
pci0: on pcib0
IOAPIC #0 intpin 16 -> irq 2
IOAPIC #0 intpin 19 -> irq 16
pci0: at device 0.1 (no driver attached)
pcib1: at device 3.0 on pci0
pci2: on pcib1
pci2: at device 28.0 (no driver attached)
pcib2: at device 29.0 on pci2
pci4: on pcib2
pci2: at device 30.0 (no driver attached)
pcib3: at device 31.0 on pci2
pci3: on pcib3
IOAPIC #1 intpin 6 -> irq 18
IOAPIC #1 intpin 7 -> irq 19
em0: port 0x2040-0x207f mem 0xfeac0000-0xfeadffff irq 18 at device 7.0 on pci3
em0: Speed:N/A Duplex:N/A
em1: port 0x2000-0x203f mem 0xfeae0000-0xfeafffff irq 19 at device 7.1 on pci3
em1: Speed:N/A Duplex:N/A
pci0: at device 3.1 (no driver attached)
uhci0: port 0x3020-0x303f irq 2 at device 29.0 on pci0
usb0: on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1: port 0x3000-0x301f irq 16 at device 29.1 on pci0
usb1: on uhci1
usb1: USB revision 1.0
uhub1: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
pcib4: at device 30.0 on pci0
pci1: on pcib4
IOAPIC #0 intpin 18 -> irq 20
atapci0: port 0x1420-0x142f,0x140c-0x140f,0x1410-0x1417,0x1408-0x140b,0x1400-0x1407 mem 0xfe6e0000-0xfe6e3fff irq 20 at device 2.0 on pci1
atapci0: [MPSAFE]
ata2: at 0x1400 on atapci0
ata2: [MPSAFE]
ata3: at 0x1410 on atapci0
ata3: [MPSAFE]
pci1: at device 12.0 (no driver attached)
isab0: at device 31.0 on pci0
isa0: on isab0
atapci1: port 0x3a0-0x3af,0-0x3,0-0x7,0-0x3,0-0x7 irq 0 at device 31.1 on pci0
ata0: at 0x1f0 irq 14 on atapci1
ata0: [MPSAFE]
ata1: at 0x170 irq 15 on atapci1
ata1: [MPSAFE]
pci0: at device 31.3 (no driver attached)
orm0: