Date: Tue, 14 Oct 2003 23:37:57 -0400
From: David Sze <dsze@alumni.uwaterloo.ca>
To: freebsd-scsi@freebsd.org
Subject: Dell PowerEdge 1750 and mpt
Message-ID: <6.0.0.22.2.20031014232154.03a0b990@mail.distrust.net>
Hi,

I have 3 identically configured Dell PowerEdge 1750 servers (2 x 3.0GHz Xeon, 4GB RAM, 3 x 36GB U320 SCSI, LSI MPT/Fusion SCSI controller). They are running FreeBSD 4.8-RELEASE-p10, with the bge driver from RELENG_4 ported back. HyperThreading is enabled, and HTT is compiled into the kernel.

A custom application is running on these servers that performs many small random reads and writes.

All three servers spontaneously panic and reboot at random intervals, after an uptime ranging anywhere from tens of minutes to a couple of days.

The crashdump and backtrace are shown below - it appears that the problem is somewhere in the mpt driver, but I have no experience in this area, so any help that can be provided would be much appreciated.

Output from dmesg follows the crashdump. If more information is required, let me know and I will provide it.

/usr/src/sys/compile/KERNEL># gdb -k kernel.debug -c /var/crash/vmcore.0
GNU gdb 4.18 (FreeBSD)
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-unknown-freebsd"...Deprecated bfd_read called at /usr/src/gnu/usr.bin/binutils/gdb/../../../../contrib/gdb/gdb/dbxread.c line 2627 in elfstab_build_psymtabs
Deprecated bfd_read called at /usr/src/gnu/usr.bin/binutils/gdb/../../../../contrib/gdb/gdb/dbxread.c line 933 in fill_symbuf
SMP 2 cpus
IdlePTD at phsyical address 0x00349000
initial pcb at physical address 0x002bb7c0
panicstr: page fault
panic messages:
---
Fatal trap 12: page fault while in kernel mode
mp_lock = 01000002; cpuid = 1; lapic.id = 06000000
fault virtual address = 0x8
fault code = supervisor read, page not present
instruction pointer = 0x8:0x80171388
stack pointer = 0x10:0xdb3ebc7c
frame pointer = 0x10:0xdb3ebc90
code segment = base 0x0, limit 0xfffff, type 0x1b
 = DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 184 (sproxy)
interrupt mask = cam <- SMP: XXX
trap number = 12
panic: page fault
mp_lock = 01000002; cpuid = 1; lapic.id = 06000000
boot() called on cpu#1

syncing disks...
1023 502 68 3 3 3 3 3 3 3 21 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
giving up on 3 buffers
Uptime: 1h31m19s

dumping to dev #da/0x20009, offset 133248
---
#0 dumpsys () at ../../kern/kern_shutdown.c:487
487 if (dumping++) {
(kgdb) bt
#0 dumpsys () at ../../kern/kern_shutdown.c:487
#1 0x8018c33b in boot (howto=256) at ../../kern/kern_shutdown.c:316
#2 0x8018c794 in poweroff_wait (junk=0x8028bef9, howto=-2144814641)
    at ../../kern/kern_shutdown.c:595
#3 0x8024a63c in trap_fatal (frame=0xdb3ebc3c, eva=8)
    at ../../i386/i386/trap.c:974
#4 0x8024a2cd in trap_pfault (frame=0xdb3ebc3c, usermode=0, eva=8)
    at ../../i386/i386/trap.c:867
#5 0x80249e6f in trap (frame={tf_fs = 24, tf_es = -1834221552,
    tf_ds = -1841823728, tf_edi = -1776680960, tf_esi = -1776680960,
    tf_ebp = -616645488, tf_isp = -616645528, tf_ebx = 0,
    tf_edx = 1811947648, tf_ecx = 0, tf_eax = 0, tf_trapno = 12,
    tf_err = 0, tf_eip = -2145971320, tf_cs = 8, tf_eflags = 66118,
    tf_esp = -1841812480, tf_ss = -1841812480})
    at ../../i386/i386/trap.c:466
#6 0x80171388 in mpt_read_cfg_page (mpt=0x92382c00, PageAddress=0,
    hdr=0xdb3ebcc4) at ../../dev/mpt/mpt.c:576
#7 0x80174507 in mpt_action (sim=0x923867c0, ccb=0x961a0000)
    at ../../dev/mpt/mpt_freebsd.c:1311
#8 0x801215ce in xpt_action (start_ccb=0x961a0000)
    at ../../cam/cam_xpt.c:2949
#9 0x80125e35 in cam_periph_runccb (ccb=0x961a0000, error_routine=0,
    camflags=CAM_FLAG_NONE, sense_flags=17, ds=0x92a92a80)
    at ../../cam/cam_periph.c:822
#10 0x80129cd0 in passsendccb (periph=0x92a90f00, ccb=0x961a0000,
    inccb=0x93bb7400) at ../../cam/scsi/scsi_pass.c:797
#11 0x80129bfc in passioctl (dev=0x92a90980, cmd=3261076482,
    addr=0x93bb7400 "\001", flag=3, p=0xd244a400)
    at ../../cam/scsi/scsi_pass.c:714
#12 0x801c5b62 in spec_ioctl (ap=0xdb3ebde0)
    at ../../miscfs/specfs/spec_vnops.c:306
#13 0x801c588d in spec_vnoperate (ap=0xdb3ebde0)
    at ../../miscfs/specfs/spec_vnops.c:119
#14 0x80209349 in ufs_vnoperatespec (ap=0xdb3ebde0)
    at ../../ufs/ufs/ufs_vnops.c:2394
#15 0x801c2107 in vn_ioctl
    (fp=0x9633eb40, com=3261076482, data=0x93bb7400 "\001", p=0xd244a400)
    at vnode_if.h:429
#16 0x8019ba1e in ioctl (p=0xd244a400, uap=0xdb3ebf80)
    at ../../sys/file.h:178
#17 0x8024a96d in syscall2 (frame={tf_fs = 135725103, tf_es = 47,
    tf_ds = 2143223855, tf_edi = 136306688, tf_esi = 2143283856,
    tf_ebp = 2143284464, tf_isp = -616644652, tf_ebx = 2143283952,
    tf_edx = 0, tf_ecx = 0, tf_eax = 54, tf_trapno = 12, tf_err = 2,
    tf_eip = 135190204, tf_cs = 31, tf_eflags = 531,
    tf_esp = 2143283780, tf_ss = 47})
    at ../../i386/i386/trap.c:1175
#18 0x8023805b in Xint0x80_syscall ()
cannot read proc at 0
(kgdb)

Copyright (c) 1992-2003 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
    The Regents of the University of California. All rights reserved.
FreeBSD 4.8-RELEASE-p10 #0: Tue Oct 14 16:55:25 EDT 2003
    root@host.example.com:/usr/src/sys/compile/KERNEL
Timecounter "i8254" frequency 1193182 Hz
CPU: Intel(R) Xeon(TM) CPU 3.00GHz (2986.93-MHz 686-class CPU)
  Origin = "GenuineIntel" Id = 0xf27 Stepping = 7
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Hyperthreading: 2 logical CPUs
real memory = 4227792896 (4128704K bytes)
avail memory = 4113051648 (4016652K bytes)
Changing APIC ID for IO APIC #0 from 0 to 8 on chip
Changing APIC ID for IO APIC #1 from 0 to 9 on chip
Changing APIC ID for IO APIC #2 from 0 to 10 on chip
Programming 16 pins in IOAPIC #0
IOAPIC #0 intpin 2 -> irq 0
Programming 16 pins in IOAPIC #1
Programming 16 pins in IOAPIC #2
FreeBSD/SMP: Multiprocessor motherboard
 cpu0 (BSP): apic id: 0, version: 0x00050014, at 0xfee00000
 cpu1 (AP): apic id: 6, version: 0x00050014, at 0xfee00000
 io0 (APIC): apic id: 8, version: 0x000f0011, at 0xfec00000
 io1 (APIC): apic id: 9, version: 0x000f0011, at 0xfec01000
 io2 (APIC): apic id: 10, version: 0x000f0011, at 0xfec02000
Preloaded elf kernel "kernel" at 0x8032a000.
Preloaded elf module "accf_data.ko" at 0x8032a09c.
Pentium Pro MTRR support enabled
Using $PIR table, 7 entries at 0x800fc4c0
npx0: <math processor> on motherboard
npx0: INT 16 interface
pcib0: <Host to PCI bridge> on motherboard
pci0: <PCI bus> on pcib0
pci0: <ATI Mach64-GR graphics accelerator> at 14.0
atapci0: <ServerWorks CSB5 ATA100 controller> port 0x8b0-0x8bf,0x374-0x377,0x170-0x177,0x3f4-0x3f7,0x1f0-0x1f7 at device 15.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata1: at 0x170 irq 15 on atapci0
pci0: <OHCI USB controller> at 15.2 irq 11
isab0: <PCI to ISA bridge (vendor=1166 device=0225)> at device 15.3 on pci0
isa0: <ISA bus> on isab0
pcib1: <Host to PCI bridge> on motherboard
pci1: <PCI bus> on pcib1
pcib2: <Host to PCI bridge> on motherboard
IOAPIC #1 intpin 0 -> irq 2
IOAPIC #1 intpin 1 -> irq 5
pci2: <PCI bus> on pcib2
bge0: <Broadcom BCM5704C Dual Gigabit Ethernet, ASIC rev. 0x2002> mem 0xfcf20000-0xfcf2ffff,0xfcf30000-0xfcf3ffff irq 2 at device 0.0 on pci2
bge0: Ethernet address: 00:06:5b:ef:9c:66
miibus0: <MII bus> on bge0
brgphy0: <BCM5704 10/100/1000baseTX PHY> on miibus0
brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
bge1: <Broadcom BCM5704C Dual Gigabit Ethernet, ASIC rev.
0x2002> mem 0xfcf00000-0xfcf0ffff,0xfcf10000-0xfcf1ffff irq 5 at device 0.1 on pci2
bge1: Ethernet address: 00:06:5b:ef:9c:67
miibus1: <MII bus> on bge1
brgphy1: <BCM5704 10/100/1000baseTX PHY> on miibus1
brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
pcib3: <Host to PCI bridge> on motherboard
pci3: <PCI bus> on pcib3
pcib4: <Host to PCI bridge> on motherboard
IOAPIC #1 intpin 2 -> irq 7
IOAPIC #1 intpin 3 -> irq 13
pci4: <PCI bus> on pcib4
mpt0: <LSILogic 1030 Ultra4 Adapter> port 0xcc00-0xccff mem 0xfcd20000-0xfcd2ffff,0xfcd30000-0xfcd3ffff irq 7 at device 5.0 on pci4
mpt1: <LSILogic 1030 Ultra4 Adapter> port 0xc800-0xc8ff mem 0xfcd00000-0xfcd0ffff,0xfcd10000-0xfcd1ffff irq 13 at device 5.1 on pci4
pcib5: <Host to PCI bridge> on motherboard
pci5: <PCI bus> on pcib5
pcib6: <ServerWorks host to PCI bridge(unknown chipset)> on motherboard
pci6: <PCI bus> on pcib6
pcib7: <ServerWorks host to PCI bridge(unknown chipset)> on motherboard
pci7: <PCI bus> on pcib7
orm0: <Option ROMs> at iomem 0xc0000-0xc7fff,0xc8000-0xc8fff,0xc9000-0xccfff,0xcd000-0xce7ff,0xec000-0xeffff on isa0
fdc0: <NEC 72065B or clone> at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
fdc0: FIFO enabled, 8 bytes threshold
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x100>
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A, console
APIC_IO: Testing 8254 interrupt delivery
APIC_IO: Broken MP table detected: 8254 is not connected to IOAPIC #0 intpin 2
APIC_IO: routing 8254 via 8259 and IOAPIC #0 intpin 0
ata1-slave: ATAPI identify retries exceeded
ata1-master: simplex device, DMA on primary only
SMP: AP CPU #1 Launched!
acd0: CDROM <TEAC CD-ROM CD-224E> at ata1-master BIOSPIO
Waiting 15 seconds for SCSI devices to settle
pass3 at mpt0 bus 0 target 6 lun 0
pass3: <PE/PV 1x3 SCSI BP 1.1> Fixed Processor SCSI-2 device
pass3: 3.300MB/s transfers
da0 at mpt0 bus 0 target 0 lun 0
da0: <FUJITSU MAP3367NC 5605> Fixed Direct Access SCSI-3 device
da0: 320.000MB/s transfers (160.000MHz, offset 127, 16bit), Tagged Queueing Enabled
da0: 34732MB (71132959 512 byte sectors: 255H 63S/T 4427C)
da2 at mpt0 bus 0 target 2 lun 0
da2: <FUJITSU MAP3367NC 5605> Fixed Direct Access SCSI-3 device
da2: 320.000MB/s transfers (160.000MHz, offset 127, 16bit), Tagged Queueing Enabled
da2: 34732MB (71132959 512 byte sectors: 255H 63S/T 4427C)
da1 at mpt0 bus 0 target 1 lun 0
da1: <FUJITSU MAP3367NC 5605> Fixed Direct Access SCSI-3 device
da1: 320.000MB/s transfers (160.000MHz, offset 127, 16bit), Tagged Queueing Enabled
da1: 34732MB (71132959 512 byte sectors: 255H 63S/T 4427C)
Mounting root from ufs:/dev/da0s1a
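
A note on reading the trap frame in frame #5 of the backtrace: gdb prints the tf_* registers as signed 32-bit decimals, so they do not visibly line up with the hex values in the panic message. The sketch below (Python, with a hypothetical helper name to_u32_hex; the numbers are copied from the output above) reinterprets them as unsigned 32-bit hex so they can be compared. It is only a reading aid and makes no claim about the cause of the fault.

```python
def to_u32_hex(value: int) -> str:
    """Reinterpret a signed 32-bit integer as unsigned and format it as hex."""
    return f"0x{value & 0xFFFFFFFF:08x}"


# Values copied from the trap frame shown in frame #5 of the backtrace.
trap_frame = {
    "tf_eip": -2145971320,
    "tf_ebp": -616645488,
}

for name, value in trap_frame.items():
    print(f"{name} = {to_u32_hex(value)}")
```

With these inputs, tf_eip comes out as 0x80171388, the same address as the "instruction pointer" line and frame #6 (mpt_read_cfg_page), and tf_ebp comes out as 0xdb3ebc90, the same as the "frame pointer" line.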