Date: Sat, 21 Oct 2006 19:27:22 -0500
From: Jon Passki <jon.passki@hursk.com>
To: freebsd-stable@freebsd.org, sos@freebsd.org
Cc: Matthew Dettinger <matt.dettinger@hursk.com>
Subject: Panic on "DOH! ata_alloc_request failed!"
Message-ID: <CA4DDDA4-77C7-4EBD-8954-1A6F1EDDF040@hursk.com>
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hey all,

(I'm off list, please include me in any replies)

(Søren, please let me know if you do not want to be emailed in the
future directly! You seem to be the ATA RAID FreeBSD goto guy.
Apologies if you did not want to be solicited).

I just received a panic w/ this in /var/log/messages (uname and dmesg
at the end of the email / book):

Oct 21 04:17:05 prometheus kernel: DOH! ata_alloc_composite failed!

The system would have been performing a weekly dump (live snapshot) at
the time. The motherboard is a SuperMicro P4SCi [1], which uses an
Intel 6300ESB onboard SATA w/ RAID 0/1 support (Adaptec). I have one
RAID volume created, ar0 as a RAID 1 between two 300GB disks.

I previously have seen this error when we were importing a MySQL
database on the box w/ heavy disk I/O.

Oct 9 21:41:23 prometheus kernel: DOH! ata_alloc_request failed!
Oct 9 21:41:29 prometheus kernel: FAILURE - out of memory in ata_raid_init_request
Oct 9 21:41:29 prometheus last message repeated 2 times
Oct 9 21:41:29 prometheus kernel: g_vfs_done():ar0s1e[WRITE(offset=108675514368, length=16384)]error = 5
Oct 9 21:41:29 prometheus kernel: g_vfs_done():ar0s1e[WRITE(offset=108486344704, length=16384)]error = 5
Oct 9 21:41:29 prometheus kernel: g_vfs_done():ar0s1e[WRITE(offset=108486623232, length=16384)]error = 5
Oct 9 23:01:17 prometheus kernel: FAILURE - out of memory in ata_raid_init_request
Oct 9 23:01:17 prometheus kernel: FAILURE - out of memory in ata_raid_init_request
Oct 9 23:01:17 prometheus kernel: g_vfs_done():ar0s1e[WRITE(offset=118889250816, length=16384)]error = 5
Oct 9 23:01:17 prometheus kernel: g_vfs_done():ar0s1e[WRITE(offset=118889742336, length=16384)]error = 5

I've seen two other similar issues on the mailing lists:

http://lists.freebsd.org/pipermail/freebsd-amd64/2006-August/008770.html
http://lists.freebsd.org/pipermail/freebsd-amd64/2006-April/008047.html
http://lists.freebsd.org/pipermail/freebsd-stable/2005-November/019559.html

The first two didn't seem to have any follow-ups. This is the only bug
report with "ata_alloc_composite" returned in the query:

http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/89310

I restarted the box, manually verified the array using the BIOS utils
(12 errors fixed, eek!), and had to run fsck manually due to an
"UNEXPECTED SOFT UPDATE INCONSISTENCY" on /dev/ar0s1e. I had
unreferenced files, too (yeah for backups). Since I didn't have a
dumpon device, I don't have a core to post.

Here's the grep of sysctl alluded to in the Nov. 2005 email above:

sysctl -a | grep ^ata_
ata_composit: 196, 0, 0, 100, 3928
ata_request: 204, 0, 0, 76, 127512

How do I monitor or fix this? 'sync' or restart every so often? Any
assistance would be appreciated!

TIA,

Jon

[1] http://www.supermicro.com/products/motherboard/P4/E7210/P4SCi.cfm

uname -a
FreeBSD prometheus.int.hursk.com 6.1-RELEASE FreeBSD 6.1-RELEASE #0: Sun May 7 04:32:43 UTC 2006 root@opus.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC i386

(who believes in patches ;-)

<dmesg afterboot=yes>
Copyright (c) 1992-2006 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD 6.1-RELEASE #0: Sun May 7 04:32:43 UTC 2006
    root@opus.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC
ACPI APIC Table: <IntelR AWRDACPI>
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Pentium(R) 4 CPU 2.80GHz (2795.24-MHz 686-class CPU)
  Origin = "GenuineIntel" Id = 0xf29 Stepping = 9
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x4400<CNTX-ID,<b14>>
  Logical CPUs per core: 2
real memory = 1072562176 (1022 MB)
avail memory = 1040633856 (992 MB)
ioapic0: Changing APIC ID to 2
ioapic0 <Version 2.0> irqs 0-23 on motherboard
ioapic1 <Version 2.0> irqs 24-47 on motherboard
kbd1 at kbdmux0
acpi0: <IntelR AWRDACPI> on motherboard
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0
cpu0: <ACPI CPU> on acpi0
acpi_button0: <Power Button> on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
pcib1: <ACPI PCI-PCI bridge> at device 3.0 on pci0
pci1: <ACPI PCI bus> on pcib1
em0: <Intel(R) PRO/1000 Network Connection Version - 3.2.18> port 0xb000-0xb01f mem 0xf2300000-0xf231ffff irq 18 at device 1.0 on pci1
em0: Ethernet address: 00:30:48:84:01:22
pcib2: <ACPI PCI-PCI bridge> at device 28.0 on pci0
pci2: <ACPI PCI bus> on pcib2
pcib3: <PCI-PCI bridge> at device 1.0 on pci2
pci3: <PCI bus> on pcib3
sf0: <Adaptec ANA-62044 10/100BaseTX> port 0xc000-0xc0ff mem 0xf2000000-0xf207ffff irq 24 at device 4.0 on pci3
miibus0: <MII bus> on sf0
ukphy0: <Generic IEEE 802.3u media interface> on miibus0
ukphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
sf0: Ethernet address: 00:00:d1:ed:98:49
sf1: <Adaptec ANA-62044 10/100BaseTX> port 0xc100-0xc1ff mem 0xf2080000-0xf20fffff irq 25 at device 5.0 on pci3
miibus1: <MII bus> on sf1
ukphy1: <Generic IEEE 802.3u media interface> on miibus1
ukphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
sf1: Ethernet address: 00:00:d1:ed:98:4a
sf2: <Adaptec ANA-62044 10/100BaseTX> port 0xc200-0xc2ff mem 0xf2100000-0xf217ffff irq 26 at device 6.0 on pci3
miibus2: <MII bus> on sf2
ukphy2: <Generic IEEE 802.3u media interface> on miibus2
ukphy2: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
sf2: Ethernet address: 00:00:d1:ed:98:4b
sf3: <Adaptec ANA-62044 10/100BaseTX> port 0xc300-0xc3ff mem 0xf2180000-0xf21fffff irq 27 at device 7.0 on pci3
miibus3: <MII bus> on sf3
ukphy3: <Generic IEEE 802.3u media interface> on miibus3
ukphy3: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
sf3: Ethernet address: 00:00:d1:ed:98:4c
uhci0: <UHCI (generic) USB controller> port 0xe100-0xe11f irq 16 at device 29.0 on pci0
uhci0: [GIANT-LOCKED]
usb0: <UHCI (generic) USB controller> on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1: <UHCI (generic) USB controller> port 0xe000-0xe01f irq 19 at device 29.1 on pci0
uhci1: [GIANT-LOCKED]
usb1: <UHCI (generic) USB controller> on uhci1
usb1: USB revision 1.0
uhub1: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
pci0: <base peripheral> at device 29.4 (no driver attached)
pci0: <base peripheral, interrupt controller> at device 29.5 (no driver attached)
ehci0: <Intel 6300ESB USB 2.0 controller> mem 0xf2400000-0xf24003ff irq 23 at device 29.7 on pci0
ehci0: [GIANT-LOCKED]
usb2: EHCI version 1.0
usb2: companion controllers, 2 ports each: usb0 usb1
usb2: <Intel 6300ESB USB 2.0 controller> on ehci0
usb2: USB revision 2.0
uhub2: Intel EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub2: 4 ports with 4 removable, self powered
pcib4: <ACPI PCI-PCI bridge> at device 30.0 on pci0
pci4: <ACPI PCI bus> on pcib4
pci4: <display, VGA> at device 9.0 (no driver attached)
em1: <Intel(R) PRO/1000 Network Connection Version - 3.2.18> port 0xd100-0xd13f mem 0xf1000000-0xf101ffff irq 19 at device 10.0 on pci4
em1: Ethernet address: 00:30:48:84:01:23
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Intel 6300ESB UDMA100 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xf000-0xf00f at device 31.1 on pci0
ata0: <ATA channel 0> on atapci0
ata1: <ATA channel 1> on atapci0
atapci1: <Intel 6300ESB SATA150 controller> port 0xe200-0xe207,0xe300-0xe303,0xe400-0xe407,0xe500-0xe503,0xe600-0xe60f irq 18 at device 31.2 on pci0
ata2: <ATA channel 0> on atapci1
ata3: <ATA channel 1> on atapci1
pci0: <serial bus, SMBus> at device 31.3 (no driver attached)
acpi_tz0: <Thermal Zone> on acpi0
fdc0: <floppy drive controller> port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0
fdc0: [FAST]
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A
sio1: <16550A-compatible COM port> port 0x2f8-0x2ff irq 3 on acpi0
sio1: type 16550A
ppc0: <Standard parallel printer port> port 0x378-0x37f,0x778-0x77b irq 7 on acpi0
ppc0: Generic chipset (NIBBLE-only) in COMPATIBLE mode
ppbus0: <Parallel port bus> on ppc0
plip0: <PLIP network interface> on ppbus0
lpt0: <Printer> on ppbus0
lpt0: Interrupt-driven port
ppi0: <Parallel I/O> on ppbus0
atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
pmtimer0 on isa0
orm0: <ISA Option ROMs> at iomem 0xc0000-0xc7fff,0xc8000-0xcbfff,0xcc000-0xd07ff,0xd1000-0xd1fff,0xd2000-0xd2fff on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Timecounter "TSC" frequency 2795241476 Hz quality 800
Timecounters tick every 1.000 msec
ad4: 305245MB <Seagate ST3320620AS 3.AAD> at ata2-master SATA150
ad6: 305245MB <Seagate ST3320620AS 3.AAD> at ata3-master SATA150
ar0: 305245MB <Adaptec HostRAID RAID1> status: READY
ar0: disk0 READY (master) using ad4 at ata2-master
ar0: disk1 READY (mirror) using ad6 at ata3-master
[snipped]
</dmesg>

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (Darwin)

iD8DBQFFOrrqZpJsLIS+QSIRAnUPAJ0YDTABguFtSuGOOeh0UcM+AwjMBgCcC9cW
bWKbrtKMBFhMXDxYoiwk5Mw=
=Dyaq
-----END PGP SIGNATURE-----
