Date: Wed, 25 Jun 2008 13:27:52 -0500
From: Reid Linnemann <lreid@cs.okstate.edu>
To: "freebsd-stable@freebsd.org" <freebsd-stable@freebsd.org>
Subject: READ_DMA timeouts, etc. on FreeBSD 7-STABLE SATA
Message-ID: <48628E28.7080004@cs.okstate.edu>

Hi guys,

I'm running 7-STABLE, last synced early June (June 7, I think). I have two identical SATA disks, 160G Western Digital WD1600AAJS, on a SiS 180 SATA controller. They are gmirrored, and the mirror provides all of my individual filesystems. After I built the mirror in single user mode and rebooted, I started getting DMA errors such as:

Jun 21 11:56:28 hautlos kernel: ad6: TIMEOUT - READ_DMA retrying (1 retry left) LBA=2830976
Jun 21 11:56:28 hautlos kernel: ad6: TIMEOUT - READ_DMA retrying (1 retry left) LBA=2901888
Jun 21 11:56:28 hautlos kernel: ad6: TIMEOUT - READ_DMA retrying (1 retry left) LBA=2995328

The LBA is apparently random. Most of the time this just makes the machine crawl and is annoying, but if, say, a filesystem is left unclean by a power failure, the combined activity of the mirror rebuilding and the fsck causes much more disconcerting errors, e.g.:

Jun 21 11:48:46 hautlos kernel: ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
Jun 21 11:49:02 hautlos kernel: ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
Jun 21 11:49:02 hautlos kernel: ad4: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
Jun 21 11:49:02 hautlos kernel: ad4: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
Jun 21 11:49:02 hautlos kernel: ad4: WARNING - SET_MULTI taskqueue timeout - completing request directly
Jun 21 11:49:02 hautlos kernel: ad4: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=196200751
Jun 21 11:49:02 hautlos kernel: ad6: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=196442127

But now for the weird part. I tried booting in single user mode, disabling DMA, and disabling ACPI, to no avail. Soft boot, hard boot, doesn't matter.

But if I power the machine down, cut power to the power supply, drain the remaining juice from the system by pressing the ATX power button, and then boot up, the DMA errors completely or nearly completely vanish. Since I did this on Jun 21 I have logged only 2 READ_DMA timeouts:

messages:Jun 22 03:02:15 hautlos kernel: ad4: TIMEOUT - READ_DMA retrying (1 retry left) LBA=56884207
messages:Jun 24 10:52:41 hautlos kernel: ad4: TIMEOUT - READ_DMA retrying (1 retry left) LBA=243514511

Does anyone have any ideas? I've googled but can't find any solutions. I'm not currently subscribed to stable@, so please cc: me in responses. My uname -a and dmesg follow.

FreeBSD hautlos 7.0-STABLE FreeBSD 7.0-STABLE #7: Sat Jun 7 10:46:48 CDT 2008 root@:/usr/obj/usr/src/sys/HAUTLOS i386

Copyright (c) 1992-2008 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
    The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 7.0-STABLE #7: Sat Jun 7 10:46:48 CDT 2008
    root@:/usr/obj/usr/src/sys/HAUTLOS
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: AMD Athlon(tm) 64 Processor 3000+ (1999.44-MHz 686-class CPU)
  Origin = "AuthenticAMD" Id = 0x20fc2 Stepping = 2
  Features=0x78bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2>
  Features2=0x1<SSE3>
  AMD Features=0xe2500800<SYSCALL,NX,MMX+,FFXSR,LM,3DNow!+,3DNow!>
  AMD Features2=0x1<LAHF>
real memory = 1073676288 (1023 MB)
avail memory = 1037291520 (989 MB)
ACPI APIC Table: <AWARD AWRDACPI>
ioapic0 <Version 1.4> irqs 0-23 on motherboard
kbd1 at kbdmux0
acpi0: <AWARD AWRDACPI> on motherboard
acpi0: [ITHREAD]
acpi0: Power Button (fixed)
acpi0: reservation of 0, a0000 (3) failed
acpi0: reservation of 100000, 3fef0000 (3) failed
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x1008-0x100b on acpi0
cpu0: <ACPI CPU> on acpi0
acpi_button0: <Power Button> on acpi0
acpi_button1: <Sleep Button> on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff,0x480-0x48f,0x1000-0x10df,0x10e0-0x10ff on acpi0
pci0: <ACPI PCI bus> on pcib0
agp0: <SiS 755 host to AGP bridge> on hostb0
pcib1: <PCI-PCI bridge> at device 1.0 on pci0
pci1: <PCI bus> on pcib1
vgapci0: <VGA-compatible display> port 0xd000-0xd0ff mem 0xd0000000-0xd7ffffff,0xe8020000-0xe802ffff irq 16 at device 0.0 on pci1
vgapci1: <VGA-compatible display> mem 0xd8000000-0xdfffffff,0xe8030000-0xe803ffff at device 0.1 on pci1
isab0: <PCI-ISA bridge> at device 2.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <SiS 964 UDMA133 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x4000-0x400f at device 2.5 on pci0
ata0: <ATA channel 0> on atapci0
ata0: [ITHREAD]
ata1: <ATA channel 1> on atapci0
ata1: [ITHREAD]
pcm0: <SiS 7012> port 0xe000-0xe0ff,0xe100-0xe17f irq 18 at device 2.7 on pci0
pcm0: [ITHREAD]
pcm0: <Avance Logic ALC655 AC97 Codec>
ohci0: <SiS 5571 USB controller> mem 0xe8124000-0xe8124fff irq 20 at device 3.0 on pci0
ohci0: [GIANT-LOCKED]
ohci0: [ITHREAD]
usb0: OHCI version 1.0, legacy support
usb0: SMM does not respond, resetting
usb0: <SiS 5571 USB controller> on ohci0
usb0: USB revision 1.0
uhub0: <SiS OHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb0
uhub0: 3 ports with 3 removable, self powered
ohci1: <SiS 5571 USB controller> mem 0xe8120000-0xe8120fff irq 21 at device 3.1 on pci0
ohci1: [GIANT-LOCKED]
hci1: [ITHREAD]
usb1: OHCI version 1.0, legacy support
usb1: SMM does not respond, resetting
usb1: <SiS 5571 USB controller> on ohci1
usb1: USB revision 1.0
uhub1: <SiS OHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb1
uhub1: 3 ports with 3 removable, self powered
ohci2: <SiS 5571 USB controller> mem 0xe8121000-0xe8121fff irq 22 at device 3.2 on pci0
ohci2: [GIANT-LOCKED]
ohci2: [ITHREAD]
usb2: OHCI version 1.0, legacy support
usb2: SMM does not respond, resetting
usb2: <SiS 5571 USB controller> on ohci2
usb2: USB revision 1.0
uhub2: <SiS OHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb2
uhub2: 2 ports with 2 removable, self powered
ehci0: <EHCI (generic) USB 2.0 controller> mem 0xe8122000-0xe8122fff irq 23 at device 3.3 on pci0
ehci0: [GIANT-LOCKED]
ehci0: [ITHREAD]
usb3: EHCI version 1.0
usb3: companion controllers, 3 ports each: usb0 usb1 usb2
usb3: <EHCI (generic) USB 2.0 controller> on ehci0
usb3: USB revision 2.0
uhub3: <SiS EHCI root hub, class 9/0, rev 2.00/1.00, addr 1> on usb3
uhub3: 8 ports with 8 removable, self powered
umass0: <Apple iPod, class 0/0, rev 2.00/0.02, addr 2> on uhub3
sis0: <SiS 900 10/100BaseTX> port 0xe200-0xe2ff mem 0xe8123000-0xe8123fff irq 19 at device 4.0 on pci0
miibus0: <MII bus> on sis0
rlphy0: <RTL8201L 10/100 media interface> PHY 1 on miibus0
rlphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
sis0: Ethernet address: 00:14:2a:68:cf:ff
sis0: [ITHREAD]
atapci1: <SiS 180 SATA150 controller> port 0xe300-0xe307,0xe400-0xe403,0xe500-0xe507,0xe600-0xe603,0xe700-0xe70f irq 17 at device 5.0 on pci0
atapci1: [ITHREAD]
ata2: <ATA channel 0> on atapci1
ata2: [ITHREAD]
ata3: <ATA channel 1> on atapci1
ata3: [ITHREAD]
ahc0: <Adaptec 2930CU SCSI adapter> port 0xe800-0xe8ff mem 0xe8125000-0xe8125fff irq 16 at device 12.0 on pci0
ahc0: [ITHREAD]
aic7860: Ultra Single Channel A, SCSI Id=7, 3/253 SCBs
acpi_tz0: <Thermal Zone> on acpi0
fdc0: <floppy drive controller> port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0
fdc0: [FILTER]
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A
sio0: [FILTER]
pmtimer0 on isa0
orm0: <ISA Option ROMs> at iomem 0xc0000-0xccfff,0xd0000-0xd7fff,0xd8000-0xd87ff pnpid ORM0000 on isa0
atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
atkbd0: [ITHREAD]
psm0: <PS/2 Mouse> irq 12 on atkbdc0
psm0: [GIANT-LOCKED]
psm0: [ITHREAD]
psm0: model IntelliMouse, device ID 3
ppc0: <Parallel port> at port 0x378-0x37f on isa0
ppc0: Generic chipset (ECP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/16 bytes threshold
ppbus0: <Parallel port bus> on ppc0
plip0: <PLIP network interface> on ppbus0
lpt0: <Printer> on ppbus0
lpt0: Polled port
ppi0: <Parallel I/O> on ppbus0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
ukbd0: <vendor 0x05af USB Keyboard, class 0/0, rev 1.10/1.30, addr 2> on uhub0
kbd2 at ukbd0
uhid0: <vendor 0x05af USB Keyboard, class 0/0, rev 1.10/1.30, addr 2> on uhub0
ugen0: <American Power Conversion Back-UPS XS 900 FW:830.E6 .D USB FW:E6, class 0/0, rev 1.10/1.06, addr 2> on uhub2
Timecounter "TSC" frequency 1999440640 Hz quality 800
Timecounters tick every 1.000 msec
acd0: CDRW <Memorex 52MAX 325216AJv2/RW$5> at ata0-master UDMA33
ad4: 152627MB <WDC WD1600AAJS-00PSA0 05.06H05> at ata2-master SATA150
ad6: 152627MB <WDC WD1600AAJS-00PSA0 05.06H05> at ata3-master SATA150
GEOM_MIRROR: Device mirror/gm0 launched (2/2).
Waiting 2 seconds for SCSI devices to settle
da0 at umass-sim0 bus 0 target 0 lun 0
da0: <Apple iPod 1.62> Removable Direct Access SCSI-0 device
da0: 40.000MB/s transfers
da0: 1936MB (991232 2048 byte sectors: 255H 63S/T 61C)
GEOM_LABEL: Label for provider da0 is label/ipod.
GEOM_LABEL: Label for provider da0s2 is msdosfs/IPOD.
Trying to mount root from ufs:/dev/mirror/gm0s1a
drm0: <ATI Radeon AD 9500> on vgapci0
info: [drm] AGP at 0xe0000000 128MB
info: [drm] Initialized radeon 1.25.0 20060524
info: [drm] Setting GART location based on new memory map
info: [drm] Loading R300 Microcode
info: [drm] writeback test succeeded in 1 usecs
drm0: [ITHREAD]
powernow0: <Cool`n'Quiet K8> on cpu0
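
For anyone who wants to compare timeout counts before and after a power cycle, the TIMEOUT lines quoted earlier in this message can be tallied per day, device, and command. What follows is only a rough Python sketch. It assumes the syslog format shown in this message (including the optional "messages:" prefix that grep adds), and the names TIMEOUT_RE, SAMPLE, and tally_timeouts are made up for illustration; they are not part of any FreeBSD tool.

```python
import re
from collections import Counter

# Matches the ata timeout lines as they appear in this message. The optional
# leading "file:" prefix covers grep output such as "messages:Jun 22 ...".
# This pattern is an assumption based only on the lines quoted above.
TIMEOUT_RE = re.compile(
    r"^(?:\S+:)?(?P<day>[A-Z][a-z]{2}\s+\d+) \d\d:\d\d:\d\d \S+ kernel: "
    r"(?P<dev>ad\d+): TIMEOUT - (?P<cmd>\w+) retrying "
    r"\(\d+ retr(?:y|ies) left\) LBA=(?P<lba>\d+)$"
)

# A few lines copied from the message, plus one WARNING line that should be
# ignored because it is not a TIMEOUT entry.
SAMPLE = [
    "Jun 21 11:56:28 hautlos kernel: ad6: TIMEOUT - READ_DMA retrying (1 retry left) LBA=2830976",
    "Jun 21 11:56:28 hautlos kernel: ad6: TIMEOUT - READ_DMA retrying (1 retry left) LBA=2901888",
    "Jun 21 11:56:28 hautlos kernel: ad6: TIMEOUT - READ_DMA retrying (1 retry left) LBA=2995328",
    "Jun 21 11:49:02 hautlos kernel: ad4: WARNING - SET_MULTI taskqueue timeout - completing request directly",
    "Jun 21 11:49:02 hautlos kernel: ad4: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=196200751",
    "messages:Jun 22 03:02:15 hautlos kernel: ad4: TIMEOUT - READ_DMA retrying (1 retry left) LBA=56884207",
]


def tally_timeouts(lines):
    """Count TIMEOUT entries keyed by (day, device, command)."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.match(line.strip())
        if match:
            counts[(match["day"], match["dev"], match["cmd"])] += 1
    return counts


if __name__ == "__main__":
    for (day, dev, cmd), n in sorted(tally_timeouts(SAMPLE).items()):
        print(f"{day} {dev} {cmd}: {n}")
```

To run it against a real log, the same function could be fed an open file, e.g. tally_timeouts(open("/var/log/messages")), assuming the log lives at that path.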