Date: Thu, 21 Apr 2005 12:30:30 -0600
From: Kendall Gifford <zettabyte@gmail.com>
To: freebsd-hardware@freebsd.org
Subject: ATA DMA Issues Resurfaced (READ_DMA TIMEOUT/FAILURE)
Message-ID: <86ba954f05042111304e36b01c@mail.gmail.com>
Howdy. I'm not sure whether hardware or stable is the best list for this, but here is my problem. Any info, recommendations, or help will be greatly appreciated.

I've got a server running 5-STABLE (updated/built Jan. 22, 2005). It has been running this kernel, a 5.3-RELEASE kernel, and other 5.x branch versions for the last ten or so months. Before that, it was running 4.9-RELEASE.

About ten months ago, when I switched from the 4.x branch to the 5.x branch, I immediately began experiencing WRITE_DMA ICRC errors during disk activity at seemingly random times. At that time I posted the following message to this list and to questions:

http://groups-beta.google.com/group/mailing.freebsd.questions/browse_thread/thread/17fe5871d823f380/a16568320427152e?rnum=2#a16568320427152e

The gist of that message, and of my current experience, is that my hardware (drives, cables, motherboard controllers, etc.) is definitely fine, and that I've noticed others posting various, possibly related issues both before and since I posted it.

I basically ended up working around the problem by running atacontrol in a /usr/local/etc/rc.d/ script that set my drives to PIO4 mode. I then mostly forgot about the problem, as everything has since worked fine--that is, until just recently.

About a week ago (around April 14, 2005), after updating some ports and configurations, I decided to reboot (quite extraneous, I know, but reassuring to verify that all scripts/configs are set up the way I want). Just as my system began starting local services, and just after it ran my custom /usr/local/etc/rc.d atacontrol script, I got the following error messages:

<Screenshot>
Master = PIO4
Slave = UDMA33
Master = PIO4
Slave = BIOSPIO
ad0: TIMEOUT - READ_DMA retrying (2 retries left) LBA=146793208
ad0: FAILURE - READ_DMA timed out
GEOM_VINUM: subdisk raid.p0.s0 is down
GEOM_VINUM: plex raid.p0 is down
Starting mysql.
Fatal trap 12: page fault while in kernel mode
fault virtual address = 0xc
fault code = supervisor read, page not present
instruction pointer = 0x8:0xc04ba88f
stack pointer = 0x10:0xd321dc6c
frame pointer = 0x10:0xd321dc98
code segment = base 0x0, limit 0xfffff, type 0x1b
             = DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 4 (g_down)
trap number = 12
panic: page fault
Uptime: 28s
</End Screenshot>

This is the first time in ten months I've had issues switching to PIO4 mode during local service startup. I'm really not sure what happened.

Anyhow, I've since rebooted into single-user mode, brought my gvinum mirror plex back up, and done the usual steps to bring my system up manually. However, on one attempt I foolishly forgot to run atacontrol on my drives before bringing my gvinum plex back up. As it was restoring in the background, I remembered and unthinkingly ran atacontrol, and again succeeded in bringing my system down in much the same manner as shown above (only this time with WRITE_DMA errors instead of READ_DMA errors).

Based on this experience, my two guesses as to the cause of my booting problem are that disk activity during system startup causes problems before my disks can be put fully into PIO4 mode (so the timing has to be just right), or that the state of things at the moment atacontrol is executed causes problems.

As you can see, I have no idea what the real problem is, and I wonder whether any more info on these ATA/DMA problems is available. I also wonder whether I'd be better off moving to 4.11 until the root cause is found. Any help or information, anyone?
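For readers following along, here is a minimal sketch of what an rc.d workaround script of the kind described above might look like. It is not the actual script from this system. It assumes the FreeBSD 5.x form `atacontrol mode <channel> <master-mode> <slave-mode>`; the file name, the `ATACONTROL` variable, the `set_pio4` function, and the exact mode arguments are hypothetical, chosen only to match the "Master = PIO4 / Slave = ..." output in the screenshot.

```shell
#!/bin/sh
# Hypothetical /usr/local/etc/rc.d/atapio.sh (the name is illustrative only).
# Assumption: FreeBSD 5.x syntax "atacontrol mode <channel> <master> <slave>".

# Path to atacontrol; overridable so the script can be dry-run (e.g. with echo).
: "${ATACONTROL:=/sbin/atacontrol}"

# Ask the ATA driver to drop the master disk on each channel to PIO4.
# The slave modes are guesses taken from the screenshot output.
set_pio4() {
    "$ATACONTROL" mode 0 pio4 udma33    # ata0: ad0 to PIO4, acd0 left at UDMA33
    "$ATACONTROL" mode 1 pio4 biospio   # ata1: ad2 to PIO4, no slave attached
}

# Local rc.d scripts are invoked with "start" at boot and "stop" at shutdown.
case "${1:-}" in
start)
    set_pio4
    ;;
stop)
    # Nothing to undo.
    ;;
esac
```

Running it as `ATACONTROL=echo sh atapio.sh start` prints the two commands instead of executing them, which is a safe way to check the arguments. Note that a script like this only takes effect once local services start, so any disk I/O before that point still runs in the DMA modes chosen at probe time.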
System Info:

<Kernel Config>
machine i386
cpu I686_CPU
device npx
device isa
device pci
device agp
options VESA
ident KERNEL
maxusers 100
options SCHED_4BSD
options COMPAT_43
options COMPAT_FREEBSD4
options SYSVSHM
options SYSVSEM
options SYSVMSG
options KTRACE
options INVARIANT_SUPPORT
options INET
device ether
device loop
device bpf
device tun
options IPFIREWALL
options IPFIREWALL_VERBOSE
options IPFIREWALL_VERBOSE_LIMIT=1000
options IPDIVERT
options FFS
options NFSCLIENT
options NFSSERVER
options CD9660
options FDESCFS
options MSDOSFS
options NTFS
options NULLFS
options PROCFS
options PSEUDOFS
options UDF
options SOFTUPDATES
options UFS_EXTATTR
options UFS_EXTATTR_AUTOSTART
options UFS_ACL
options GEOM_BSD
options GEOM_CONCAT
options GEOM_GPT
options GEOM_LABEL
options GEOM_MBR
options GEOM_MIRROR
options GEOM_VOL
options QUOTA
device md
device random
device pty
device snp
options _KPOSIX_PRIORITY_SCHEDULING
device atkbdc
device atkbd
device psm
device vga
device splash
device sc
options MAXCONS=16
options SC_HISTORY_SIZE=2000
options SC_TWOBUTTON_MOUSE
options SC_KERNEL_CONS_ATTR=(FG_RED|BG_BLACK)
options SC_KERNEL_CONS_REV_ATTR=(FG_BLACK|BG_RED)
device ata
device atadisk
device ataraid
device atapicd
device atapifd
device atapist
options ATA_STATIC_ID
device fdc
device sio
device ppc
device ppbus
device lpt
device ppi
device pmtimer
device mem
device apic
device io
device miibus
device vr
device uhci
device ohci
device usb
device ucom
device ugen
device uhid
device ukbd
device ulpt
device ums
device uscanner
</End Kernel Config>

<Device Hints>
hint.atkbdc.0.at="isa"
hint.atkbdc.0.port="0x060"
hint.atkbd.0.at="atkbdc"
hint.atkbd.0.irq="1"
hint.atkbd.0.flags="0x1"
hint.psm.0.at="atkbdc"
hint.psm.0.irq="12"
hint.vga.0.at="isa"
hint.sc.0.at="isa"
hint.sc.0.flags="0x100"
hint.fdc.0.at="isa"
hint.fdc.0.port="0x3f0"
hint.fdc.0.irq="6"
hint.fdc.0.drq="2"
hint.fd.0.at="fdc0"
hint.fd.0.drive="0"
hint.fd.1.at="fdc0"
hint.fd.1.drive="1"
hint.sio.0.at="isa"
hint.sio.0.port="0x3f8"
hint.sio.0.flags="0x10"
hint.sio.0.irq="4"
hint.sio.1.at="isa"
hint.sio.1.port="0x2f8"
hint.sio.1.irq="3"
hint.ppc.0.at="isa"
hint.ppc.0.irq="7"
</End Device Hints>

<Dmesg>
Copyright (c) 1992-2005 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD 5.3-STABLE #0: Sat Jan 22 19:54:10 MST 2005
    root@name.domain.tld:/usr/obj/usr/src/sys/KERNEL
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: AMD Duron(tm) processor (1297.79-MHz 686-class CPU)
  Origin = "AuthenticAMD"  Id = 0x671  Stepping = 1
  Features=0x383f9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE>
  AMD Features=0xc0400000<AMIE,DSP,3DNow!>
real memory = 536870912 (512 MB)
avail memory = 519913472 (495 MB)
npx0: [FAST]
npx0: <math processor> on motherboard
npx0: INT 16 interface
pcib0: <Host to PCI bridge> pcibus 0 on motherboard
pir0: <PCI Interrupt Routing Table: 9 Entries> on motherboard
pci0: <PCI bus> on pcib0
agp0: <VIA Generic host to PCI bridge> mem 0xe0000000-0xe7ffffff at device 0.0 on pci0
pcib1: <PCI-PCI bridge> at device 1.0 on pci0
pci1: <PCI bus> on pcib1
pci0: <display, VGA> at device 8.0 (no driver attached)
uhci0: <VIA 83C572 USB controller> port 0xd000-0xd01f irq 11 at device 16.0 on pci0
uhci0: [GIANT-LOCKED]
usb0: <VIA 83C572 USB controller> on uhci0
usb0: USB revision 1.0
uhub0: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1: <VIA 83C572 USB controller> port 0xd400-0xd41f irq 3 at device 16.1 on pci0
uhci1: [GIANT-LOCKED]
usb1: <VIA 83C572 USB controller> on uhci1
usb1: USB revision 1.0
uhub1: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
uhci2: <VIA 83C572 USB controller> port 0xd800-0xd81f irq 10 at device 16.2 on pci0
uhci2: [GIANT-LOCKED]
usb2: <VIA 83C572 USB controller> on uhci2
usb2: USB revision 1.0
uhub2: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub2: 2 ports with 2 removable, self powered
pci0: <serial bus, USB> at device 16.3 (no driver attached)
isab0: <PCI-ISA bridge> at device 17.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <VIA 8235 UDMA133 controller> port 0xdc00-0xdc0f,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 at device 17.1 on pci0
ata0: channel #0 on atapci0
ata1: channel #1 on atapci0
pci0: <multimedia, audio> at device 17.5 (no driver attached)
vr0: <VIA VT6102 Rhine II 10/100BaseTX> port 0xe800-0xe8ff mem 0xed001000-0xed0010ff irq 11 at device 18.0 on pci0
miibus0: <MII bus> on vr0
ukphy0: <Generic IEEE 802.3u media interface> on miibus0
ukphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
vr0: Ethernet address: 00:0d:87:00:bf:1d
cpu0 on motherboard
orm0: <ISA Option ROMs> at iomem 0xc8000-0xcffff,0xc0000-0xc7fff on isa0
pmtimer0 on isa0
atkbdc0: <Keyboard controller (i8042)> at port 0x64,0x60 on isa0
atkbd0: <AT Keyboard> flags 0x1 irq 1 on atkbdc0
atkbd0: [GIANT-LOCKED]
psm0: <PS/2 Mouse> irq 12 on atkbdc0
psm0: [GIANT-LOCKED]
psm0: model IntelliMouse, device ID 3
fdc0: <Enhanced floppy controller> at port 0x3f0-0x3f5 irq 6 drq 2 on isa0
fdc0: [FAST]
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
ppc0: <Parallel port> at port 0x378-0x37f irq 7 on isa0
ppc0: Generic chipset (ECP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/16 bytes threshold
ppbus0: <Parallel port bus> on ppc0
lpt0: <Printer> on ppbus0
lpt0: Interrupt-driven port
ppi0: <Parallel I/O> on ppbus0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
unknown: <PNP0303> can't assign resources (port)
unknown: <PNP0c02> can't assign resources (memory)
unknown: <PNP0f13> can't assign resources (irq)
unknown: <PNP0501> can't assign resources (port)
unknown: <PNP0700> can't assign resources (port)
unknown: <PNP0401> can't assign resources (port)
Timecounter "TSC" frequency 1297789521 Hz quality 800
Timecounters tick every 10.000 msec
ipfw2 initialized, divert enabled, rule-based forwarding disabled, default to deny, logging limited to 1000 packets/entry by default
ad0: 117246MB <Maxtor 6Y120P0/YAR41VW0> [238216/16/63] at ata0-master UDMA133
acd0: CDRW <LITE-ON LTR-48246S/SS08> at ata0-slave UDMA33
ad2: 117246MB <Maxtor 6Y120P0/YAR41VW0> [238216/16/63] at ata1-master UDMA133
Mounting root from ufs:/dev/ad0s1a
WARNING: / was not properly dismounted
GEOM_VINUM: subdisk raid.p1.s0 is up
GEOM_VINUM: subdisk raid.p0.s0 is stale
GEOM_VINUM: plex sync raid.p1 -> raid.p0 started
GEOM_VINUM: sd raid.p0.s0 is initializing
GEOM_VINUM: plex raid.p0 is degraded
GEOM_VINUM: plex raid.p0 is up
GEOM_VINUM: plex sync raid.p1 -> raid.p0 finished
</End Dmesg>

--
Kendall Gifford
