Date:      Thu, 21 Apr 2005 12:30:30 -0600
From:      Kendall Gifford <zettabyte@gmail.com>
To:        freebsd-hardware@freebsd.org
Subject:   ATA DMA Issues Resurfaced (READ_DMA TIMEOUT/FAILURE)
Message-ID:  <86ba954f05042111304e36b01c@mail.gmail.com>

Howdy. I'm not sure whether hardware or stable is the best list for
this, but here is my problem. Any info, recommendations, or help will
be greatly appreciated.

I've got a server running 5-STABLE (updated/built Jan. 22, 2005). It
has been running this kernel, a 5.3-RELEASE kernel, and other 5.x
branch versions for the last ten or so months now. Previous to this,
it was running 4.9-RELEASE.

About ten months ago, when I switched from the 4.x branch to the 5.x
branch, I immediately began experiencing WRITE_DMA ICRC errors during
disk activity at seemingly random times. At that time I posted the
following message to this list and to questions:

http://groups-beta.google.com/group/mailing.freebsd.questions/browse_thread/thread/17fe5871d823f380/a16568320427152e?rnum=2#a16568320427152e

The gist of the message and my current experience is that my hardware
(drives, cables, motherboard controllers, etc.) is definitely fine and
that I've noticed others posting various, possibly-related issues both
before and since I posted the above message. I basically ended up
working around the problem by running atacontrol in a
/usr/local/etc/rc.d/ script that set my drives to PIO4 mode. I then
mostly forgot about the problem as everything has since worked
fine--that is until just recently.
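
A minimal sketch of what such a workaround script could look like, not
the exact file in use. It assumes the FreeBSD 5.x syntax
"atacontrol mode <channel> <master-mode> <slave-mode>" and the two
channels (ata0, ata1) shown in the dmesg below. The script name, the
ATACONTROL and CHANNELS variables, and the choice to request PIO4 for
the slave device as well are all hypothetical:

```sh
#!/bin/sh
# Hypothetical /usr/local/etc/rc.d/atapio.sh: force ATA devices to PIO4.
# Assumes FreeBSD 5.x syntax: atacontrol mode <channel> <master> <slave>

ATACONTROL="${ATACONTROL:-/sbin/atacontrol}"   # overridable for dry runs
CHANNELS="${CHANNELS:-0 1}"                    # ata0 and ata1

# Request PIO4 for both devices on one channel.
set_pio4() {
    "$ATACONTROL" mode "$1" pio4 pio4
}

case "${1:-}" in
start)
    for ch in $CHANNELS; do
        set_pio4 "$ch" || echo "atapio: channel $ch failed" >&2
    done
    ;;
stop)
    # Nothing to undo on shutdown.
    ;;
esac
```

Setting ATACONTROL=echo before calling set_pio4 prints the command
arguments instead of changing any transfer mode, which is a safe way
to check the script on a live system.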

About a week ago (around April 14, 2005) after performing some updates
of some ports and configurations, I decided to perform a reboot (quite
extraneous, I know, but reassuring to verify that all scripts/configs
are properly set up the way I want). Just as my system began starting
local services, and just after it ran my custom /usr/local/etc/rc.d
atacontrol script, I got the following error messages:

<Screenshot>

Master = PIO4
Slave  = UDMA33
Master = PIO4
Slave  = BIOSPIO
ad0: TIMEOUT - READ_DMA retrying (2 retries left) LBA=146793208
ad0: FAILURE - READ_DMA timed out
GEOM_VINUM: subdisk raid.p0.s0 is down
GEOM_VINUM: plex raid.p0 is down
Starting mysql.


Fatal trap 12: page fault while in kernel mode
fault virtual address = 0xc
fault code = supervisor read, page not present
instruction pointer = 0x8:0xc04ba88f
stack pointer = 0x10:0xd321dc6c
frame pointer = 0x10:0xd321dc98
code segment = base 0x0, limit 0xfffff, type 0x1b
             = DPL0, pres 1, def32 1, gran1
processor eflags = interrupt enabled, resume, IOPL= 0
current process  = 4 (g_down)
trap number = 12
panic: page fault
Uptime: 28s

</End Screenshot>

This is the first time in ten months I've had issues switching to PIO4
mode during local service startup. I really am not quite sure what
happened.

Anyhow, I've since rebooted into single-user mode, brought my
gvinum-mirror plex back up, and the usual stuff to manually bring my
system up. But, I did have one attempt at doing this when I foolishly
forgot to manually atacontrol my drives before trying to bring my
gvinum plex back up. As it was restoring in the background, I
remembered and unthinkingly ran atacontrol and again succeeded in
bringing my system down in much the same manner as shown above (only
this time with WRITE_DMA errors instead of READ_DMA errors).

Anyhow, based on this experience, my two guesses as to the cause of my
booting problem are that disk activity from starting the system is
causing problems before my disks can be put fully in PIO4 mode (and
timing is immaculate) or that the current state of things when
atacontrol is executed causes problems. As you can see, I have no idea
what the real problem is and wonder if any more info on this/these
ata/dma problems is available. I wonder if I'd be better off moving to
4.11 until the root cause of these problems is found.

Any help or information anyone?

System Info:


<Kernel Config>
machine		i386
cpu		I686_CPU
device		npx
device		isa
device		pci
device		agp
options		VESA
ident		KERNEL
maxusers	100
options		SCHED_4BSD
options		COMPAT_43
options		COMPAT_FREEBSD4
options		SYSVSHM
options		SYSVSEM
options		SYSVMSG
options		KTRACE
options		INVARIANT_SUPPORT
options		INET
device		ether
device		loop
device		bpf
device		tun
options		IPFIREWALL
options		IPFIREWALL_VERBOSE
options		IPFIREWALL_VERBOSE_LIMIT=1000
options		IPDIVERT
options		FFS
options		NFSCLIENT
options		NFSSERVER
options		CD9660
options		FDESCFS
options		MSDOSFS
options		NTFS
options		NULLFS
options		PROCFS
options		PSEUDOFS
options		UDF
options		SOFTUPDATES
options		UFS_EXTATTR
options		UFS_EXTATTR_AUTOSTART
options		UFS_ACL
options		GEOM_BSD
options		GEOM_CONCAT
options		GEOM_GPT
options		GEOM_LABEL
options		GEOM_MBR
options		GEOM_MIRROR
options		GEOM_VOL
options		QUOTA
device		md
device		random
device		pty
device		snp
options		_KPOSIX_PRIORITY_SCHEDULING
device		atkbdc
device		atkbd
device		psm
device		vga
device		splash
device		sc
options		MAXCONS=16
options		SC_HISTORY_SIZE=2000
options		SC_TWOBUTTON_MOUSE
options		SC_KERNEL_CONS_ATTR=(FG_RED|BG_BLACK)
options		SC_KERNEL_CONS_REV_ATTR=(FG_BLACK|BG_RED)
device		ata
device		atadisk
device		ataraid
device		atapicd
device		atapifd
device		atapist
options		ATA_STATIC_ID
device		fdc
device		sio
device		ppc
device		ppbus
device		lpt
device		ppi
device		pmtimer
device		mem
device		apic
device		io
device		miibus
device		vr
device		uhci
device		ohci
device		usb
device		ucom
device		ugen
device		uhid
device		ukbd
device		ulpt
device		ums
device		uscanner
</End Kernel Config>


<Device Hints>
hint.atkbdc.0.at="isa"
hint.atkbdc.0.port="0x060"
hint.atkbd.0.at="atkbdc"
hint.atkbd.0.irq="1"
hint.atkbd.0.flags="0x1"
hint.psm.0.at="atkbdc"
hint.psm.0.irq="12"
hint.vga.0.at="isa"
hint.sc.0.at="isa"
hint.sc.0.flags="0x100"
hint.fdc.0.at="isa"
hint.fdc.0.port="0x3f0"
hint.fdc.0.irq="6"
hint.fdc.0.drq="2"
hint.fd.0.at="fdc0"
hint.fd.0.drive="0"
hint.fd.1.at="fdc0"
hint.fd.1.drive="1"
hint.sio.0.at="isa"
hint.sio.0.port="0x3f8"
hint.sio.0.flags="0x10"
hint.sio.0.irq="4"
hint.sio.1.at="isa"
hint.sio.1.port="0x2f8"
hint.sio.1.irq="3"
hint.ppc.0.at="isa"
hint.ppc.0.irq="7"
</End Device Hints>


<Dmesg>
Copyright (c) 1992-2005 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
	The Regents of the University of California. All rights reserved.
FreeBSD 5.3-STABLE #0: Sat Jan 22 19:54:10 MST 2005
    root@name.domain.tld:/usr/obj/usr/src/sys/KERNEL
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: AMD Duron(tm) processor (1297.79-MHz 686-class CPU)
  Origin = "AuthenticAMD"  Id = 0x671  Stepping = 1
  Features=0x383f9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE>
  AMD Features=0xc0400000<AMIE,DSP,3DNow!>
real memory  = 536870912 (512 MB)
avail memory = 519913472 (495 MB)
npx0: [FAST]
npx0: <math processor> on motherboard
npx0: INT 16 interface
pcib0: <Host to PCI bridge> pcibus 0 on motherboard
pir0: <PCI Interrupt Routing Table: 9 Entries> on motherboard
pci0: <PCI bus> on pcib0
agp0: <VIA Generic host to PCI bridge> mem 0xe0000000-0xe7ffffff at
device 0.0 on pci0
pcib1: <PCI-PCI bridge> at device 1.0 on pci0
pci1: <PCI bus> on pcib1
pci0: <display, VGA> at device 8.0 (no driver attached)
uhci0: <VIA 83C572 USB controller> port 0xd000-0xd01f irq 11 at device
16.0 on pci0
uhci0: [GIANT-LOCKED]
usb0: <VIA 83C572 USB controller> on uhci0
usb0: USB revision 1.0
uhub0: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1: <VIA 83C572 USB controller> port 0xd400-0xd41f irq 3 at device
16.1 on pci0
uhci1: [GIANT-LOCKED]
usb1: <VIA 83C572 USB controller> on uhci1
usb1: USB revision 1.0
uhub1: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
uhci2: <VIA 83C572 USB controller> port 0xd800-0xd81f irq 10 at device
16.2 on pci0
uhci2: [GIANT-LOCKED]
usb2: <VIA 83C572 USB controller> on uhci2
usb2: USB revision 1.0
uhub2: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub2: 2 ports with 2 removable, self powered
pci0: <serial bus, USB> at device 16.3 (no driver attached)
isab0: <PCI-ISA bridge> at device 17.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <VIA 8235 UDMA133 controller> port
0xdc00-0xdc0f,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 at device 17.1 on
pci0
ata0: channel #0 on atapci0
ata1: channel #1 on atapci0
pci0: <multimedia, audio> at device 17.5 (no driver attached)
vr0: <VIA VT6102 Rhine II 10/100BaseTX> port 0xe800-0xe8ff mem
0xed001000-0xed0010ff irq 11 at device 18.0 on pci0
miibus0: <MII bus> on vr0
ukphy0: <Generic IEEE 802.3u media interface> on miibus0
ukphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
vr0: Ethernet address: 00:0d:87:00:bf:1d
cpu0 on motherboard
orm0: <ISA Option ROMs> at iomem 0xc8000-0xcffff,0xc0000-0xc7fff on isa0
pmtimer0 on isa0
atkbdc0: <Keyboard controller (i8042)> at port 0x64,0x60 on isa0
atkbd0: <AT Keyboard> flags 0x1 irq 1 on atkbdc0
atkbd0: [GIANT-LOCKED]
psm0: <PS/2 Mouse> irq 12 on atkbdc0
psm0: [GIANT-LOCKED]
psm0: model IntelliMouse, device ID 3
fdc0: <Enhanced floppy controller> at port 0x3f0-0x3f5 irq 6 drq 2 on isa0
fdc0: [FAST]
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
ppc0: <Parallel port> at port 0x378-0x37f irq 7 on isa0
ppc0: Generic chipset (ECP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/16 bytes threshold
ppbus0: <Parallel port bus> on ppc0
lpt0: <Printer> on ppbus0
lpt0: Interrupt-driven port
ppi0: <Parallel I/O> on ppbus0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
unknown: <PNP0303> can't assign resources (port)
unknown: <PNP0c02> can't assign resources (memory)
unknown: <PNP0f13> can't assign resources (irq)
unknown: <PNP0501> can't assign resources (port)
unknown: <PNP0700> can't assign resources (port)
unknown: <PNP0401> can't assign resources (port)
Timecounter "TSC" frequency 1297789521 Hz quality 800
Timecounters tick every 10.000 msec
ipfw2 initialized, divert enabled, rule-based forwarding disabled,
default to deny, logging limited to 1000 packets/entry by default
ad0: 117246MB <Maxtor 6Y120P0/YAR41VW0> [238216/16/63] at ata0-master UDMA133
acd0: CDRW <LITE-ON LTR-48246S/SS08> at ata0-slave UDMA33
ad2: 117246MB <Maxtor 6Y120P0/YAR41VW0> [238216/16/63] at ata1-master UDMA133
Mounting root from ufs:/dev/ad0s1a
WARNING: / was not properly dismounted
GEOM_VINUM: subdisk raid.p1.s0 is up
GEOM_VINUM: subdisk raid.p0.s0 is stale
GEOM_VINUM: plex sync raid.p1 -> raid.p0 started
GEOM_VINUM: sd raid.p0.s0 is initializing
GEOM_VINUM: plex raid.p0 is degraded
GEOM_VINUM: plex raid.p0 is up
GEOM_VINUM: plex sync raid.p1 -> raid.p0 finished
</End Dmesg>

--
Kendall Gifford


