Date: Wed, 18 Oct 2000 08:05:56 -0700 (PDT)
From: rasmith@aristotle.tamu.edu
To: freebsd-gnats-submit@FreeBSD.org
Subject: kern/22086: DMA errors during intensive disk activity on vinum volume
Message-ID: <20001018150556.5946F37B4E5@hub.freebsd.org>
>Number: 22086
>Category: kern
>Synopsis: DMA errors during intensive disk activity on vinum volume
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: freebsd-bugs
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Wed Oct 18 08:10:01 PDT 2000
>Closed-Date:
>Last-Modified:
>Originator: Robin Smith
>Release: 4.1.1-RELEASE #1
>Organization:
Dept. of Philosophy, Texas A&M University
>Environment:
FreeBSD test-drive.tamu.edu 4.1.1-RELEASE FreeBSD 4.1.1-RELEASE #1: Tue Oct 10 18:09:51 CDT 2000 root@test-drive.tamu.edu:/usr/src/sys/compile/TEST-DRIVE i386
>Description:
System is a dual-PIII with 512MB RAM and 4 ATA disks (see below); a
vinum raid5 volume, striped across all four disks, is mounted as /usr.
Heavy access to the raid5 volume causes DMA errors; if the system does
not recover (i.e. a sync fails after multiple attempts), a reboot is
triggered.
The initial error message is of the following form:
Oct 18 08:36:19 test-drive /kernel: ata0-slave: zero length DMA transfer attempted
This message may mention either controller. If there is no recovery,
then either the terminal hangs indefinitely or the system restarts.
From a hang caused by 'du /usr' (which worked its way to /usr/src/games
before giving up):
Oct 18 08:36:19 test-drive /kernel: ata0-slave: zero length DMA transfer attempted
Oct 18 08:36:29 test-drive /kernel: ad1: WRITE command timeout - resetting
Oct 18 08:36:29 test-drive /kernel: ata0: resetting devices .. done
Oct 18 08:36:29 test-drive /kernel: ata0-slave: zero length DMA transfer attempted
The error message "zero length DMA transfer attempted" is found in
/
>How-To-Repeat:
Any intensive disk activity on the vinum raid5 volume triggers it. A good
choice is 'du /usr'.
Here is some tracking information on the error message "zero length DMA transfer":
512: grep -C10 "zero length DMA transfer" /usr/src/sys/dev/ata/ata-dma.c
ata_dmasetup(struct ata_softc *scp, int32_t device,
             int8_t *data, int32_t count, int32_t flags)
###                                ^^^^ ARGUMENT 4
{
    struct ata_dmaentry *dmatab;
    u_int32_t dma_count, dma_base;
    int i = 0;
    if (((uintptr_t)data & 1) || (count & 1))
        return -1;
##### FOR COUNT=0
    if (!count) {
        ata_printf(scp, device, "zero length DMA transfer attempted\n");
        return -1;
    }
    dmatab = scp->dmatab[ATA_DEV(device)];
    dma_base = vtophys(data);
    dma_count = min(count, (PAGE_SIZE - ((uintptr_t)data & PAGE_MASK)));
    data += dma_count;
    count -= dma_count;
    while (count) {
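To make the entry checks in the excerpt above easier to follow, here is
a minimal stand-alone sketch of them in user-space C. It is not the
driver code: the function name and the SKETCH_* macros are hypothetical,
and the 4096-byte page size is an assumption that matches i386. It
returns -1 in the same cases where ata_dmasetup() returns -1, and
otherwise the size of the first DMA segment, which runs at most to the
end of the page containing the buffer start.

```c
#include <stdint.h>

/* Assumed for illustration: 4 KB pages, as on i386. */
#define SKETCH_PAGE_SIZE 4096u
#define SKETCH_PAGE_MASK (SKETCH_PAGE_SIZE - 1u)

/*
 * Hypothetical model of the checks at the top of ata_dmasetup().
 * addr  stands in for the buffer address (data),
 * count stands in for the byte count (argument 4).
 */
int32_t
sketch_first_chunk(uintptr_t addr, int32_t count)
{
    uint32_t to_page_end;

    /* An odd address or an odd byte count is rejected. */
    if ((addr & 1) || (count & 1))
        return -1;

    /* This is the case that logs "zero length DMA transfer attempted". */
    if (!count)
        return -1;

    /* The first segment stops at the next page boundary. */
    to_page_end = SKETCH_PAGE_SIZE - (uint32_t)(addr & SKETCH_PAGE_MASK);
    return ((uint32_t)count < to_page_end) ? count : (int32_t)to_page_end;
}
```

In this sketch, sketch_first_chunk(0x1f00, 8192) yields a 256-byte first
segment, while any call with count == 0 takes the same early -1 return
that produces the message in the log.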
Called by:
516: grep dmasetup /usr/src/sys/dev/ata/* |less
/usr/src/sys/dev/ata/ata-all.h:int32_t ata_dmasetup(struct ata_softc *, int32_t, int8_t *, int32_t, int32_t);
/usr/src/sys/dev/ata/ata-disk.c: !ata_dmasetup(adp->controller, adp->unit,
/usr/src/sys/dev/ata/ata-dma.c:ata_dmasetup(struct ata_softc *scp, int32_t device,
/usr/src/sys/dev/ata/ata-dma.c:ata_dmasetup(struct ata_softc *scp, int32_t device,
/usr/src/sys/dev/ata/atapi-all.c: !ata_dmasetup(atp->controller, atp->unit,
In ad_transfer(struct ad_request *request):
524: grep -C6 -n dmasetup /usr/src/sys/dev/ata/ata-disk.c
394-
395-    devstat_start_transaction(&adp->stats);
396-
397-    /* does this drive & transfer work with DMA ? */
398-    request->flags &= ~ADR_F_DMA_USED;
399-    if ((adp->controller->mode[ATA_DEV(adp->unit)] >= ATA_DMA) &&
400:        !ata_dmasetup(adp->controller, adp->unit,
401-                      (void *)request->data, request->bytecount,
#######                                                   ^^^^^^^^^ = count ####
402-                      (request->flags & ADR_F_READ))) {
403-        request->flags |= ADR_F_DMA_USED;
404-        cmd = request->flags&ADR_F_READ ? ATA_C_READ_DMA : ATA_C_WRITE_DMA;
405-        request->currentsize = request->bytecount;
406-    }
The only call to ata_dmasetup in ata-disk is from ad_transfer.
There are two calls to ad_transfer:
At line 570:
    else {
        request->bytecount -= request->currentsize;
        request->donecount += request->currentsize;
        if (request->bytecount > 0) {
            ad_transfer(request);
            return ATA_OP_CONTINUES;
        }
At line 290:
    while (request.bytecount > 0) {
        ad_transfer(&request);
Both are protected by comparisons; bytecount==0 should never happen.
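The reasoning above can be checked against a small user-space model. This
is only a sketch under stated assumptions: the struct and the sketch_*
functions are hypothetical reductions of struct ad_request and the two
quoted call sites, the per-transfer limit max_chunk is invented for the
model (it must be greater than zero), and nothing here touches hardware.
It counts how often the stand-in for ad_transfer() is entered with a
zero bytecount.

```c
#include <stdint.h>

/* Hypothetical reduced request; the real struct ad_request has more fields. */
struct sketch_request {
    int32_t bytecount;   /* bytes still to transfer */
    int32_t currentsize; /* bytes covered by the transfer in flight */
    int32_t donecount;   /* bytes completed so far */
    int     zero_setups; /* times a transfer was started with bytecount == 0 */
};

/* Stand-in for ad_transfer(): notes a zero-length attempt, then sizes the chunk. */
void
sketch_transfer(struct sketch_request *r, int32_t max_chunk)
{
    if (r->bytecount == 0)
        r->zero_setups++;
    r->currentsize = (r->bytecount < max_chunk) ? r->bytecount : max_chunk;
}

/* Mirrors the completion path quoted at line 570; returns 1 if it continues. */
int
sketch_complete(struct sketch_request *r, int32_t max_chunk)
{
    r->bytecount -= r->currentsize;
    r->donecount += r->currentsize;
    if (r->bytecount > 0) {
        sketch_transfer(r, max_chunk);
        return 1;
    }
    return 0;
}

/* Runs one request to completion; returns the number of zero-length attempts. */
int
sketch_run(int32_t total, int32_t max_chunk, int32_t *done)
{
    struct sketch_request r = { total, 0, 0, 0 };

    /* Same comparison as the guard quoted at line 290. */
    if (r.bytecount > 0)
        sketch_transfer(&r, max_chunk);
    while (sketch_complete(&r, max_chunk))
        ;
    *done = r.donecount;
    return r.zero_setups;
}
```

For any total and any positive max_chunk, sketch_run() reports zero
zero-length attempts, which agrees with the observation that both call
sites are guarded. The model covers only the two quoted call sites; it
says nothing about whether bytecount can change between the comparison
and the call on the real system.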
root@test-drive [/usr/home/rasmith]
525: grep -C6 -n dmasetup /usr/src/sys/dev/ata/atapi-all.c
254-    if ((atp->controller->mode[ATA_DEV(atp->unit)] >= ATA_DMA) &&
255-        (request->ccb[0] == ATAPI_READ ||
256-         request->ccb[0] == ATAPI_READ_BIG ||
257-         ((request->ccb[0] == ATAPI_WRITE ||
258-           request->ccb[0] == ATAPI_WRITE_BIG) &&
259-          !(atp->controller->flags & ATA_ATAPI_DMA_RO))) &&
260:        !ata_dmasetup(atp->controller, atp->unit,
261-                      (void *)request->data, request->bytecount,
###########                                      ^^^^^^^^^^^^^ = count ####
262-                      request->flags & ATPR_F_READ)) {
263-        request->flags |= ATPR_F_DMA_USED;
264-    }
265-
266-    /* start ATAPI operation */
518: cat /var/log/mount.today
/dev/ad0s1a / ufs rw 1 1
/dev/ad0s1e /mnt/tmp ufs rw 2 2
/dev/ad0s1f /var ufs rw 2 2
/dev/vinum/raid5 /usr ufs rw 2 2
procfs /proc procfs rw 0 0
520: cat /var/log/dmesg.today
Copyright (c) 1992-2000 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 4.1.1-RELEASE #1: Tue Oct 10 18:09:51 CDT 2000
root@test-drive.tamu.edu:/usr/src/sys/compile/TEST-DRIVE
Timecounter "i8254" frequency 1193182 Hz
CPU: Pentium III/Pentium III Xeon/Celeron (501.14-MHz 686-class CPU)
Origin = "GenuineIntel" Id = 0x683 Stepping = 3
Features=0x383fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,XMM>
real memory = 536870912 (524288K bytes)
avail memory = 518524928 (506372K bytes)
Programming 24 pins in IOAPIC #0
IOAPIC #0 intpin 2 -> irq 0
IOAPIC #0 intpin 16 -> irq 5
IOAPIC #0 intpin 17 -> irq 11
IOAPIC #0 intpin 18 -> irq 10
IOAPIC #0 intpin 19 -> irq 9
FreeBSD/SMP: Multiprocessor motherboard
cpu0 (BSP): apic id: 0, version: 0x00040011, at 0xfee00000
cpu1 (AP): apic id: 1, version: 0x00040011, at 0xfee00000
io0 (APIC): apic id: 2, version: 0x00170011, at 0xfec00000
Preloaded elf kernel "kernel" at 0xc041d000.
Pentium Pro MTRR support enabled
md0: Malloc disk
npx0: <math processor> on motherboard
npx0: INT 16 interface
pcib0: <Intel 82443GX host to PCI bridge> on motherboard
pci0: <PCI bus> on pcib0
pcib2: <Intel 82443GX (440 GX) PCI-PCI (AGP) bridge> at device 1.0 on pci0
pci1: <PCI bus> on pcib2
pci1: <Trident model 9750 VGA-compatible display device> at 0.0 irq 11
isab0: <Intel 82371AB PCI to ISA bridge> at device 7.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Intel PIIX4 ATA33 controller> port 0xffa0-0xffaf at device 7.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata1: at 0x170 irq 15 on atapci0
uhci0: <Intel 82371AB/EB (PIIX4) USB controller> port 0xef80-0xef9f irq 9 at device 7.2 on pci0
usb0: <Intel 82371AB/EB (PIIX4) USB controller> on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
Timecounter "PIIX" frequency 3579545 Hz
chip1: <Intel 82371AB Power management controller> port 0x440-0x44f at device 7.3 on pci0
xl0: <3Com 3c905B-TX Fast Etherlink XL> port 0xec00-0xec7f mem 0xfebfef80-0xfebfefff irq 5 at device 15.0 on pci0
xl0: Ethernet address: 00:01:02:be:6d:85
miibus0: <MII bus> on xl0
xlphy0: <3Com internal media interface> on miibus0
xlphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
xl1: <3Com 3c905B-TX Fast Etherlink XL> port 0xe480-0xe4ff mem 0xfebfef00-0xfebfef7f irq 11 at device 16.0 on pci0
xl1: Ethernet address: 00:01:02:be:6d:41
miibus1: <MII bus> on xl1
xlphy1: <3Com internal media interface> on miibus1
xlphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
ahc0: <Adaptec 29160 Ultra160 SCSI adapter> port 0xe800-0xe8ff mem 0xfebff000-0xfebfffff irq 10 at device 18.0 on pci0
aic7892: Wide Channel A, SCSI Id=7, 32/255 SCBs
pcib1: <Intel 82443GX host to AGP bridge> on motherboard
pci2: <PCI bus> on pcib1
fdc0: <NEC 72065B or clone> at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
fdc0: FIFO enabled, 8 bytes threshold
atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
atkbd0: <AT Keyboard> flags 0x1 irq 1 on atkbdc0
kbd0 at atkbd0
psm0: <PS/2 Mouse> irq 12 on atkbdc0
psm0: model MouseMan+, device ID 0
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A
sio1 at port 0x2f8-0x2ff irq 3 on isa0
sio1: type 16550A
ppc0: <Parallel port> at port 0x378-0x37f irq 7 on isa0
lip0: <PLIP network interface> on ppbus0
lpt0: <Printer> on ppbus0
lpt0: Interrupt-driven port
ppi0: <Parallel I/O> on ppbus0
APIC_IO: Testing 8254 interrupt delivery
APIC_IO: routing 8254 via IOAPIC #0 intpin 2
SMP: AP CPU #1 Launched!
ad0: 29314MB <IBM-DTLA-307030> [59560/16/63] at ata0-master using UDMA33
ad1: 29314MB <IBM-DTLA-307030> [59560/16/63] at ata0-slave using UDMA33
ad2: 29314MB <IBM-DTLA-307030> [59560/16/63] at ata1-master using UDMA33
ad3: 29314MB <IBM-DTLA-307030> [59560/16/63] at ata1-slave using UDMA33
Waiting 15 seconds for SCSI devices to settle
Mounting root from ufs:/dev/ad0s1a
WARNING: / was not properly dismounted
vinum: loaded
vinum: reading configuration from /dev/ad3s1e
vinum: updating configuration from /dev/ad2s1e
vinum: updating configuration from /dev/ad1s1e
vinum: updating configuration from /dev/ad0s1h
cd0 at ahc0 bus 0 target 4 lun 0
cd0: <TOSHIBA CD-ROM XM-6401TA 1001> Removable CD-ROM SCSI-2 device
cd0: 20.000MB/s transfers (20.000MHz, offset 16)
cd0: cd present [328499 x 2048 byte records]
>Fix:
None known
>Release-Note:
>Audit-Trail:
>Unformatted:
