Date:      Mon, 19 Jan 2009 02:40:36 GMT
From:      Dylan Simon <dylan@dylex.net>
To:        freebsd-gnats-submit@FreeBSD.org
Subject:   kern/130726: DMA errors accessing multiple SATA channels
Message-ID:  <200901190240.n0J2eaBO040023@www.freebsd.org>
Resent-Message-ID: <200901190250.n0J2o3Pq047206@freefall.freebsd.org>


>Number:         130726
>Category:       kern
>Synopsis:       DMA errors accessing multiple SATA channels
>Confidential:   no
>Severity:       serious
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Mon Jan 19 02:50:03 UTC 2009
>Closed-Date:
>Last-Modified:
>Originator:     Dylan Simon
>Release:        8.0-CURRENT 20090114
>Organization:
NYU
>Environment:
FreeBSD lust.cns.nyu.edu 8.0-CURRENT FreeBSD 8.0-CURRENT #0: Wed Jan 14 19:58:58 EST 2009     dylan@lust.cns.nyu.edu:/usr/obj/usr/src/sys/SIN  amd64
>Description:
kernel: ad8: FAILURE - load data
kernel: ad8: setting up DMA failed
kernel: g_vfs_done():ad8s1e[WRITE(offset=1881014272, length=131072)]error = 5
kernel: ad6: FAILURE - load data
kernel: ad6: setting up DMA failed
kernel: g_vfs_done():ad6s1e[READ(offset=4117364736, length=32768)]error = 5
kernel: vnode_pager_getpages: I/O read error

Disk errors of this form occur after a few minutes of disk load.  They seem to occur only when operations attempt to write to SATA disks on different channels at roughly the same time.  The errors eventually result in panics or hangs.
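
For anyone collecting these messages from several machines, the g_vfs_done() lines quoted above can be broken into fields with a short script. This is only a sketch: the regular expression is written against the two lines shown in this report, and the field names (provider, disk, op, and so on) are labels chosen here, not kernel terminology. The error value 5 is EIO.

```python
import re

# Matches the g_vfs_done() lines as they appear in this report.
# The group names are illustrative labels, not kernel terminology.
G_VFS_DONE = re.compile(
    r"g_vfs_done\(\):(?P<provider>\w+)\[(?P<op>READ|WRITE)"
    r"\(offset=(?P<offset>\d+), length=(?P<length>\d+)\)\]\s*error = (?P<errno>\d+)"
)


def parse_g_vfs_done(line):
    """Return the fields of one g_vfs_done() line, or None if it does not match."""
    m = G_VFS_DONE.search(line)
    if m is None:
        return None
    provider = m.group("provider")
    # "ad8s1e" -> "ad8": keep only the leading disk name, if there is one.
    disk = re.match(r"ad\d+", provider)
    return {
        "provider": provider,
        "disk": disk.group(0) if disk else None,
        "op": m.group("op"),
        "offset": int(m.group("offset")),
        "length": int(m.group("length")),
        "errno": int(m.group("errno")),
    }


if __name__ == "__main__":
    sample = (
        "kernel: g_vfs_done():ad8s1e[WRITE(offset=1881014272, "
        "length=131072)]error = 5"
    )
    print(parse_g_vfs_done(sample))
```

Grouping the parsed records by the disk field makes it easy to check whether failures are confined to disks on different channels.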

The errors also occur with the GENERIC kernel from the 200812 snapshot, regardless of the legacy/enhanced BIOS setting.  They have been seen with gmirror, UFS, and ZFS under both NFS and local access.  They do not occur at all when the accessed disks are on the same channel (e.g., ad6+ad7 or ad8 alone in this case).  They do not occur on the same hardware with Linux under similar conditions.

Hardware: Supermicro C2SEA with 6 SATA ports on ICH10, four identical disks under various configurations.

Matching symptoms have also been seen on an ICH7 system with two disks on different channels, triggered by the daily periodic scripts; that hardware works fine with 7.1.

atacontrol list:
ATA channel 2:
    Master: acd0 <ATAPI DVD D DH16D3P/1P52> ATA/ATAPI revision 7
    Slave:       no device present
ATA channel 3:
    Master:  ad6 <ST31000333AS/CC1F> Serial ATA II
    Slave:   ad7 <ST31000333AS/CC1F> Serial ATA II
ATA channel 4:
    Master:  ad8 <ST31000333AS/CC1F> Serial ATA II
    Slave:   ad9 <ST31000333AS/CC1F> Serial ATA II
ATA channel 5:
    Master:      no device present
    Slave:       no device present
ATA channel 6:
    Master:      no device present
    Slave:       no device present
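
Since the failures depend on whether the disks involved share a channel, it can help to derive that mapping from the listing above instead of reading it by eye. The following is a rough sketch that parses text in the layout shown here; the function names are hypothetical, and the parsing assumes nothing beyond the "ATA channel N:" and "Master:/Slave:" lines seen in this report.

```python
import re

# Abbreviated copy of the listing from this report.
SAMPLE = """
ATA channel 3:
    Master:  ad6 <ST31000333AS/CC1F> Serial ATA II
    Slave:   ad7 <ST31000333AS/CC1F> Serial ATA II
ATA channel 4:
    Master:  ad8 <ST31000333AS/CC1F> Serial ATA II
    Slave:   ad9 <ST31000333AS/CC1F> Serial ATA II
ATA channel 5:
    Master:      no device present
    Slave:       no device present
"""


def disks_by_channel(listing):
    """Map channel number -> list of device names found on that channel."""
    channels = {}
    current = None
    for raw in listing.splitlines():
        line = raw.strip()
        header = re.match(r"ATA channel (\d+):", line)
        if header:
            current = int(header.group(1))
            channels[current] = []
            continue
        # Lines with a device look like "Master:  ad6 <model/firmware> ...";
        # "no device present" has no "<" after the first word and is skipped.
        device = re.match(r"(?:Master|Slave):\s+(\w+) <", line)
        if device and current is not None:
            channels[current].append(device.group(1))
    return channels


def channel_of(channels, disk):
    """Return the channel number holding `disk`, or None if it is not listed."""
    for number, devices in channels.items():
        if disk in devices:
            return number
    return None


if __name__ == "__main__":
    ch = disks_by_channel(SAMPLE)
    print(ch)
    print("ad6 and ad8 share a channel:", channel_of(ch, "ad6") == channel_of(ch, "ad8"))
```

With the listing from this report, ad6 and ad7 end up on channel 3 and ad8 and ad9 on channel 4, which matches the failing combinations described above.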

pciconf -lv (partial):
pcib3@pci0:0:30:0:      class=0x060401 card=0xb88015d9 chip=0x244e8086 rev=0x90 hdr=0x01
    vendor     = 'Intel Corporation'
    device     = '82801 Family (ICH2/3/4/4/5/5/6/7/8/9,63xxESB) Hub Interface to PCI Bridge'
    class      = bridge
    subclass   = PCI-PCI
isab0@pci0:0:31:0:      class=0x060100 card=0xb88015d9 chip=0x3a188086 rev=0x00 hdr=0x00
    vendor     = 'Intel Corporation'
    class      = bridge
    subclass   = PCI-ISA
atapci1@pci0:0:31:2:    class=0x01018f card=0xb88015d9 chip=0x3a208086 rev=0x00 hdr=0x00
    vendor     = 'Intel Corporation'
    class      = mass storage
    subclass   = ATA
atapci2@pci0:0:31:5:    class=0x010185 card=0xb88015d9 chip=0x3a268086 rev=0x00 hdr=0x00
    vendor     = 'Intel Corporation'
    class      = mass storage
    subclass   = ATA

verbose dmesg (partial):
lust kernel: atapci0: <ITE IT8213F UDMA133 controller> port 0xec00-0xec07,0xe880-0xe883,0xe800-0xe807,0xe480-0xe483,0xe400-0xe40f irq 22 at device 4.0 on pci3
lust kernel: atapci0: Reserved 0x10 bytes for rid 0x20 type 4 at 0xe400
lust kernel: ioapic0: routing intpin 22 (PCI IRQ 22) to vector 53
lust kernel: atapci0: [MPSAFE]
lust kernel: atapci0: [ITHREAD]
lust kernel: ata2: <ATA channel 0> on atapci0
lust kernel: atapci0: Reserved 0x8 bytes for rid 0x10 type 4 at 0xec00
lust kernel: atapci0: Reserved 0x4 bytes for rid 0x14 type 4 at 0xe880
lust kernel: ata2: reset tp1 mask=03 ostat0=50 ostat1=00
lust kernel: ata2: stat0=0x00 err=0x01 lsb=0x14 msb=0xeb
lust kernel: ata2: stat1=0x00 err=0x00 lsb=0x00 msb=0x00
lust kernel: ata2: reset tp2 stat0=00 stat1=00 devices=0x10000
lust kernel: ata2: [MPSAFE]
lust kernel: ata2: [ITHREAD]
lust kernel: pci3: <serial bus, FireWire> at device 8.0 (no driver attached)
lust kernel: isab0: <PCI-ISA bridge> at device 31.0 on pci0
lust kernel: isa0: <ISA bus> on isab0
lust kernel: atapci1: <Intel ICH10 SATA300 controller> port 0xc400-0xc407,0xc080-0xc083,0xc000-0xc007,0xbc00-0xbc03,0xb880-0xb88f,0xb800-0xb80f irq 19 at device 31.2 on pci0
lust kernel: atapci1: Reserved 0x10 bytes for rid 0x20 type 4 at 0xb880
lust kernel: atapci1: [MPSAFE]
lust kernel: atapci1: [ITHREAD]
lust kernel: atapci1: Reserved 0x10 bytes for rid 0x24 type 4 at 0xb800
lust kernel: ata3: <ATA channel 0> on atapci1
lust kernel: atapci1: Reserved 0x8 bytes for rid 0x10 type 4 at 0xc400
lust kernel: atapci1: Reserved 0x4 bytes for rid 0x14 type 4 at 0xc080
lust kernel: ata3: reset tp1 mask=03 ostat0=50 ostat1=50
lust kernel: ata3: stat0=0x50 err=0x01 lsb=0x00 msb=0x00
lust kernel: ata3: stat1=0x50 err=0x01 lsb=0x00 msb=0x00
lust kernel: ata3: reset tp2 stat0=50 stat1=50 devices=0x3
lust kernel: ata3: [MPSAFE]
lust kernel: ata3: [ITHREAD]
lust kernel: ata4: <ATA channel 1> on atapci1
lust kernel: atapci1: Reserved 0x8 bytes for rid 0x18 type 4 at 0xc000
lust kernel: atapci1: Reserved 0x4 bytes for rid 0x1c type 4 at 0xbc00
lust kernel: ata4: reset tp1 mask=03 ostat0=50 ostat1=50
lust kernel: ata4: stat0=0x50 err=0x01 lsb=0x00 msb=0x00
lust kernel: ata4: stat1=0x50 err=0x01 lsb=0x00 msb=0x00
lust kernel: ata4: reset tp2 stat0=50 stat1=50 devices=0x3
lust kernel: ata4: [MPSAFE]
lust kernel: ata4: [ITHREAD]
lust kernel: pci0: <serial bus, SMBus> at device 31.3 (no driver attached)
lust kernel: atapci2: <Intel ICH10 SATA300 controller> port 0xb400-0xb407,0xb080-0xb083,0xb000-0xb007,0xac00-0xac03,0xa880-0xa88f,0xa800-0xa80f irq 19 at device 31.5 on pci0
lust kernel: atapci2: Reserved 0x10 bytes for rid 0x20 type 4 at 0xa880
lust kernel: atapci2: [MPSAFE]
lust kernel: atapci2: [ITHREAD]
lust kernel: atapci2: Reserved 0x10 bytes for rid 0x24 type 4 at 0xa800
lust kernel: ata5: <ATA channel 0> on atapci2
lust kernel: atapci2: Reserved 0x8 bytes for rid 0x10 type 4 at 0xb400
lust kernel: atapci2: Reserved 0x4 bytes for rid 0x14 type 4 at 0xb080
lust kernel: ata5: reset tp1 mask=03 ostat0=7f ostat1=7f
lust kernel: ata5: stat0=0x7f err=0xff lsb=0xff msb=0xff
lust kernel: ata5: stat1=0x7f err=0xff lsb=0xff msb=0xff
lust kernel: ata5: reset tp2 stat0=ff stat1=ff devices=0x0
lust kernel: ata5: [MPSAFE]
lust kernel: ata5: [ITHREAD]
lust kernel: ata6: <ATA channel 1> on atapci2
lust kernel: atapci2: Reserved 0x8 bytes for rid 0x18 type 4 at 0xb000
lust kernel: atapci2: Reserved 0x4 bytes for rid 0x1c type 4 at 0xac00
lust kernel: ata6: reset tp1 mask=03 ostat0=7f ostat1=7f
lust kernel: ata6: stat0=0x7f err=0xff lsb=0xff msb=0xff
lust kernel: ata6: stat1=0x7f err=0xff lsb=0xff msb=0xff
lust kernel: ata6: reset tp2 stat0=ff stat1=ff devices=0x0
lust kernel: ata6: [MPSAFE]
lust kernel: ata6: [ITHREAD]
lust kernel: ata2: identify ch->devices=00010000
lust kernel: ata2-master: pio=PIO4 wdma=WDMA2 udma=UDMA33 cable=40 wire
lust kernel: acd0: setting PIO4 on IT8213F chip
lust kernel: acd0: setting UDMA33 on IT8213F chip
lust kernel: acd0: <ATAPI DVD D DH16D3P/1P52> DVDROM drive at ata2 as master
lust kernel: acd0: read 8268KB/s (8268KB/s), 198KB buffer, UDMA33
lust kernel: acd0: Reads: CDR, CDRW, CDDA stream, DVDROM, DVDR, DVDRAM, packet
lust kernel: acd0: Writes:
lust kernel: acd0: Audio: play, 256 volume levels
lust kernel: acd0: Mechanism: ejectable tray, unlocked
lust kernel: acd0: Medium: no/blank disc
lust kernel: ata3: identify ch->devices=00000003
lust kernel: ata3-master: pio=PIO4 wdma=WDMA2 udma=UDMA133 cable=40 wire
lust kernel: ata3-slave: pio=PIO4 wdma=WDMA2 udma=UDMA133 cable=40 wire
lust kernel: ad6: 953869MB <Seagate ST31000333AS CC1F> at ata3-master SATA300
lust kernel: ad6: 1953525168 sectors [1938021C/16H/63S] 16 sectors/interrupt 1 depth queue
lust kernel: GEOM: new disk ad6
lust kernel: ad7: 953869MB <Seagate ST31000333AS CC1F> at ata3-slave SATA300
lust kernel: ad7: 1953525168 sectors [1938021C/16H/63S] 16 sectors/interrupt 1 depth queue
lust kernel: ata4: identify ch->devices=00000003
lust kernel: ata4-master: pio=PIO4 wdma=WDMA2 udma=UDMA133 cable=40 wire
lust kernel: GEOM: new disk ad7
lust kernel: ata4-slave: pio=PIO4 wdma=WDMA2 udma=UDMA133 cable=40 wire
lust kernel: ad8: 953869MB <Seagate ST31000333AS CC1F> at ata4-master SATA300
lust kernel: ad8: 1953525168 sectors [1938021C/16H/63S] 16 sectors/interrupt 1 depth queue
lust kernel: GEOM: new disk ad8
lust kernel: ad9: 953869MB <Seagate ST31000333AS CC1F> at ata4-slave SATA300
lust kernel: ad9: 1953525168 sectors [1938021C/16H/63S] 16 sectors/interrupt 1 depth queue
lust kernel: ata5: identify ch->devices=00000000
lust kernel: ata6: identify ch->devices=00000000
lust kernel: ioapic0: Assigning ISA IRQ 1 to local APIC 0
lust kernel: ioapic0: Assigning ISA IRQ 9 to local APIC 1
lust kernel: ioapic0: Assigning PCI IRQ 17 to local APIC 0
lust kernel: ioapic0: Assigning PCI IRQ 18 to local APIC 1
lust kernel: ioapic0: Assigning PCI IRQ 19 to local APIC 0
lust kernel: ioapic0: Assigning PCI IRQ 22 to local APIC 1
lust kernel: ioapic0: Assigning PCI IRQ 23 to local APIC 0
lust kernel: GEOM: new disk ad9
>How-To-Repeat:
Perform I/O that includes writes to SATA disks on different channels of an ICH controller at roughly the same time.  The errors appear after a few minutes of such load.
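
One possible way to generate that kind of load is sketched below. This is not the workload that originally triggered the problem (that was ordinary gmirror/UFS/ZFS use, locally and over NFS); it is only an assumed stand-in that writes to several files at once, one thread per file. The paths in the usage example are hypothetical and should point at filesystems on disks attached to different channels (for example ad6 and ad8 on this machine). The 128 KiB block size simply mirrors the length of the failing write in the log above.

```python
import os
import threading


def write_load(path, total_bytes, block_size, errors):
    """Write total_bytes of zeros to path in block_size chunks, then fsync."""
    block = b"\0" * block_size
    try:
        with open(path, "wb") as f:
            written = 0
            while written < total_bytes:
                f.write(block)
                written += block_size
            f.flush()
            os.fsync(f.fileno())  # push the data to the disk, not just the cache
    except OSError as exc:
        # An I/O failure such as EIO (error = 5) would be recorded here.
        errors[path] = exc


def concurrent_writes(paths, total_bytes=64 * 1024 * 1024, block_size=128 * 1024):
    """Write to every path at the same time; return {path: exception} for failures."""
    errors = {}
    threads = [
        threading.Thread(target=write_load, args=(p, total_bytes, block_size, errors))
        for p in paths
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return errors


if __name__ == "__main__":
    # Hypothetical mount points, one per disk on a different channel.
    targets = ["/mnt/ad6/stress.bin", "/mnt/ad8/stress.bin"]
    for round_number in range(100):
        failed = concurrent_writes(targets)
        if failed:
            print("round", round_number, "failed:", failed)
            break
```

Running the same script with both targets on one channel (for example ad6 and ad7) gives a comparison case, since the report says the errors do not occur there.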
>Fix:


>Release-Note:
>Audit-Trail:
>Unformatted:


