Date: Sun, 23 Sep 2001 09:43:25 -0700
From: Dave Hayes <dave@jetcafe.org>
To: freebsd-hackers@freebsd.org
Subject: Problems with many ATA drives
Message-ID: <200109231643.JAA09454@hokkshideh.jetcafe.org>

We've been attempting to set up a vinum RAID box with a bunch of IDE
drives. Each drive is labeled so that partition a is a vinum partition
covering the entire drive. Initial partitioning is done with
/stand/sysinstall so it "fixes" my geometry; this has always worked in
the past.

I had been getting "funny" stuff from the drives, so I devised the
following simple test:

    # dd if=/dev/rad1a of=/dev/null

This eventually produces:

    ad1: READ command timeout tag=0 serv=0 - resetting
    ata0: resetting devices .. done
    ad1a: hard error reading fsbn 5068879 (ad1 bn 5068879; cn 315 tn 133 sn 25)ad1a: hard error reading fsbn 5068879 (ad1 bn 5068879; cn 315 tn 133 sn 25) status=59 error=40

I notice 3 out of 11 drives produce this error, so far one on each
controller (ruling out a specific controller issue). I didn't want to
just assume the failure rate of 80GB IDE drives is that large, so I'm
asking this list for its opinion:

a) Is this a bug or consequence of software drivers? (see bug kern/17592)
b) Or is it just that IDE drives are cheap and fail this much?

Relevant data from dmesg:

    atapci0: <Promise ATA100 controller> port 0xb000-0xb00f,0xb400-0xb403,0xb800-0xb807,0xd000-0xd003,0xd400-0xd407 mem 0xf5800000-0xf5803fff irq 6 at device 10.0 on pci2
    ata2: at 0xd400 on atapci0
    ata3: at 0xb800 on atapci0
    atapci1: <Promise ATA100 controller> port 0x9400-0x940f,0x9800-0x9803,0xa000-0xa007,0xa400-0xa403,0xa800-0xa807 mem 0xf5000000-0xf5003fff irq 9 at device 11.0 on pci2
    ata4: at 0xa800 on atapci1
    ata5: at 0xa000 on atapci1
    ...
    atapci2: <Intel ICH2 ATA100 controller> port 0x8800-0x880f at device 31.1 on pci0
    ata0: at 0x1f0 irq 14 on atapci2
    ata1: at 0x170 irq 15 on atapci2
    ...
    ad0: 78167MB <Maxtor 4W080H6> [158816/16/63] at ata0-master UDMA100
    ad1: 78167MB <Maxtor 4W080H6> [158816/16/63] at ata0-slave UDMA100
    ad2: 78167MB <Maxtor 4W080H6> [158816/16/63] at ata1-master UDMA100
    ad3: 78167MB <Maxtor 4W080H6> [158816/16/63] at ata1-slave UDMA100
    ad4: 78167MB <Maxtor 4W080H6> [158816/16/63] at ata2-master WDMA2
    ad5: 78167MB <Maxtor 4W080H6> [158816/16/63] at ata2-slave WDMA2
    ad6: 78167MB <Maxtor 4W080H6> [158816/16/63] at ata3-master WDMA2
    ad7: 78167MB <Maxtor 4W080H6> [158816/16/63] at ata3-slave WDMA2
    ad8: 78167MB <Maxtor 4W080H6> [158816/16/63] at ata4-master WDMA2
    ad9: 78167MB <Maxtor 4W080H6> [158816/16/63] at ata4-slave WDMA2

Yes, we know that the "WDMA2" is happening; this state proved to be
independent of a drive failing. It has to do with 10 drives in a tower
and cable lengths... =(
------
Dave Hayes - Consultant - Altadena CA, USA - dave@jetcafe.org
>>> The opinions expressed above are entirely my own <<<

There is no distinctly native American criminal class except
Congress. -- Mark Twain

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message
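
For readers trying to interpret the "hard error" line quoted in the message, the numbers in it can be unpacked with a little arithmetic. What follows is only a rough sketch under stated assumptions, not the driver's own logic: it assumes status= and error= are printed in hexadecimal, that the bit names follow the usual ATA status and error register layout of that era, and that the disklabel geometry sysinstall wrote uses 255 heads and 63 sectors per track. The 255/63 figure does not appear in the message; it is inferred because it reproduces the reported block number, while the probed [158816/16/63] geometry cannot contain head 133. The names decode and chs_to_lba are hypothetical helpers.

```python
# Bit names per the conventional ATA register layout (assumed, see lead-in).
STATUS_BITS = {
    0x80: "BSY", 0x40: "DRDY", 0x20: "DF", 0x10: "DSC",
    0x08: "DRQ", 0x04: "CORR", 0x02: "IDX", 0x01: "ERR",
}
ERROR_BITS = {
    0x80: "ICRC", 0x40: "UNC", 0x20: "MC", 0x10: "IDNF",
    0x08: "MCR", 0x04: "ABRT", 0x02: "TK0NF", 0x01: "AMNF",
}


def decode(value, names):
    """Return the names of all bits set in value, highest bit first."""
    return [name for mask, name in names.items() if value & mask]


def chs_to_lba(cyl, head, sector, heads, sectors_per_track):
    """Linear block number for a cylinder/head/sector triple.

    The sector index is treated as zero-based, which is what makes the
    numbers in the quoted log line agree with each other.
    """
    return (cyl * heads + head) * sectors_per_track + sector


if __name__ == "__main__":
    # Values copied from the log line: cn 315 tn 133 sn 25, bn 5068879.
    # heads=255 and sectors_per_track=63 are ASSUMED (see lead-in).
    print("block :", chs_to_lba(315, 133, 25, heads=255, sectors_per_track=63))
    # status=59 error=40, assumed to be hexadecimal.
    print("status:", decode(0x59, STATUS_BITS))
    print("error :", decode(0x40, ERROR_BITS))
```

If those assumptions hold, the block number works out to 5068879, matching the log, and error=40 corresponds to the UNC (uncorrectable data) bit. That reading would be consistent with a media problem on the drive rather than a cabling or transfer-mode problem, though it does not by itself rule out a driver issue.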