Date: Fri, 28 Sep 2001 21:31:25 +0200
From: "Karsten W. Rohrbach" <karsten@rohrbach.de>
To: Dave Hayes <dave@jetcafe.org>
Cc: freebsd-hackers@freebsd.org
Subject: Re: Problems with many ATA drives
Message-ID: <20010928213125.A33572@mail.webmonster.de>
In-Reply-To: <200109231643.JAA09454@hokkshideh.jetcafe.org>; from dave@jetcafe.org on Sun, Sep 23, 2001 at 09:43:25AM -0700
References: <200109231643.JAA09454@hokkshideh.jetcafe.org>

Dave Hayes(dave@jetcafe.org)@2001.09.23 09:43:25 +0000:
> We've been attempting to set up a vinum raid box with a bunch of IDE
> drives. Each drive is partitioned with a vinum partition on A, such
> that the entire drive is on partition a. Initial partitioning is done
> with /stand/sysinstall so it "fixes" my geometry, this has always
> worked in the past.
>
> I had been getting "funny" stuff from the drives, so I devised the
> following simple test:
>
> # dd if=/dev/rad1a of=/dev/null
>
> This eventually produces:
>
> ad1: READ command timeout tag=0 serv=0 - resetting
> ata0: resetting devices .. done
> ad1a: hard error reading fsbn 5068879 (ad1 bn 5068879; cn 315 tn 133 sn 25)ad1a: hard error reading fsbn 5068879 (ad1 bn 5068879; cn 315 tn 133 sn 25) status=59 error=40
>
> I notice 3 out of 11 drives produce this error, so far one on each
> controller (ruling out a specific controller issue). I didn't want to
> just assume the failure rate of 80GB IDE drives is that large, so
> I'm asking this list for its opinion:

media errors due to broken qa in production? i have not seen that
with maxtor drives yet. several ibm drives (DTLA, 45 and 75gb) were
fried in my workstation over the last few weeks, all with the same
error. after reading several posts on the linux-kernel mailing list it
seems to me that the smart firmware on the drives might be b0rked (the
ibm case). i did not experience any problems with the maxtor 80gb
(4W*) drives.

to me it all boils down to this: high capacity cheap-o ide drives suck
because they cut costs in firmware development and quality assurance.
for mission critical server hardware i am still building servers on
scsi u3w with 32gb ibm disks (DDYS) without a single outage in
hundreds of units.

besides that, what cabling are you using?

cheers,
/k

> a) Is this a bug or consequence of software drivers? (see
> bug kern/17592)
>
> b) Or is it just that IDE drives are cheap and fail this much?
>
> Relevant data from dmesg:
>
> atapci0: <Promise ATA100 controller> port 0xb000-0xb00f,0xb400-0xb403,0xb800-0xb807,0xd000-0xd003,0xd400-0xd407 mem 0xf5800000-0xf5803fff irq 6 at device 10.0 on pci2
> ata2: at 0xd400 on atapci0
> ata3: at 0xb800 on atapci0
> atapci1: <Promise ATA100 controller> port 0x9400-0x940f,0x9800-0x9803,0xa000-0xa007,0xa400-0xa403,0xa800-0xa807 mem 0xf5000000-0xf5003fff irq 9 at device 11.0 on pci2
> ata4: at 0xa800 on atapci1
> ata5: at 0xa000 on atapci1
> ...
> atapci2: <Intel ICH2 ATA100 controller> port 0x8800-0x880f at device 31.1 on pci0
> ata0: at 0x1f0 irq 14 on atapci2
> ata1: at 0x170 irq 15 on atapci2
> ...
> ad0: 78167MB <Maxtor 4W080H6> [158816/16/63] at ata0-master UDMA100
> ad1: 78167MB <Maxtor 4W080H6> [158816/16/63] at ata0-slave UDMA100
> ad2: 78167MB <Maxtor 4W080H6> [158816/16/63] at ata1-master UDMA100
> ad3: 78167MB <Maxtor 4W080H6> [158816/16/63] at ata1-slave UDMA100
> ad4: 78167MB <Maxtor 4W080H6> [158816/16/63] at ata2-master WDMA2
> ad5: 78167MB <Maxtor 4W080H6> [158816/16/63] at ata2-slave WDMA2
> ad6: 78167MB <Maxtor 4W080H6> [158816/16/63] at ata3-master WDMA2
> ad7: 78167MB <Maxtor 4W080H6> [158816/16/63] at ata3-slave WDMA2
> ad8: 78167MB <Maxtor 4W080H6> [158816/16/63] at ata4-master WDMA2
> ad9: 78167MB <Maxtor 4W080H6> [158816/16/63] at ata4-slave WDMA2
>
> Yes, we know that the "WDMA2" is happening, this state proved to be
> independent of a drive failing. It has to do with 10 drives in a tower
> and cable lengths... =(
> ------
> Dave Hayes - Consultant - Altadena CA, USA - dave@jetcafe.org
> >>> The opinions expressed above are entirely my own <<<
>
> There is no distinctly native American criminal class except Congress.
> -- Mark Twain
>
>
>
> To Unsubscribe: send mail to majordomo@FreeBSD.org
> with "unsubscribe freebsd-hackers" in the body of the message

--
> To avoid criticism, do nothing, say nothing, be nothing. --Elbert Hubbard
KR433/KR11-RIPE -- WebMonster Community Founder -- nGENn GmbH Senior Techie
http://www.webmonster.de/ -- ftp://ftp.webmonster.de/ -- http://www.ngenn.net/
karsten&rohrbach.de -- alpha&ngenn.net -- alpha&scene.org -- catch@spam.de
GnuPG 0x2964BF46 2001-03-15 42F9 9FFF 50D4 2F38 DBEE DF22 3340 4F4E 2964 BF46
Please do not remove my address from To: and Cc: fields in mailing lists. 10x
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20010928213125.A33572>