From owner-freebsd-hackers  Sun Sep 23  9:43:34 2001
Delivered-To: freebsd-hackers@freebsd.org
Received: from hokkshideh.jetcafe.org (hokkshideh.jetcafe.org [205.147.43.4])
	by hub.freebsd.org (Postfix) with ESMTP id 7738237B438
	for <freebsd-hackers@freebsd.org>; Sun, 23 Sep 2001 09:43:26 -0700 (PDT)
Received: from hokkshideh.jetcafe.org (localhost [127.0.0.1])
	by hokkshideh.jetcafe.org (8.8.8/8.8.5) with ESMTP id JAA09454
	for <freebsd-hackers@freebsd.org>; Sun, 23 Sep 2001 09:43:25 -0700 (PDT)
Message-Id: <200109231643.JAA09454@hokkshideh.jetcafe.org>
X-Mailer: exmh version 2.2 06/23/2000 with version: MH 6.8.4 #1[UCI]
To: freebsd-hackers@freebsd.org
Subject: Problems with many ATA drives
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Sun, 23 Sep 2001 09:43:25 -0700
From: Dave Hayes <dave@jetcafe.org>
Sender: owner-freebsd-hackers@FreeBSD.ORG
Precedence: bulk
List-ID: <freebsd-hackers.FreeBSD.ORG>
List-Archive: <http://docs.freebsd.org/mail/> (Web Archive)
List-Help: <mailto:majordomo@FreeBSD.ORG?subject=help> (List Instructions)
List-Subscribe: <mailto:majordomo@FreeBSD.ORG?subject=subscribe%20freebsd-hackers>
List-Unsubscribe: <mailto:majordomo@FreeBSD.ORG?subject=unsubscribe%20freebsd-hackers>
X-Loop: FreeBSD.ORG

We've been attempting to set up a vinum RAID box with a bunch of IDE
drives. Each drive has a single vinum partition, a, which spans the
entire drive. Initial partitioning is done with /stand/sysinstall so
that it "fixes" my geometry; this has always worked in the past.

I had been getting "funny" stuff from the drives, so I devised the
following simple test:

# dd if=/dev/rad1a of=/dev/null

This eventually produces:

ad1: READ command timeout tag=0 serv=0 - resetting
ata0: resetting devices .. done
ad1a: hard error reading fsbn 5068879 (ad1 bn 5068879; cn 315 tn 133 sn 25)
ad1a: hard error reading fsbn 5068879 (ad1 bn 5068879; cn 315 tn 133 sn 25) status=59 error=40
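
As a sanity check on the numbers in that message, here is a minimal
sketch of the block-to-CHS arithmetic. The 255-head, 63-sector geometry
is an assumption: it is not the 16/63 geometry dmesg reports, but it is
the one that reproduces the cn/tn/sn values printed above. The 512-byte
sector size is also assumed, and lba_to_chs is just an illustrative
helper name.

```python
def lba_to_chs(lba, heads, sectors_per_track):
    """Convert a linear block number to (cylinder, track, sector).

    The sector index is zero-based, matching the "sn" field in the
    kernel message.
    """
    cylinder, rem = divmod(lba, heads * sectors_per_track)
    track, sector = divmod(rem, sectors_per_track)
    return cylinder, track, sector


# Assumed geometry: 255 heads, 63 sectors per track.
print(lba_to_chs(5068879, 255, 63))   # (315, 133, 25)

# Byte offset of the failing block, assuming 512-byte sectors.
print(5068879 * 512)                  # 2595266048
```

So the failure sits roughly 2.4 GB into the drive, and a tn value of 133
cannot come from a 16-head geometry.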

I notice 3 out of 11 drives produce this error, so far one on each
controller (ruling out a specific controller issue). I didn't want to
just assume the failure rate of 80GB IDE drives is that large, so
I'm asking this list for its opinion:

a) Is this a bug in, or a consequence of, the software drivers? (see
bug kern/17592)

b) Or is it just that IDE drives are cheap and fail this much?
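
For anyone wanting to repeat the read test from the dd command above
and record where each drive fails instead of stopping at the first
error, a rough sketch follows. The names scan_device, chunk_size and
max_errors are made up for illustration, and skipping one whole chunk
after a failed read is a simplification, not what dd does.

```python
import os


def scan_device(path, chunk_size=64 * 1024, max_errors=16):
    """Read `path` front to back; return (bytes_read, failed_offsets).

    A failed read is recorded by its byte offset and skipped. The scan
    gives up after `max_errors` failures so a dead device cannot make
    it loop forever.
    """
    failed = []
    good = 0
    offset = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while len(failed) < max_errors:
            try:
                data = os.pread(fd, chunk_size, offset)
            except OSError:
                failed.append(offset)
                offset += chunk_size   # skip the chunk that failed
                continue
            if not data:               # end of device or file
                break
            good += len(data)
            offset += len(data)
    finally:
        os.close(fd)
    return good, failed
```

It would be called as scan_device("/dev/rad1a") for each drive in turn;
any offsets it returns can be divided by 512 to compare against the
block numbers in the kernel messages.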

Relevant data from dmesg:

atapci0: <Promise ATA100 controller> port 0xb000-0xb00f,0xb400-0xb403,0xb800-0xb807,0xd000-0xd003,0xd400-0xd407 mem 0xf5800000-0xf5803fff irq 6 at device 10.0 on pci2
ata2: at 0xd400 on atapci0
ata3: at 0xb800 on atapci0
atapci1: <Promise ATA100 controller> port 0x9400-0x940f,0x9800-0x9803,0xa000-0xa007,0xa400-0xa403,0xa800-0xa807 mem 0xf5000000-0xf5003fff irq 9 at device 11.0 on pci2
ata4: at 0xa800 on atapci1
ata5: at 0xa000 on atapci1
...
atapci2: <Intel ICH2 ATA100 controller> port 0x8800-0x880f at device 31.1 on pci0
ata0: at 0x1f0 irq 14 on atapci2
ata1: at 0x170 irq 15 on atapci2
...
ad0: 78167MB <Maxtor 4W080H6> [158816/16/63] at ata0-master UDMA100
ad1: 78167MB <Maxtor 4W080H6> [158816/16/63] at ata0-slave UDMA100
ad2: 78167MB <Maxtor 4W080H6> [158816/16/63] at ata1-master UDMA100
ad3: 78167MB <Maxtor 4W080H6> [158816/16/63] at ata1-slave UDMA100
ad4: 78167MB <Maxtor 4W080H6> [158816/16/63] at ata2-master WDMA2
ad5: 78167MB <Maxtor 4W080H6> [158816/16/63] at ata2-slave WDMA2
ad6: 78167MB <Maxtor 4W080H6> [158816/16/63] at ata3-master WDMA2
ad7: 78167MB <Maxtor 4W080H6> [158816/16/63] at ata3-slave WDMA2
ad8: 78167MB <Maxtor 4W080H6> [158816/16/63] at ata4-master WDMA2
ad9: 78167MB <Maxtor 4W080H6> [158816/16/63] at ata4-slave WDMA2
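
To tabulate which drive sits on which channel and in which transfer
mode, the "adN:" probe lines above can be parsed mechanically. This is
only a sketch: the regular expression is written against the exact line
format shown here, parse_drive is an illustrative name, and the
512-byte sector size is assumed.

```python
import re

# Matches lines such as:
# ad4: 78167MB <Maxtor 4W080H6> [158816/16/63] at ata2-master WDMA2
DRIVE_RE = re.compile(
    r"^(?P<dev>ad\d+): (?P<mb>\d+)MB <(?P<model>[^>]+)> "
    r"\[(?P<c>\d+)/(?P<h>\d+)/(?P<s>\d+)\] "
    r"at (?P<chan>ata\d+)-(?P<role>master|slave) (?P<mode>\S+)$"
)


def parse_drive(line):
    """Return a dict of fields from one probe line, or None if no match."""
    m = DRIVE_RE.match(line.strip())
    if m is None:
        return None
    d = m.groupdict()
    for key in ("mb", "c", "h", "s"):
        d[key] = int(d[key])
    # Total sectors implied by the reported C/H/S geometry.
    d["sectors"] = d["c"] * d["h"] * d["s"]
    return d


info = parse_drive(
    "ad4: 78167MB <Maxtor 4W080H6> [158816/16/63] at ata2-master WDMA2"
)
print(info["chan"], info["role"], info["mode"])   # ata2 master WDMA2
print(info["sectors"] * 512 // (1024 * 1024))     # 78167
```

The second print confirms that the 158816/16/63 geometry accounts for
the reported 78167MB.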

Yes, we know that the "WDMA2" is happening; this state proved to be
independent of whether a drive fails. It has to do with 10 drives in a tower
and cable lengths... =(
------
Dave Hayes - Consultant - Altadena CA, USA - dave@jetcafe.org 
>>> The opinions expressed above are entirely my own <<<

There is no distinctly native American criminal class except Congress.
                                                              -- Mark Twain


