Date:      Sat, 08 Sep 2001 18:22:47 +0100
From:      ian j hart <ianjhart@ntlworld.com>
To:        Marius Strom <marius@marius.org>
Cc:        "stable@FreeBSD.ORG" <stable@FreeBSD.ORG>
Subject:   Re: UDMA ICRC error reading fsb (?)
Message-ID:  <3B9A53E7.7479CB1A@ntlworld.com>
References:  <20010906204356.A4116@nc.rr.com> <auto-000028388966@dc-mx05.cluster1.charter.net> <20010907180403.A1472@nc.rr.com> <auto-000027449537@dc-mx04.cluster1.charter.net> <3B994E08.FF3BE9C4@ntlworld.com> <20010907234848.A1323@marius.org> <3B9A1A08.539E25ED@ntlworld.com> <20010908104324.B1323@marius.org>

Marius Strom wrote:

Apologies if I'm telling you what you already know, but for the benefit
of the thread, and because this comes up regularly...

> 
> By dead I mean rapidly increasing numbers of bad sectors on the drive
> with every write.  I replaced the cable to no avail.  The hard drive had
> otherwise functioned well for ~7-8 months.  UDMA ICRC errors started
> popping up ~2 months after installation, and then last night suddenly I
> was unable to hit a slew of sectors.
> 
> HD was a WD Caviar 30 Gig.

That's not enough info. The interaction between the controller and the
HDD is important. Send a dmesg with this drive attached.

Unfortunately this is exactly the symptom you get when you have either
a cable-length problem or a controller/HDD mismatch. Do the problems go
away when you disable UDMA with sysctl?
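
A rough sketch of the sysctl test, under stated assumptions: that the
ata driver on this release exposes the transfer modes as the
hw.atamodes sysctl, and that its value is a comma-separated list such
as dma,pio,---,--- with one entry per device slot. Neither is confirmed
in this thread, so check the output of "sysctl hw.atamodes" first. The
helper name to_pio is made up for illustration.

```sh
#!/bin/sh
# to_pio: rewrite an hw.atamodes-style value so that every "dma" entry
# becomes "pio". Entries for empty slots (assumed to be shown as "---")
# are left untouched.
to_pio() {
    printf '%s\n' "$1" | sed 's/dma/pio/g'
}

# Intended use on the affected machine, as root
# (assumption: the hw.atamodes sysctl exists and is writable):
#   sysctl -w hw.atamodes="$(to_pio "$(sysctl -n hw.atamodes)")"
```

If the ICRC errors stop once every device is in PIO mode, that points
at the UDMA signalling path (cable, controller) rather than the platters.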

This looks like the same time frame as my problems at home. I suspect
that changes to the ata code (i.e. better UDMA support) tickled a
hardware bug. (This is on the VIA UDMA33 chipset.)

When I replaced these drives the situation got WORSE because the new
drives were faster than the old. This eventually led me to fit very
short cables - which fixed the problem.

Don't make the mistake of thinking you have fixed your problem until
you verify that the old drive really is scrap. Even then, new drives can
give you a new set of problems.

Assuming you now have the drive copied, try
# dd if=/dev/zero ...
with and without UDMA. Verify that the failing block numbers are the
same each time.

Make sure you zero the right drive ;)
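
One way to do the block-number comparison is sketched below. It assumes
you have saved the kernel messages from each dd run into two files (the
names udma.log and pio.log are only examples) and that the error lines
have the shape "<dev>: ... error <verb> fsbn <N> ...", like the ICRC
line quoted further down. The function names are invented for this
sketch.

```sh
#!/bin/sh
# error_blocks: read kernel messages on stdin and print the fsbn number
# from every "... error <verb> fsbn <N> ..." line, sorted and de-duplicated.
error_blocks() {
    sed -n 's/.*error [a-z]* fsbn \([0-9][0-9]*\).*/\1/p' | sort -n -u
}

# same_blocks: succeed (exit status 0) only if both log files report
# exactly the same set of failing block numbers.
same_blocks() {
    a=$(error_blocks < "$1")
    b=$(error_blocks < "$2")
    [ "$a" = "$b" ]
}

# Example use (file names are hypothetical):
#   same_blocks udma.log pio.log && echo "same blocks fail in both runs"
```

One plausible reading of the result: the same blocks failing in both
runs suggests the media itself, while failures that move around or only
appear with UDMA enabled suggest the cable or controller.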

If Maxtor have a test utility you could try that. I had 4 Seagate
drives (at work) which were apparently scrap. Two failed the Seagate
diagnostic and hit the bin at 9.8m/s/s. Two were OK and worked fine when
dropped to PIO mode. Cheap and nasty M/B, with an early UDMA66 controller!

> 
> On Sat, Sep 08, 2001 at 02:15:52PM +0100, ian j hart wrote:
> > Marius Strom wrote:
> > >
> > > FWIW, I had this error and assumed it was cabling.  I've spent the last
> > > few hours copying data off what is now a dead disk and putting it back on
> > > a new disk.  YMMV.
> >
> > What do you mean by "dead"? By my definition of "dead" (e.g. the drive
> > doesn't spin), being able to copy data off it is a neat trick.
> >
> > What did you do about the cabling problem?
> > What hardware do you have?
> >
> > >
> > > On Fri, Sep 07, 2001 at 11:45:28PM +0100, ian j hart wrote:
> > > > Dave Uhring wrote:
> > > > >
> > > > > On Friday 07 September 2001 17:04, Randall Hopper wrote:
> > > > > > Dave Uhring:
> > > > > >  |On Thursday 06 September 2001 07:43 pm, Randall Hopper wrote:
> > > > > >  |>      What do these messages mean?  Are CRCs done by the IDE
> > > > > >  |> controller on DMA transfers and they're coming up wrong?
> > > > > >  |>
> > > > > >  |> ad0s2a: UDMA ICRC error writing fsbn 3283483 of 396704-396713
> > > > > >  |> (ad0s2 bn 3283483; cn 204 tn 98 sn 49) retrying
> > > > > >  |
> > > > > >  |Your drive is dying.  Back it up and replace it.
> > > >
> > > > I think you are being a tad premature.
> > > >
> > > > There have been plenty of posts on this subject, both on stable and
> > > > hardware. IIRC none of them were bad disks.
> > > >
> > > > Randall,
> > > > 1) post a copy of dmesg so we can see what hardware you have.
> > > > 2) measure the cable - M/B to drive.
> > > >
> > > > > >
> > > > > > Ok, thanks.  But what do these messages "mean" on a technical level?
> > > > > >
> > > > > > And could these just as well indicate a marginal cable, bad
> > > > > > connector, loose connector, or the other hard drive on the controller
> > > > > > being a bit flakey?
> > > > > >
> > > > > > Randall
> > > > >
> > > > > CRC's (16 bit cyclic redundancy check characters) have been done on
> > > > > controllers since we had to use floppies.  The first Winchester drive
> > > > > interface I ever designed back in 1979 had a Fairchild 9401 (IIRC) CRC
> > > > > generator chip on it.  The writes are failing.  You "may" have marginal
> > > > > cabling or loose or corroded connectors.
> > > > >
> > > > > If you wish to keep using the drive, replace the cable and in doing so
> > > > > your contacts will also wipe clean.
> > > > >
> > > > > To Unsubscribe: send mail to majordomo@FreeBSD.org
> > > > > with "unsubscribe freebsd-stable" in the body of the message
> > > >
> > > > --
> > > > ian j hart
> > > >
> > > > To Unsubscribe: send mail to majordomo@FreeBSD.org
> > > > with "unsubscribe freebsd-stable" in the body of the message
> > >
> > > --
> > > Marius Strom <marius@marius.org>
> > > Professional Geek/Unix System Administrator
> > > URL: http://www.marius.org/
> > > http://www.marius.org/marius.pgp 0xF5D89089 *updated 2001-02-26*
> > >
> > > It is a natural law. Physics tells us that for every action, there must be an
> > > equal and opposite reaction. They hate us, we hate them, they hate us back and
> > > so, here we are, victims of mathematics.
> > > -- Londo, "A Voice in the Wilderness I"
> >
> > --
> > ian j hart
> 
> --
> Marius Strom <marius@marius.org>
> Professional Geek/Unix System Administrator
> URL: http://www.marius.org/
> http://www.marius.org/marius.pgp 0xF5D89089 *updated 2001-02-26*
> 
> It is a natural law. Physics tells us that for every action, there must be an
> equal and opposite reaction. They hate us, we hate them, they hate us back and
> so, here we are, victims of mathematics.
> -- Londo, "A Voice in the Wilderness I"

-- 
ian j hart




