Date:      Fri, 6 Dec 2002 10:22:21 -0800 (PST)
From:      Tenebrae <tenebrae_bsd@niceboots.com>
To:        Darren Pilgrim <dmp@pantherdragon.org>
Cc:        "Thomas T. Veldhouse" <veldy@veldy.net>, <freebsd-questions@FreeBSD.ORG>
Subject:   Re: ATA errors
Message-ID:  <20021206095518.Y9219-100000@steeltoe.niceboots.com>
In-Reply-To: <3DF0DF91.1050002@pantherdragon.org>

On Fri, 6 Dec 2002, Darren Pilgrim wrote:

> Thomas T. Veldhouse wrote:
> > Can anybody explain what has happened here?  My machine seems to be
> > functioning normally.
> >
> > ad0: READ command timeout tag=0 serv=0 - resetting
> > ata0: resetting devices .. ata0-slave: ATA identify retries exceeded
> > done
>
> This is almost always the sign of a bad cable, but it can also be the
> logic board on the drive dying (though much rarer).  Check your cables.
> Better yet, go to your local hardware store and buy a new ATA/100-spec
> cable, flat, not rounded, preferably with pull-loops.

I was getting errors like this recently:

ad1: READ command timeout tag=0 serv=0 - resetting
ata0-slave: timeout waiting to give command=ef s=d0 e=00
ad1: trying fallback to PIO mode
ata0: resetting devices .. done
ad1: READ command timeout tag=0 serv=0 - resetting
ata0: resetting devices .. done

This is from December 2nd.  I got a ton more similar messages in the logs
that day.  The hard drive died that night.
The sad thing is, the drive had been giving the last two lines as warning
messages every now and then for a very long time and I had ignored them.
I did try new cables.
The master drive which was on the same cable didn't have any problems.
BTW, the dead drive is an IBM Deskstar 75GXP (DTLA-307060).  I miss it.  I
wish there was some way to recover it.  30GB of data gone.  Maybe I'll try
putting it in the freezer, then dropping it into a different machine to
see if I can mount it...
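
Since the "READ command timeout" lines above showed up for a long time
before the drive died, here is a rough sketch of how one might count them
per device rather than ignore them.  It is only an illustration: the
function name count_timeouts is made up for this example, and the pattern
assumes the kernel keeps the "adN: <op> command timeout" wording seen in
the messages quoted in this post.

```python
import re
from collections import Counter

# Matches lines such as "ad1: READ command timeout tag=0 serv=0 - resetting".
# Assumption: other operations are reported with the same wording; only the
# READ form actually appears in this post.
TIMEOUT_RE = re.compile(r"\b(ad\d+): \w+ command timeout\b")


def count_timeouts(lines):
    """Return a Counter mapping device name (e.g. 'ad1') to timeout count."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# The messages quoted earlier in this post, used as sample input.
sample = [
    "ad1: READ command timeout tag=0 serv=0 - resetting",
    "ata0-slave: timeout waiting to give command=ef s=d0 e=00",
    "ad1: trying fallback to PIO mode",
    "ata0: resetting devices .. done",
    "ad1: READ command timeout tag=0 serv=0 - resetting",
    "ata0: resetting devices .. done",
]

for device, n in sorted(count_timeouts(sample).items()):
    print(f"{device}: {n} command timeout(s)")
```

In practice the lines would come from wherever the kernel messages are
logged (for example, iterating over an open log file instead of the sample
list); which file that is depends on the syslog setup.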

Here's the freaky part about the whole ordeal.
I was trying to access something on that drive on the 2nd when I noticed
it was acting weird (i.e. hanging when I tried to do a directory listing).
That's when I noticed the system was actually locking up and spitting out
the resetting devices messages.  I decided to cut my losses and later that
night I removed the faulty drive from /etc/fstab and tried to unmount it.
I waited a bit and tried to mount it again.  That ended up with a hung
mount process.  I eventually got fed up with not being able to kill that
PID and rebooted the machine remotely.  Big mistake.  My system was down
until I got in the next morning and looked at the console.  The system
booted up, went through the BIOS drive detection, looked for bootable
media in the CD-ROM drive...and then just sat there at the point where it
SHOULD have tried to boot FreeBSD.  I can only speculate that the system
was having a moment (or twenty) of silence for the lost hard drive.
Really, I have no logical explanation and would love to hear what
might have happened.
It took actually removing the steaming carcass of the dead hard drive from
the case before the system would boot again.  Now that is weird.

The end result of all this is that that particular error MAY be an
indication that the hard drive is, in fact, a flaky piece of junk that
will fail soon.
To be fair, I have had this hard drive for a year or two, so it's not like
I bought it a couple weeks ago and it's already failing.
Backups.
Do backups.
Backup solutions, no matter how expensive they may seem, are still cheaper
than data recovery companies.  :(
								-Tenebrae.
---
The sending of any unsolicited email advertising messages to this domain
may result in the imposition of civil liability against you in accordance
with Cal. Bus. & Prof. Code Section 17538.45.

