Date:      Mon, 24 Aug 2009 14:13:22 -0600
From:      Tim Judd <tajudd@gmail.com>
To:        Kelly Martin <kellymartin@gmail.com>
Cc:        FreeBSD Questions <freebsd-questions@freebsd.org>
Subject:   Re: hard disk failure - now what?
Message-ID:  <ade45ae90908241313y495832edkd87004485602a42e@mail.gmail.com>
In-Reply-To: <1338880b0908241129p75b6845cg26d21804e118364@mail.gmail.com>
References:  <1338880b0908241129p75b6845cg26d21804e118364@mail.gmail.com>

On 8/24/09, Kelly Martin <kellymartin@gmail.com> wrote:
> I just experienced a hard drive failure on one of my FreeBSD 7.2
> production servers with no backup! I am so mad at myself for not
> backing up!! Now it's a salvage operation. Here are the type of errors
> I was getting on the console, over-and-over:
>
> ad4: TIMEOUT - WRITE_DMA48 retrying (0 retries left) LBA=441633503
> ad4: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout -
> completing request directly
> ad4: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout -
> completing request directly
> ad4: WARNING - SET_MULTI taskqueue timeout - completing request directly
> ad4: FAILURE - WRITE_DMA48 timed out LBA=441633375
> g_vgs_done():ad4s1f[WRITE(offset=216338284544, length=16384)]error = 5
>
> I could still login to the machine (after an eternity) but got lots of
> read/write errors along the way.  The offset shown in the errors kept
> changing, so I thought it was a hardware eSATA controller issue
> instead of a bad sector on the drive -  I replaced the motherboard,
> but the problem persisted. So I bought a new hard drive and have
> re-installed FreeBSD 7.2 on it. I'd like to plug in the old hard drive
> today, mount it and salvage as much as I can... especially the
> database files, config files, etc.
>
> My question: what kind of checks and/or repair tools should I run on
> the damaged drive after it's mounted? Or should I mount it as
> read-only and start backing it up? I am hoping most of my data is
> still there, but also don't want to damage it further. I desperately
> need to salvage the data, what do the kind people on this list
> recommend?
>
> thanks,
> kelly


If I were you, I'd get a copy of SpinRite (from grc.com) and always keep
it handy.  Be aware that running it can be risky on a drive that is
already failing.  Here's what I'd do:

Buy spinrite, no matter what.

1. Attach the bad drive as a slave and mount it read-only.  Even if the
   FS is dirty, keep it read-only and do not run fsck.
2. Copy whatever data you can (if any).
3. Reboot and run SpinRite on the bad drive at the deepest analysis
   level (4 or 5).  This may take days or weeks; some people report
   months.
4. Re-attach the bad drive to the system as a slave, run fsck, and mount
   it read-only.
5. Compare against what you already copied, and copy any additional
   data you can, if applicable.
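
For the two "copy what you can" passes above, plain cp may stop or spew
errors on the first unreadable file.  Here is a minimal Python sketch of
a copy that just keeps going and remembers what failed.  It is only an
illustration, not a tested recovery tool: the function name
salvage_tree and the paths /mnt/old and /usr/salvage are made up for
the example, and it assumes the old filesystem is already mounted
read-only.

```python
import os
import shutil


def salvage_tree(src_root, dst_root):
    """Copy every readable file under src_root into dst_root.

    Files already present in dst_root with the same size are skipped, so
    a later pass only picks up what an earlier pass missed.
    Returns (copied, failed) as lists of paths relative to src_root.
    """
    copied, failed = [], []

    def note_walk_error(err):
        # os.walk calls this when a directory itself cannot be listed.
        failed.append(os.path.relpath(err.filename, src_root))

    for dirpath, _dirnames, filenames in os.walk(src_root, onerror=note_walk_error):
        rel_dir = os.path.relpath(dirpath, src_root)
        os.makedirs(os.path.normpath(os.path.join(dst_root, rel_dir)), exist_ok=True)
        for name in filenames:
            rel = os.path.normpath(os.path.join(rel_dir, name))
            src = os.path.join(src_root, rel)
            dst = os.path.join(dst_root, rel)
            try:
                if os.path.isfile(dst) and os.path.getsize(dst) == os.path.getsize(src):
                    continue  # already salvaged on an earlier pass
                shutil.copy2(src, dst)
                copied.append(rel)
            except OSError:
                # Read errors from a failing drive surface here as OSError.
                # Any partial copy is left in place; its size will not
                # match, so the next pass tries the file again.
                failed.append(rel)
    return copied, failed


# Hypothetical usage (both paths are assumptions):
#   copied, failed = salvage_tree("/mnt/old", "/usr/salvage")
#   print(len(copied), "copied;", len(failed), "failed")
```

Running the same call again after the SpinRite pass only copies files
that are missing or whose size differs, which covers the "compare and
copy any additional data" step.  The size check is a cheap heuristic; it
does not verify file contents.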

Scrap/destroy the drive if it has sensitive data.  When I destroy
drives, I crack open the drive, remove the platters from the spindle,
break the read-write head ribbon cable, and remove the circuit board.

Each component should be recycled (be a responsible citizen), maybe on
separate trips, to reduce the chance of someone nosy getting into your
stuff.
