From owner-freebsd-questions@FreeBSD.ORG  Mon Aug 24 20:25:29 2009
Return-Path: <owner-freebsd-questions@FreeBSD.ORG>
Delivered-To: freebsd-questions@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 41724106568F
	for <freebsd-questions@freebsd.org>;
	Mon, 24 Aug 2009 20:25:29 +0000 (UTC)
	(envelope-from freebsd-questions-local@be-well.ilk.org)
Received: from mail3.sea5.speakeasy.net (mail3.sea5.speakeasy.net
	[69.17.117.5]) by mx1.freebsd.org (Postfix) with ESMTP id 1B13D8FC0A
	for <freebsd-questions@freebsd.org>;
	Mon, 24 Aug 2009 20:25:28 +0000 (UTC)
Received: (qmail 10177 invoked from network); 24 Aug 2009 20:25:28 -0000
Received: from dsl092-078-145.bos1.dsl.speakeasy.net (HELO be-well.ilk.org)
	([66.92.78.145])
	(envelope-sender <freebsd-questions-local@be-well.ilk.org>)
	by mail3.sea5.speakeasy.net (qmail-ldap-1.03) with SMTP
	for <freebsd-questions@freebsd.org>; 24 Aug 2009 20:25:28 -0000
Received: by be-well.ilk.org (Postfix, from userid 1147)
	id B2A1F5082F; Mon, 24 Aug 2009 16:25:26 -0400 (EDT)
To: Kelly Martin <kellymartin@gmail.com>
References: <1338880b0908241129p75b6845cg26d21804e118364@mail.gmail.com>
From: Lowell Gilbert <freebsd-questions-local@be-well.ilk.org>
Date: Mon, 24 Aug 2009 16:25:26 -0400
In-Reply-To: <1338880b0908241129p75b6845cg26d21804e118364@mail.gmail.com>
	(Kelly Martin's message of "Mon\, 24 Aug 2009 12\:29\:19 -0600")
Message-ID: <44y6p9q7rd.fsf@be-well.ilk.org>
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.3 (berkeley-unix)
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: FreeBSD Questions <freebsd-questions@freebsd.org>
Subject: Re: hard disk failure - now what?
X-BeenThere: freebsd-questions@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: FreeBSD Questions <freebsd-questions@freebsd.org>
List-Id: User questions <freebsd-questions.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-questions>, 
	<mailto:freebsd-questions-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-questions>
List-Post: <mailto:freebsd-questions@freebsd.org>
List-Help: <mailto:freebsd-questions-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-questions>, 
	<mailto:freebsd-questions-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 24 Aug 2009 20:25:29 -0000

Kelly Martin <kellymartin@gmail.com> writes:

> I just experienced a hard drive failure on one of my FreeBSD 7.2
> production servers with no backup! I am so mad at myself for not
> backing up!! Now it's a salvage operation. Here are the type of errors
> I was getting on the console, over-and-over:
>
> ad4: TIMEOUT - WRITE_DMA48 retrying (0 retries left) LBA=441633503
> ad4: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout -
> completing request directly
> ad4: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout -
> completing request directly
> ad4: WARNING - SET_MULTI taskqueue timeout - completing request directly
> ad4: FAILURE - WRITE_DMA48 timed out LBA=441633375
> g_vfs_done():ad4s1f[WRITE(offset=216338284544, length=16384)]error = 5
>
> I could still login to the machine (after an eternity) but got lots of
> read/write errors along the way.  The offset shown in the errors kept
> changing, so I thought it was a hardware eSATA controller issue
> instead of a bad sector on the drive -  I replaced the motherboard,
> but the problem persisted. So I bought a new hard drive and have
> re-installed FreeBSD 7.2 on it. I'd like to plug in the old hard drive
> today, mount it and salvage as much as I can... especially the
> database files, config files, etc.
>
> My question: what kind of checks and/or repair tools should I run on
> the damaged drive after it's mounted? Or should I mount it as
> read-only and start backing it up? I am hoping most of my data is
> still there, but also don't want to damage it further. I desperately
> need to salvage the data, what do the kind people on this list
> recommend?

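First, try copying the entire disk *without* mounting it: use dd(1) to
image the whole device (ad4, going by your logs) onto the new drive, and
do the salvage work on the copy.  I believe that "conv=noerror" will be
necessary so that dd keeps going after a read error, probably together
with "sync" so that unreadable blocks are padded with zeros and
everything after them stays at the right offset.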
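
Something along these lines is what I have in mind.  Treat it as a rough
sketch rather than a tested recipe: the device name /dev/ad4 is taken from
your console messages, while the destination path, the 64k block size and
the image_disk helper name are only placeholders I made up for
illustration.

```shell
# Assumed names: adjust both before running anything.
SRC="${SRC:-/dev/ad4}"                 # failing disk, left unmounted
DST="${DST:-/mnt/rescue/ad4.img}"      # hypothetical image file on the new disk

# Copy $1 to $2 block by block.
#   noerror: carry on after a read error instead of stopping
#   sync:    pad short or failed reads with zeros up to the block size,
#            so later data keeps its original offset in the image
image_disk() {
    dd if="$1" of="$2" bs=64k conv=noerror,sync
}

# Uncomment once SRC and DST are right for your system:
# image_disk "$SRC" "$DST"
```

A smaller block size loses less data around each bad spot but makes the
copy slower.  Because of "sync", the final block is padded too, so the
image can come out slightly larger than the disk.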

-- 
Lowell Gilbert, embedded/networking software engineer, Boston area
		http://be-well.ilk.org/~lowell/