From: Tim Judd
To: Kelly Martin
Cc: FreeBSD Questions
Date: Mon, 24 Aug 2009 14:13:22 -0600
Subject: Re: hard disk failure - now what?
In-Reply-To: <1338880b0908241129p75b6845cg26d21804e118364@mail.gmail.com>

On 8/24/09, Kelly Martin wrote:
> I just experienced a hard drive failure on one of my FreeBSD 7.2
> production servers with no backup! I am so mad at myself for not
> backing up!! Now it's a salvage operation. Here are the type of errors
> I was getting on the console, over-and-over:
>
> ad4: TIMEOUT - WRITE_DMA48 retrying (0 retries left) LBA=441633503
> ad4: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout -
> completing request directly
> ad4: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout -
> completing request directly
> ad4: WARNING - SET_MULTI taskqueue timeout - completing request directly
> ad4: FAILURE - WRITE_DMA48 timed out LBA=441633375
> g_vgs_done():ad4s1f[WRITE(offset=216338284544, length=16384)]error = 5
>
> I could still login to the machine (after an eternity) but got lots of
> read/write errors along the way. The offset shown in the errors kept
> changing, so I thought it was a hardware eSATA controller issue
> instead of a bad sector on the drive - I replaced the motherboard,
> but the problem persisted. So I bought a new hard drive and have
> re-installed FreeBSD 7.2 on it. I'd like to plug in the old hard drive
> today, mount it and salvage as much as I can... especially the
> database files, config files, etc.
>
> My question: what kind of checks and/or repair tools should I run on
> the damaged drive after it's mounted? Or should I mount it as
> read-only and start backing it up? I am hoping most of my data is
> still there, but also don't want to damage it further. I desperately
> need to salvage the data, what do the kind people on this list
> recommend?
>
> thanks,
> kelly

If I were you, I'd get a copy of SpinRite (from grc.com) and always keep it handy.
It can be risky on a drive that is already failing.

Here's what I'd do:

1. Buy SpinRite, no matter what.
2. Slave the bad drive and mount it read-only. Even if the FS is dirty, mount it read-only, with no fsck.
3. Copy whatever data you can (if any).
4. Reboot and run SpinRite on the bad drive at the deepest analysis (level 4 or 5). This may take days or weeks; there are even reports of months.
5. Re-slave the bad drive to the system, fsck it, and mount it read-only.
6. Compare and copy any additional data you can, if applicable.
7. Scrap/destroy the drive if it has sensitive data.

When I destroy drives, I crack open the drive, dismantle the platters from the spindle, break the read-write head ribbon cable, and remove the circuit board. Each component should be recycled (being the responsible citizen), maybe on separate runs, to remove the possibility of someone nosy getting into your stuff.
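
For the two copy passes (steps 3 and 6 in the procedure above), something along these lines might help. It is only a rough Python sketch, not a tested recovery tool. It assumes the old filesystem is already mounted read-only, and the function name `salvage_tree` and the paths in the usage note are made up for illustration. It copies every file it can read, records the ones that fail (for example with the `error = 5` I/O errors quoted above) instead of aborting, and leaves alone anything already salvaged, so rerunning it after the second mount only picks up additional files.

```python
import os
import shutil


def salvage_tree(src_root, dst_root):
    """Copy every readable file from src_root into dst_root.

    Files that already exist under dst_root are skipped, so a later
    pass only attempts what could not be read before.
    Returns (copied, failed): lists of paths relative to src_root.
    """
    copied, failed = [], []

    def note_dir_error(err):
        # os.walk calls this when a directory cannot be listed.
        failed.append(os.path.relpath(err.filename, src_root))

    for dirpath, _dirs, filenames in os.walk(src_root, onerror=note_dir_error):
        rel_dir = os.path.relpath(dirpath, src_root)
        target_dir = os.path.normpath(os.path.join(dst_root, rel_dir))
        os.makedirs(target_dir, exist_ok=True)

        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(target_dir, name)
            rel = os.path.normpath(os.path.join(rel_dir, name))

            if os.path.exists(dst):
                continue  # already salvaged on an earlier pass

            # Copy to a temporary name first so that a read error halfway
            # through never leaves a truncated file under the final name.
            tmp = dst + ".partial"
            try:
                shutil.copy2(src, tmp)
                os.replace(tmp, dst)
                copied.append(rel)
            except OSError:
                if os.path.exists(tmp):
                    os.remove(tmp)
                failed.append(rel)

    return copied, failed
```

Usage would be something like `copied, failed = salvage_tree("/mnt/old", "/usr/salvage")`, where both paths are hypothetical. The `failed` list from the first pass tells you which files to look for again after the drive has been through the deeper analysis.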