From owner-freebsd-questions@FreeBSD.ORG  Fri Apr  9 09:42:40 2004
Return-Path: <owner-freebsd-questions@FreeBSD.ORG>
Delivered-To: freebsd-questions@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 4D20F16A4CE
	for <freebsd-questions@freebsd.org>;
	Fri,  9 Apr 2004 09:42:40 -0700 (PDT)
Received: from mail.datausa.com (mail.datausa.com [216.150.220.134])
	by mx1.FreeBSD.org (Postfix) with SMTP id DD45743D54
	for <freebsd-questions@freebsd.org>;
	Fri,  9 Apr 2004 09:42:39 -0700 (PDT)
	(envelope-from freebsd@wcubed.net)
Received: (qmail 39922 invoked from network); 9 Apr 2004 16:31:15 -0000
Received: from c-24-9-172-8.client.comcast.net (HELO wcubed.net) (@24.9.172.8)
  by mail.datausa.com with SMTP; 9 Apr 2004 16:31:15 -0000
Message-ID: <4076D2D4.2030509@wcubed.net>
Date: Fri, 09 Apr 2004 10:44:04 -0600
From: Brad Waite <freebsd@wcubed.net>
User-Agent: Mozilla Thunderbird 0.5 (Windows/20040207)
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: freebsd-questions@freebsd.org
References: <3283.24.9.172.8.1080516356.squirrel@webmail.datausa.com>
In-Reply-To: <3283.24.9.172.8.1080516356.squirrel@webmail.datausa.com>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: hard disk recover
X-BeenThere: freebsd-questions@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: User questions <freebsd-questions.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-questions>,
	<mailto:freebsd-questions-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-questions>
List-Post: <mailto:freebsd-questions@freebsd.org>
List-Help: <mailto:freebsd-questions-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-questions>,
	<mailto:freebsd-questions-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 09 Apr 2004 16:42:40 -0000

freebsd@wcubed.net wrote:
> I'm getting the dreaded "ad1s1a: hard error reading fsbn 524543 of 96-127
> (ad1s1 bn 524543; cn 520 tn 6 sn 5) status=59 error=40" errors.  Based on
> what I've read, it means my drive's going bye-bye.  As it is, it won't
> even boot - fortunately I have another FBSD drive to boot from, and I get
> these errors while trying to fsck it.  Shame on me for not noticing the
> errors sooner and an even bigger shame for not having a proper backup.
> 
> In any case, the milk is spilled and I need to mop it up as best I can. 
> While I can mount the partition, I can't cd to it (more "hard errors..."),
> and since fsck isn't apparently helping, what can I do to recover what's
> left?  I'm thinking dd's the tool to use, but I'm not really sure how to
> go about it.  Here's what I get when I try to read from the beginning of
> the partition:
> 
> # dd if=/dev/ad1s1a bs=64k
> dd: /dev/ad1s1a: Input/output error
> 
> However, when I add "skip=1", the drive spits back data.  That leads me to
> believe that if I skip over the bad sectors, I can read what's left.
> 
> I've got a spare drive I can use as a sandbox, but how should I dump the
> data?  Should I label the second drive with the same partition size and
> "dd if=/dev/ad1s1a of=/dev/ad2s1a"?  Is there any chance of recovering
> filesystem data going this route?

[Quoting myself as it's been 2 weeks since the first post]

Here's what's new:

ad0: 21557MB <IBM-DJNA-372200> [43800/16/63] at ata0-master UDMA66
ad1: 39083MB <Maxtor 5T040H4> [79408/16/63] at ata0-slave UDMA100
ad2: 29311MB <Maxtor 5T030H3> [59554/16/63] at ata1-master UDMA100

ad2 is the 30GB drive reporting errors; ad1 is the new 40GB drive I 
copied the partition to.

I tried to fdisk the 40G to be identical to the 30G, but I could never 
get the size to match exactly.  In the end, I just set up the 256M swap 
and hoped the 524288 offset for the 'a' partition would work.  Here's 
the relevant disklabel output:

# disklabel -r /dev/ad1s1
# /dev/ad1s1:
type: ESDI
disk: ad0s1
label:
flags:
bytes/sector: 512
sectors/track: 63
tracks/cylinder: 255
sectors/cylinder: 16065
cylinders: 4981
sectors/unit: 80035767
[...]
8 partitions:
#        size offset   fstype   [fsize bsize bps/cpg]
   a: 79511479 524288   4.2BSD     2048 16384    89  # (Cyl. 32*- 4981*)
   b:   524288      0     swap                       # (Cyl.  0 - 32*)
   c: 80035767      0   unused        0     0        # (Cyl.  0 - 4981*)

# disklabel -r /dev/ad2s1
# /dev/ad2s1:
type: ESDI
disk: ad0s1
label:
flags:
bytes/sector: 512
sectors/track: 63
tracks/cylinder: 16
sectors/cylinder: 1008
cylinders: 59553
sectors/unit: 60030369
[...]
8 partitions:
#        size offset    fstype  [fsize bsize bps/cpg]
   a: 59506081 524288    4.2BSD   2048 16384    16  # (Cyl. 520*- 59553*)
   b:   524288      0      swap                     # (Cyl.   0 - 520*)
   c: 60030369      0    unused      0     0        # (Cyl.   0 - 59553*)
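
A quick sanity check of the two labels, as a rough Python sketch for 
illustration only (the dictionary layout and the function name are made 
up here; the numbers are copied from the disklabel output above). It 
checks that 'b' ends where 'a' starts, that 'a' ends where 'c' ends, and 
how much bigger the new 'a' partition is than the old one:

```python
SECTOR = 512  # bytes/sector, as reported by both labels

# (size, offset) in sectors, copied from the disklabel output above
labels = {
    "ad1s1": {"a": (79511479, 524288), "b": (524288, 0), "c": (80035767, 0)},
    "ad2s1": {"a": (59506081, 524288), "b": (524288, 0), "c": (60030369, 0)},
}


def layout_is_consistent(parts):
    """True if 'a' starts where 'b' ends and 'a' ends where 'c' ends."""
    a_size, a_off = parts["a"]
    b_size, b_off = parts["b"]
    c_size, c_off = parts["c"]
    return b_off + b_size == a_off and a_off + a_size == c_off + c_size


for name, parts in labels.items():
    print(name, "layout consistent:", layout_is_consistent(parts))

# Swap size in bytes (same on both disks) and the size difference of 'a'
swap_bytes = labels["ad1s1"]["b"][0] * SECTOR
extra_sectors = labels["ad1s1"]["a"][0] - labels["ad2s1"]["a"][0]
print("swap bytes:", swap_bytes)
print("extra sectors on new 'a':", extra_sectors)
print("extra bytes on new 'a':", extra_sectors * SECTOR)
```

Both 'a' partitions start at the same offset, and the new one is 
20005398 sectors larger, so the old partition fits inside it with room 
to spare.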

I used lewiz' suggestion to add 'conv=noerror,sync' to dd, and was able 
to copy the readable data from the bad drive to the new one.  I also 
changed bs=64k to bs=512b (redundant, I know): with 64k blocks, I 
figured a single bad 512-byte block would make dd skip the whole 64k.  
Here's what I used:

dd if=/dev/ad2s1a of=/dev/ad1s1a conv=noerror,sync bs=512b
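
One thing worth double-checking about the bs= value: going by dd(1), a 
trailing 'b' on a size multiplies the number by 512, so bs=512b would 
mean 512 * 512 = 262144 bytes (256k) per block rather than 512 bytes. 
The Python sketch below is only an illustration of that suffix 
arithmetic; dd_size() is a made-up helper, and it handles only the b, k, 
m and g suffixes, not everything dd accepts:

```python
# Multipliers for the dd size suffixes handled by this sketch.
# Other suffixes and the 'x' product syntax are deliberately left out.
SUFFIXES = {"b": 512, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def dd_size(text):
    """Convert a dd-style size such as '64k' or '512b' to a byte count."""
    text = text.strip().lower()
    if text and text[-1] in SUFFIXES:
        return int(text[:-1]) * SUFFIXES[text[-1]]
    return int(text)  # plain number: already in bytes


for arg in ("512", "512b", "64k"):
    print("bs=%s -> %d bytes" % (arg, dd_size(arg)))
```

If that reading is right, plain bs=512 would be the way to get 
sector-sized reads.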

Of course, I got about 165 "ad2s1a: hard error reading fsbn ..." errors, 
but it appeared to copy everything else okay.  The first 16 blocks of 
ad2s1a are null, but there are 16 blocks of data at block 32, so it 
appears the first backup superblock survived.
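
To make the effect of 'conv=noerror,sync' on the copy concrete, here is 
a simplified Python model of it. This is a sketch under assumptions, not 
dd's actual implementation: FlakyDisk, copy_noerror_sync() and 
lost_sectors() are hypothetical names, and the "disk" is just a byte 
string with one sector marked unreadable. Any block that fails to read 
is written out as NULs and the copy carries on, so the amount of 
zero-filled data depends on the block size:

```python
SECTOR = 512


class FlakyDisk:
    """Hypothetical stand-in for a failing drive: a read that touches
    any bad sector fails with an I/O error."""

    def __init__(self, data, bad_sectors):
        self.data = data
        self.bad = set(bad_sectors)

    def read(self, offset, length):
        end = min(offset + length, len(self.data))
        first, last = offset // SECTOR, (end - 1) // SECTOR
        if any(s in self.bad for s in range(first, last + 1)):
            raise OSError(5, "Input/output error")
        return self.data[offset:end]


def copy_noerror_sync(disk, bs):
    """Copy in bs-sized blocks; failed or short blocks are NUL-padded."""
    out = bytearray()
    errors = 0
    for offset in range(0, len(disk.data), bs):
        try:
            block = disk.read(offset, bs)
        except OSError:
            block = b""  # 'noerror': keep going after a failed read
            errors += 1
        out += block.ljust(bs, b"\0")  # 'sync': pad to a full block
    return bytes(out), errors


def lost_sectors(image, original):
    """Count sectors of the copy that differ from the original data."""
    return sum(
        1
        for i in range(0, len(original), SECTOR)
        if image[i:i + SECTOR] != original[i:i + SECTOR]
    )


# Eight sectors of non-zero data, with sector 2 unreadable
data = b"".join(bytes([n + 1]) * SECTOR for n in range(8))
for bs in (512, 2048):
    image, errors = copy_noerror_sync(FlakyDisk(data, [2]), bs)
    print("bs=%d: %d read error(s), %d sector(s) zeroed"
          % (bs, errors, lost_sectors(image, data)))
```

In this model a single bad sector costs one sector with 512-byte blocks 
but four sectors with 2048-byte blocks, which is the reason to keep the 
block size small on the damaged drive.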

Is there even a remote chance that I'll be able to fsck this fs and 
recover?  I know that fsck will complain about the first alternate 
superblock not matching, because the last superblock won't be within the 
first 30GB.  Do the differently sized partitions make this impossible?