From owner-freebsd-questions@FreeBSD.ORG  Thu Nov 11 17:56:50 2004
Return-Path: <owner-freebsd-questions@FreeBSD.ORG>
Delivered-To: freebsd-questions@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 9379716A4CE
	for <freebsd-questions@freebsd.org>;
	Thu, 11 Nov 2004 17:56:50 +0000 (GMT)
Received: from wrongcrowd.com (dsl231-043-085.sea1.dsl.speakeasy.net
	[216.231.43.85])
	by mx1.FreeBSD.org (Postfix) with ESMTP id DD11043D2F
	for <freebsd-questions@freebsd.org>;
	Thu, 11 Nov 2004 17:56:49 +0000 (GMT)
	(envelope-from matt@wrongcrowd.com)
Received: from [192.168.1.95] (port=1726 helo=tbird.wrongcrowd.com)
	by wrongcrowd.com with esmtp (Exim 4.34 (FreeBSD))
	id 1CSJAx-0004Ma-0U; Thu, 11 Nov 2004 09:56:04 -0800
Message-Id: <6.1.2.0.2.20041110235120.0b1a5e20@mail.speakeasy.net>
X-Mailer: QUALCOMM Windows Eudora Version 6.1.2.0
Date: Thu, 11 Nov 2004 09:54:52 -0800
To: freebsd-questions@freebsd.org
From: Matt Staroscik <matt@wrongcrowd.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
X-Spam-Score: 0.0 (/)
cc: Jerry McAllister <jerrymc@clunix.cl.msu.edu>
Subject: Bad blocks & 3ware RAID rebuild -- resolution
X-BeenThere: freebsd-questions@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: User questions <freebsd-questions.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-questions>,
	<mailto:freebsd-questions-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-questions>
List-Post: <mailto:freebsd-questions@freebsd.org>
List-Help: <mailto:freebsd-questions-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-questions>,
	<mailto:freebsd-questions-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 11 Nov 2004 17:56:50 -0000

I'm answering myself since (with the help of the list) I got my problem 
solved. I hope this summary helps a future searcher. Thanks to Jerry and 
everyone else who posted.

Summary of problem: My RAID mirror (3ware 7000-2) lost a drive, and it was 
failing to rebuild onto a replacement disk (under FreeBSD 4.10). I could 
not tell why the rebuild failed, because the only message was a vague 
"disk error".

At that point I dumped the filesystems to a spare drive, but dump reported 
a read error on /usr. I ran fsck, but that did not eliminate the problem. 
Great, the one "good" disk in my degraded RAID was having trouble too!

It was pointed out to me that fsck would not fix bad blocks. I had to try 
something else to repair the disk. I located a corrupt file by tarring up 
/usr and waiting for trouble... A forgotten core dump showed a read error. 
I deleted it, and a subsequent dump of /usr did not report errors. 
Progress! However, the RAID would still not rebuild.
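
A rough sketch of the "read everything and wait for trouble" step, for 
anyone who wants to do it without building a tar archive. The function 
name find_unreadable is hypothetical (it is not from the original post), 
and the sketch assumes file names without embedded newlines. It reads each 
regular file to the end and prints the ones where the read fails:

```shell
# Hypothetical helper (not from the original post): read every regular
# file under a directory and print the ones that cannot be read in full.
find_unreadable() {
    dir="$1"
    # -xdev keeps find on the filesystem that contains "$dir".
    find "$dir" -xdev -type f -print | while IFS= read -r f; do
        # cat has to read the file's data; a failed read gives a
        # non-zero exit status, and we report that file's path.
        if ! cat -- "$f" > /dev/null 2>&1; then
            printf '%s\n' "$f"
        fi
    done
}
```

Something like "find_unreadable /usr" would then list candidate files. A 
permission problem also counts as a failed read here, so running it as 
root gives a cleaner list.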

It turns out that an IDE drive will not remap an unreadable sector when it 
is READ... the sector must be WRITTEN. So, after I deleted the file that 
(presumably) sat on the bad sectors, I filled up /usr with miscellaneous 
files so that the freed blocks would be written again. Afterwards, I 
checked the disk's SMART error log and, sure enough, it showed that some 
new bad sectors had been remapped. I checked disk 0 on the RAID with:
    smartctl -a -d 3ware,0 /dev/twed0
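
The fill step can be sketched as below. This is only an illustration: the 
function name fill_with_zeros, the scratch file name, and the choice of 
zeros instead of miscellaneous files are assumptions, not what was 
actually run. The block count caps how much is written, so pick one large 
enough to cover the free space on the filesystem:

```shell
# Hypothetical helper (not from the original post): write zeros into a
# scratch file so that free blocks on the filesystem get rewritten.
fill_with_zeros() {
    target="$1"   # scratch file to create, e.g. /usr/fill.tmp
    blocks="$2"   # number of 1 MiB blocks to write
    # bs is given in bytes so the same command works with BSD and GNU dd.
    dd if=/dev/zero of="$target" bs=1048576 count="$blocks" 2>/dev/null
    # Push the buffered writes out to the disk.
    sync
}
```

For example, "fill_with_zeros /usr/fill.tmp 4096" would try to write 4 GB. 
dd stops with an error if the filesystem fills up first, which is fine for 
this purpose. Remove the scratch file afterwards and re-run the smartctl 
command to see whether any sectors were remapped.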

I deleted the temporary files and initiated a RAID rebuild in the 3dm web 
interface. Before the sector repair, it would die after just a couple of 
minutes, but after the sector repair the rebuild completed successfully.

SMART still reports that the drive is healthy overall... I am not sure how 
many spare sectors are left, but I have a working mirror again, so I have 
bought myself some time.
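
SMART does not report the number of spare sectors directly, but the 
Reallocated_Sector_Ct attribute (ID 5) gives a rough idea of how the drive 
is doing. The sketch below is an assumption-laden example: the function 
name realloc_summary is made up, and it assumes the ten-column attribute 
table that current smartmontools prints for "smartctl -A" (the version on 
a 4.10 box may differ). It reads that table on stdin and prints the 
normalized value, the failure threshold, and the raw count:

```shell
# Hypothetical helper (not from the original post): summarize SMART
# attribute 5 from a "smartctl -A" attribute table read on stdin.
# Assumed columns: ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED
#                  WHEN_FAILED RAW_VALUE
realloc_summary() {
    awk '$1 == 5 && $2 == "Reallocated_Sector_Ct" {
        # $4 = normalized value, $6 = threshold, $10 = raw count
        printf "value=%d thresh=%d raw=%d\n", $4, $6, $10
    }'
}
```

It could be fed with "smartctl -A -d 3ware,0 /dev/twed0 | realloc_summary". 
A raw count that keeps growing, or a normalized value approaching the 
threshold, would be a sign that the borrowed time is running out.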

Thanks again.
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
matt@wrongcrowd.com * KF6IYW * http://wrongcrowd.com
"I am Matt Staroscik and I approved this message."