Date: Thu, 11 Nov 2004 09:54:52 -0800
From: Matt Staroscik
To: freebsd-questions@freebsd.org
Cc: Jerry McAllister
Subject: Bad blocks & 3ware RAID rebuild -- resolution

I'm answering myself since (with the help of the list) I got my problem solved. I hope this summary helps a future searcher. Thanks to Jerry and everyone else who posted.

Summary of problem:

My RAID mirror (3ware 7000-2) lost a drive, and it was failing to rebuild with a new replacement disk (under FreeBSD 4.10). I was not sure why the rebuild failed, due to a vague "disk error" message.

At that point I dumped the filesystems to a spare drive, but dump reported a read error on /usr. I fsck'd, but this did not eliminate the problem. Great, my degraded RAID's one "good" disk is having trouble!

It was pointed out to me that fsck would not fix bad blocks.
I had to try something else to repair the disk. I located a corrupt file by tarring up /usr and waiting for trouble... A forgotten core dump showed a read error. I deleted it, and a subsequent dump of /usr did not report errors. Progress!

However, the RAID would still not rebuild. It turns out that an IDE drive will not remap a bad sector when it is READ... it must be WRITTEN. So, after I deleted the file that (presumably) sat on the bad sectors, I filled up /usr with misc files to force writes to the freed blocks. Afterwards, I checked the disk's SMART error log and, sure enough, it showed some new bad sectors had been remapped. (I checked disk 0 on the RAID with smartctl -a -d 3ware,0 /dev/twed0)

I deleted the temporary files and initiated a RAID rebuild in the 3dm web interface. Before the sector repair, it would die after just a couple of minutes, but after the sector repair the rebuild completed successfully.

SMART still reports that the drive is overall healthy... I am not sure how many spare sectors are left, but I have a working mirror again, so I have bought myself some time.

Thanks again.

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
matt@wrongcrowd.com * KF6IYW * http://wrongcrowd.com
"I am Matt Staroscik and I approved this message."
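
For a future searcher: the "tar it up and wait for trouble" step can also be done by reading every file and noting which reads fail. The following is only a minimal sketch in Python, not what was run in the post. The names find_unreadable and read_fully, and the reader hook, are made up for illustration, and it assumes the bad sector shows up as an ordinary I/O error when the file is read.

```python
import os
import stat

CHUNK = 1024 * 1024  # read in 1 MiB pieces so large files are not held in memory


def read_fully(path):
    """Read a file from start to end and discard the data."""
    with open(path, "rb") as fh:
        while fh.read(CHUNK):
            pass


def find_unreadable(root, reader=read_fully):
    """Return (path, error message) pairs for regular files that fail to read.

    `reader` is a hypothetical hook so the scan can be tried out
    without a failing disk; by default every file is read in full.
    """
    failures = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat does not follow symlinks; skip anything that is
                # not a regular file (symlinks, FIFOs, device nodes)
                if not stat.S_ISREG(os.lstat(path).st_mode):
                    continue
                reader(path)
            except OSError as exc:
                # a bad sector normally surfaces here as an I/O error
                failures.append((path, str(exc)))
    return failures
```

Calling find_unreadable("/usr") would return a list of files that could not be read in full. Note that os.walk descends into other filesystems mounted below the starting directory, and that the sketch only finds the files; deleting them and rewriting the freed space, as described above, is still a manual step.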