From: Nick Barnes
Sender: nb@ravenbrook.com
To: Chuck Swiger
Cc: freebsd-questions@freebsd.org
Date: Fri, 15 Jul 2005 23:39:38 +0100
Subject: identifying filesystem blocks (was Re: better disk reliability on a desktop machine)
Message-ID: <42570.1121467178@thrush.ravenbrook.com>
In-Reply-To: <42D7EBDE.8030807@mac.com> from Chuck Swiger of "Fri, 15 Jul 2005 13:01:18 -0400"

At 2005-07-15 17:01:18+0000, Chuck Swiger writes:
> Nick Barnes wrote:
> [ ... ]
> > I don't want to have to do all that ever again, after this iteration.
>
> You've had a learning experience, I see.  :-)

Yeah, and I've had them before, and this time enough is enough.

On a related subject, the last time I lost a disk, or maybe the time
before, I asked on one of these lists whether there is a tool which
will identify the files (or inodes, or other filesystem metadata)
which are affected by one or more bad blocks.  At the time I was told
that there is no such tool, and started to write my own.  Maybe this
time around I'll finish the tool and distribute it.

Semi-automated binary-chop use of dd tells me that the following
blocks in my filesystem are broken:

    65255940, 65255941, 65255942, 65255943, 65255944, 65255954,
    65255965, 65256256, 65257133, 65257134, 65257514, 66713152,
    66713158, 66713164, 66713536, 66713537, 66714306, 66714308,
    66715648, 66715650

but without a suitable tool this information is useless.

Incidentally, two weeks ago I recovered a broken filesystem on a 4.10
server machine by dd'ing the working sectors (i.e. all but 2) onto a
freshly newfs'ed partition.  The broken filesystem wouldn't fsck at
all: some metadata was lost to a bad sector and fsck borked out in
phase 2.  But after the dd's (i.e. with those bad sectors replaced
with metadata fresh from newfs), fsck told me that the recovered
filesystem was fine.  As it happens, the filesystem was the repository
for an SCM system (Perforce) with internal checksums: after recovery
we checked those out and they all passed.

One interesting aspect of that war story is that I got one of the dd
commands wrong the first time, and tried to fsck a filesystem which
was partly Just Plain Missing.  The whole system went down: network
connections dropped and completely unresponsive at console, including
ctrl-C, ctrl-T, alt-Fn, and ctrl-alt-del.  It seems to me that fsck
shouldn't be able to do that....

Nick Barnes
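
For illustration only, the binary-chop search for unreadable blocks
mentioned above might be sketched in Python roughly as follows.  This
is not the tool described in the message, which was driven by dd; it
is a minimal sketch under stated assumptions.  The 512-byte block size
is a guess (the message does not say what block size was used), and
the names make_device_reader, find_bad_blocks and group_runs are
hypothetical.  It only locates bad blocks; it does not map them to
files or inodes, which is the part the message says is still missing.

```python
import os
from typing import Callable, List, Tuple

# Assumption: 512-byte blocks. The message does not state the block size.
BLOCK_SIZE = 512


def make_device_reader(path: str,
                       block_size: int = BLOCK_SIZE) -> Callable[[int, int], bool]:
    """Return a predicate telling whether blocks [start, start+count) read cleanly."""
    def readable(start: int, count: int) -> bool:
        fd = os.open(path, os.O_RDONLY)
        try:
            os.lseek(fd, start * block_size, os.SEEK_SET)
            remaining = count * block_size
            while remaining > 0:
                # Read in chunks of at most 1 MiB (a multiple of the block size).
                chunk = os.read(fd, min(remaining, 1 << 20))
                if not chunk:
                    return False  # ran off the end of the device
                remaining -= len(chunk)
            return True
        except OSError:
            return False  # an I/O error somewhere in the range
        finally:
            os.close(fd)
    return readable


def find_bad_blocks(readable: Callable[[int, int], bool],
                    start: int, count: int) -> List[int]:
    """Binary chop: split any range that fails to read until single blocks remain."""
    if count == 0 or readable(start, count):
        return []
    if count == 1:
        return [start]
    half = count // 2
    return (find_bad_blocks(readable, start, half)
            + find_bad_blocks(readable, start + half, count - half))


def group_runs(blocks: List[int]) -> List[Tuple[int, int]]:
    """Collapse block numbers into inclusive (first, last) runs of consecutive blocks."""
    runs: List[Tuple[int, int]] = []
    for b in sorted(blocks):
        if runs and b == runs[-1][1] + 1:
            runs[-1] = (runs[-1][0], b)
        else:
            runs.append((b, b))
    return runs
```

To use it, one would build a reader with make_device_reader on the
partition's device node and pass it to find_bad_blocks along with the
first block and the number of blocks to scan.  The read test is kept
as a separate function so that the search can be tried against a fake
reader before it is pointed at a failing disk.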