Date: Sat, 12 Jun 2004 11:13:49 -0400 (EDT)
From: Robert Watson
To: Kris Kennaway
cc: current@FreeBSD.org
Subject: Re: bg fsck and fs corruption
In-Reply-To: <20040612131142.GB43669@xor.obsecurity.org>
List-Id: Discussions about the use of FreeBSD-current

On Sat, 12 Jun 2004, Kris Kennaway wrote:

> phk's sparc panicked while I was using it for package building, and
> since he had forgotten to disable bg fsck I was reminded again of why I
> turn it off on all my other systems:

Elevated reference counts on directories are a feature/bug I've pointed
out to Kirk previously; he indicated an intent to provide a work-around,
but I believe never provided one.  Basically, this is a property of soft
updates accepting elevated reference counts and counting unallocated
storage as allocated following a crash.
bgfsck should find and clean this up, but it may take a while for bgfsck
to complete its scan to the point where it's reached a particular
directory.  I tend to run into this if I do a build, then rm -Rf the
object tree, and halt the system during the removal.  Occasionally a
sub-directory will be unlinked but the refcount drop on the directory
inode will not have gotten to disk when the system stopped.  bgfsck is
intended to locate this, then drop the reference count.  Kirk had in mind
a couple of work-arounds, such as doing an extra check in the removal as
to whether the directory was empty, and allowing the unlink to succeed if
there were no entries in the directory but the refcount was still
non-zero.

If you allow bgfsck to complete, does it eventually clean this up?

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert@fledge.watson.org      Senior Research Scientist, McAfee Research

>
> twinsun# rm -rf old
> rm: old/26422/usr/local/lib: Directory not empty
> rm: old/26422/usr/local: Directory not empty
> rm: old/26422/usr: Directory not empty
> rm: old/26422/var/tmp/instmp.laCtQf/lib/perl5/5.8.4/mach/auto/threads: Directory not empty
> rm: old/26422/var/tmp/instmp.laCtQf/lib/perl5/5.8.4/mach/auto: Directory not empty
> rm: old/26422/var/tmp/instmp.laCtQf/lib/perl5/5.8.4/mach: Directory not empty
> rm: old/26422/var/tmp/instmp.laCtQf/lib/perl5/5.8.4: Directory not empty
> rm: old/26422/var/tmp/instmp.laCtQf/lib/perl5: Directory not empty
> rm: old/26422/var/tmp/instmp.laCtQf/lib: Directory not empty
> rm: old/26422/var/tmp/instmp.laCtQf: Directory not empty
> rm: old/26422/var/tmp: Directory not empty
> rm: old/26422/var: Directory not empty
> rm: old/26422: Directory not empty
> rm: old: Directory not empty
> twinsun# ls -l old/26422/usr/local/lib
> total 0
>
> bg fsck noticed the usual softdep problems, but did not report or fix
> the corruption:
>
> [...]
> Jun 12 07:38:47 twinsun fsck: /dev/da1c: INCORRECT BLOCK COUNT I=4381849 (4 should be 0) (CORRECTED)
> Jun 12 07:38:47 twinsun fsck: /dev/da1c: INCORRECT BLOCK COUNT I=4381850 (4 should be 0) (CORRECTED)
> Jun 12 07:38:47 twinsun fsck: /dev/da1c: INCORRECT BLOCK COUNT I=4381853 (4 should be 0) (CORRECTED)
> Jun 12 07:38:47 twinsun fsck:
>
> Note the lack of summary line.  I don't know if it was trying to log
> the more serious corruption but didn't because of a bug, or if it just
> didn't detect it.
>
> Kris
>
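
To make the failure mode discussed above concrete, here is a toy
in-memory model, not the UFS or fsck code.  It assumes that removal is
refused whenever a directory's stored link count is above 2, even if the
directory has no entries left.  That assumption would explain "Directory
not empty" alongside an empty ls listing.  All names here (Dir,
rmdir_strict, rmdir_lenient, fix_link_counts, unlink_subdir_then_crash)
are hypothetical and chosen only for illustration.  rmdir_lenient
sketches the kind of work-around attributed to Kirk above, and
fix_link_counts sketches what a checker pass is expected to do.

```python
from dataclasses import dataclass, field


@dataclass
class Dir:
    """Toy directory inode: child directories plus a stored link count."""
    entries: dict = field(default_factory=dict)  # name -> Dir
    nlink: int = 2  # the parent's entry plus "."


def expected_nlink(d):
    # Parent's entry, ".", and one ".." link from each subdirectory.
    return 2 + len(d.entries)


def unlink_subdir_then_crash(parent, name):
    """Remove the entry, but lose the link-count drop (simulated crash)."""
    del parent.entries[name]
    # The matching "parent.nlink -= 1" never reached the disk.


def rmdir_strict(parent, name):
    """Assumed current behaviour: trust the stored link count."""
    d = parent.entries[name]
    if d.nlink != 2 or d.entries:
        raise OSError("Directory not empty")
    del parent.entries[name]
    parent.nlink -= 1


def rmdir_lenient(parent, name):
    """Hypothetical work-around: only the actual entries matter."""
    d = parent.entries[name]
    if d.entries:
        raise OSError("Directory not empty")
    del parent.entries[name]
    parent.nlink -= 1


def fix_link_counts(d):
    """Checker-style pass: recompute counts, return how many were fixed."""
    fixed = sum(fix_link_counts(child) for child in d.entries.values())
    want = expected_nlink(d)
    if d.nlink != want:
        d.nlink = want
        fixed += 1
    return fixed
```

In this model, a directory whose last subdirectory was removed by
unlink_subdir_then_crash is rejected by rmdir_strict until
fix_link_counts has visited it, while rmdir_lenient removes it straight
away.  Whether the real kernel check and the real bgfsck pass behave this
way is exactly the open question in this thread.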