From: Robert Watson
Date: Tue, 6 Sep 2005 11:51:02 +0100 (BST)
To: Mikhail Teterin
Cc: questions@freebsd.org, fs@freebsd.org
Subject: Re: Strange case of filesystem corruption?
Message-ID: <20050906114055.R51625@fledge.watson.org>
In-Reply-To: <200509051953.22337@aldan>
References: <200509051953.22337@aldan>

On Mon, 5 Sep 2005, Mikhail Teterin wrote:

> Can this be explained by anything other than a (nasty) bug?
>
>	% ls -la audio/shorten/files
>	total 0
>	% rmdir audio/shorten/files
>	rmdir: audio/shorten/files: Directory not empty
>
> This is on 5.4-stable from July 21 -- up ever since... Thanks!

Mikhail,

Have you recently experienced a system crash or hard reboot without proper
shutdown? I know of at least one case prior to 6.x where this can occur --
a bug I reported to Kirk relating to bgfsck.

Basically, soft updates guarantee that upon reboot after a failure (power
or otherwise), the on-disk layout of UFS meta-data will be consistent,
except with respect to reference and link counts, which may be elevated.
What bgfsck does is walk the file system to identify and correct the
elevated counts. Here's a specific example of such a problem:

    Directory A is created
        Write A directory data (., ..)
        Write A directory inode (new inode)
        Write A parent inode (link count++)
        Write A parent data (add name)

    Directory B is created
        Write B directory data (., ..)
        Write B directory inode (new inode)
        Write A directory inode (link count++)
    XXX
        Write A directory data (add name)

    Directory B is removed
        Write A directory data (remove name)
    XXX
        Write A directory inode (link count--)

Note that if the sequence of events is halted at either of the XXX's
above, the inode link count on directory A will be elevated, even though
the name for B is not present in A (it was either never written or has
already been removed). Background fsck comes along later, notices that the
counts are elevated, and drops them. However, until ufs_vnops.c:1.269,
this caused a problem: because the link count was elevated, UFS assumed
that the directory contained a reference to another directory, and would
not let it be removed. Once bgfsck catches up with the directory, it can
be removed.

I've seen this symptom most frequently following a crash or a power outage
during an rm -Rf of a /usr/obj, which I then immediately restart on
reboot, and rm -Rf gets there before bgfsck does.

Here's the commit message for ufs_vnops.c:1.269, which should be MFC'd:

    revision 1.269
    date: 2005/05/18 22:18:21;  author: mckusick;  state: Exp;  lines: +2 -3
    Allow removal of empty directories with high link counts. These
    can occur on a filesystem running with soft updates after a crash
    and before a background fsck has been run.

    To prevent discrepancies from arising in a background fsck that may
    already be running, the directory is removed but its inode is not
    freed and is left with the residual reference count. When encountered
    by the background fsck it will be reclaimed.

I'll e-mail Kirk and ask if he's comfortable enough with the change to
this point to merge it.

Robert N M Watson
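
The write ordering in the example above can be replayed in a small toy model to see why both XXX points leave directory A empty but with an elevated link count. This is only a sketch under stated assumptions: `Disk`, `crash_after`, `old_rmdir_allowed`, `new_rmdir_allowed`, and `bgfsck` are hypothetical names invented for illustration, not UFS structures or kernel functions, and the two "allowed" checks merely paraphrase the behavior described in the message and the commit log. Writes that touch only B's own data and inode are left out because they do not change A.

```python
from dataclasses import dataclass, field


@dataclass
class Disk:
    """Toy on-disk state of directory A only (not a real UFS structure)."""
    nlink: int = 2                      # "." plus A's name in its parent
    names: set = field(default_factory=set)   # subdirectory names stored in A


def _bump(delta):
    """Return an action that adjusts A's on-disk link count by delta."""
    def action(disk):
        disk.nlink += delta
    return action


# Ordered writes that affect A while B is created and then removed.
WRITES = [
    ("A inode: link count++", _bump(+1)),
    ("A data: add name B", lambda d: d.names.add("B")),
    ("A data: remove name B", lambda d: d.names.discard("B")),
    ("A inode: link count--", _bump(-1)),
]


def crash_after(n):
    """Apply only the first n writes, as if the system halted right after."""
    disk = Disk()
    for _, action in WRITES[:n]:
        action(disk)
    return disk


def old_rmdir_allowed(disk):
    """Assumed pre-1.269 behavior: an elevated link count blocks removal."""
    return disk.nlink == 2 and not disk.names


def new_rmdir_allowed(disk):
    """Assumed post-1.269 behavior: only actual emptiness matters."""
    return not disk.names


def bgfsck(disk):
    """Toy background fsck: reset the count to what the names justify."""
    disk.nlink = 2 + len(disk.names)


if __name__ == "__main__":
    # n=1 is the first XXX (count raised, name not yet written);
    # n=3 is the second XXX (name removed, count not yet lowered).
    for n in (1, 3):
        d = crash_after(n)
        print(n, d.nlink, sorted(d.names),
              old_rmdir_allowed(d), new_rmdir_allowed(d))
```

In this model, calling `bgfsck` on either crashed state resets the count so that even the old check permits removal, which matches the observation that the directory can be removed once bgfsck has caught up with it.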