Date: Sat, 24 Sep 2005 00:41:35 -0700 (PDT)
From: Don Lewis
To: Tor Egge
Cc: scottl@FreeBSD.org, current@FreeBSD.org, mckusick@FreeBSD.org
Subject: Re: soft updates / background fsck directory link count bug
Message-Id: <200509240741.j8O7fZca090425@gw.catspoiler.org>
In-Reply-To: <20050924.043419.74681996.Tor.Egge@cvsup.no.freebsd.org>

On 24 Sep, Tor Egge wrote:
>> It appears that there is some sort of interaction between soft
>> updates and background fsck that results in the link count of the
>> parent of one of these directories being double decremented,
>> resulting in the file system being put into an invalid state.
>
> If the snapshot for background fsck is taken on a file system which
> has pending softupdate dependencies then this can happen. For this
> particular case, the system had a pending dirrem dependency.

In this particular case, the dependency seems to remain pending forever.
I executed sync(8), waited for 60+ seconds, and manually typed in a
couple of commands before running fsck. Only the file system
modifications done by fsck -B or unmounting the file system seem to
flush the parent directory link count update to disk.

>> The following transcript demonstrates what happens if a background
>> fsck is run after the leaf directory is removed. What is interesting
>> is that after the leaf directory has been removed, the
>> effective link count of the parent directory (displayed by ls) has
>> been decremented from 3 to 2, whereas the on-disk link count shown by
>> fsdb is still 3. The background fsck appears to detect the link
>> count as 3, and executes the sysctl call to decrement the link count,
>> causing both the effective and actual link counts to be decremented
>> to 1.
>
>> My suspicion is that the physical update of the parent directory's
>> link count after the rmdir of the leaf directory has been deferred
>> until the leaf directory's inode is zeroed, which turns out to be an
>> indefinite wait because the inode doesn't get zeroed until fsck is
>> run.
>
> ufs_rmdir() calls ufs_dirremove() after having lowered i_effnlink in
> memory for both leaf and parent directory.
>
> ufs_dirremove() calls softdep_setup_remove() which sets up the
> softupdates dependencies for reducing di_nlink on disk for leaf and
> parent directory when it's safe to do so (i.e. after the directory
> entry referencing the leaf directory has been cleared on disk). See
> code in reassignbuf() for various delays before the syncer process
> pushes the dirty buffers to disk.

Ok, now I see the code at the end of handle_workitem_remove() that
reuses the struct dirrem to decrement the link count of the parent
directory. For some reason the second handle_workitem_remove() call is
getting deferred indefinitely.

> The background fsck found the di_nlink value being 3 on the parent
> directory and issued an FFS_ADJ_REFCNT sysctl to reduce it by one,
> having no knowledge about the pending dirrem dependency. See
> sysctl_ffs_fsck() for the handling of that sysctl.
>
> After background fsck has run and the dirrem dependency has been
> processed, the link counts for the parent directory are both 1.

Yup. Even without the indefinite deferral problem, it seems to me that
updating either file or directory link counts in background fsck is
hazardous unless the directory slot updates and link count updates can
be guaranteed to be consistent in the snapshot.

> The latest panic shown on
> , "panic:
> handle_written_inodeblock: live inodedep" was probably caused by this
> issue. If the snapshot was taken while a directory or file was being
> removed then it might contain an unreferenced inode with a nonzero
> link count. The background fsck would reduce the link count for the
> inode, triggering freeing of the inode (cf. ufs_inactive(),
> UFS_VFREE(), ffs_vfree() and softdep_freefile()). After writing the
> zeroed inode to disk the system would panic due to the still pending
> dirrem dependency.

My investigation of that particular problem led me to try this
experiment. I actually haven't been able to reproduce the
handle_written_inodeblock panic, but I've been able to reproduce the
deadlock problem a number of times.
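
To make the double decrement concrete, here is a toy userland model of
the sequence discussed above. It is only a sketch and not kernel code:
the class ToyDir, its pending_dirrem field, and run_scenario() are
hypothetical names made up for illustration, and the model assumes the
fsck adjustment is applied to both the on-disk and effective counts, as
the observed results suggest. Only di_nlink and i_effnlink are names
taken from the thread.

```python
from dataclasses import dataclass


@dataclass
class ToyDir:
    """Toy stand-in for the parent directory's inode (not kernel code)."""
    di_nlink: int            # link count as written on disk
    i_effnlink: int          # in-memory count with pending removals applied
    pending_dirrem: int = 0  # queued on-disk decrements (hypothetical field)

    def rmdir_child(self):
        # The effective count drops at once; the on-disk decrement is deferred.
        self.i_effnlink -= 1
        self.pending_dirrem += 1

    def fsck_adjust(self, delta):
        # Assumption of this model: the adjustment hits both counts and
        # knows nothing about pending_dirrem.
        self.di_nlink += delta
        self.i_effnlink += delta

    def process_dirrem(self):
        # The deferred decrement finally reaches the on-disk count.
        self.di_nlink -= self.pending_dirrem
        self.pending_dirrem = 0


def run_scenario():
    """Replay rmdir, background fsck adjustment, then the deferred dirrem."""
    parent = ToyDir(di_nlink=3, i_effnlink=3)
    trace = []

    parent.rmdir_child()
    trace.append(("after rmdir", parent.di_nlink, parent.i_effnlink))

    # Background fsck reads di_nlink from the snapshot and compares it with
    # the references it can still find there.
    snapshot_nlink = parent.di_nlink
    references_found = 2  # assumed: the leaf's ".." reference is already gone
    parent.fsck_adjust(references_found - snapshot_nlink)
    trace.append(("after fsck adjust", parent.di_nlink, parent.i_effnlink))

    parent.process_dirrem()
    trace.append(("after dirrem", parent.di_nlink, parent.i_effnlink))
    return trace


if __name__ == "__main__":
    for step, on_disk, effective in run_scenario():
        print(f"{step}: di_nlink={on_disk} i_effnlink={effective}")
```

In this model the parent ends up with both counts at 1 even though it
should still have 2 links, because the same removal is accounted for
once by the fsck adjustment and once by the pending dependency.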