From: Scott Long
Date: Tue, 12 Apr 2005 08:58:29 -0600
To: Kris Kennaway
Cc: Don Lewis, current@freebsd.org
Subject: Re: Softupdates not preventing lengthy fsck
Message-ID: <425BE215.4090406@samsco.org>
In-Reply-To: <20050412144116.GA39174@xor.obsecurity.org>
References: <20050412035111.GA31366@xor.obsecurity.org> <200504121201.j3CC1nZ1035643@gw.catspoiler.org> <20050412144116.GA39174@xor.obsecurity.org>

Kris Kennaway wrote:
> On Tue, Apr 12, 2005 at 05:01:49AM -0700, Don Lewis wrote:
>
>>On 11 Apr, Kris Kennaway wrote:
>>
>>>On Mon, Apr 11, 2005 at 06:43:17PM -0700, Don Lewis wrote:
>>>
>>>>On 11 Apr, Kris Kennaway wrote:
>>>>
>>>>>I'm seeing the following problem: on 6.0 machines which have had a lot
>>>>>of FS activity in the past but are currently quiet, an unclean reboot
>>>>>will require an hour or more of fscking and will end up clearing
>>>>>thousands of inodes:
>>>>>
>>>>>[...]
>>>>>/dev/da0s1e: UNREF FILE I=269731 OWNER=root MODE=100644
>>>>>/dev/da0s1e: SIZE=8555 MTIME=Apr 18 02:29 2002 (CLEARED)
>>>>>
>>>>>/dev/da0s1e: UNREF FILE I=269741 OWNER=root MODE=100644
>>>>>[...]
>>>>>
>>>>>It's as if dirty buffers aren't being written out properly, or
>>>>>something. Has anyone else seen this?
>>>>
>>>>This looks a lot like it could be a vnode refcnt leak. Files won't get
>>>>removed from the disk while they are still in use (the old unlink while
>>>>open trick). Could nullfs be a factor?
>>>
>>>Yes, I make extensive use of read-only nullfs.
>>>
>>>Kris (fsck still running)
>>
>>It would also be interesting to find out why fsck is taking so long to
>>run. I don't see anything obvious in the code.
>
>
> I can take a transcript of the entire fsck next time if you like :-)
> (it ran for more than 5 hours on the 24G drive and was still going
> after I went to bed)
>
> Kris

Don might not know that your workload involves creating and deleting
full ports/ trees repeatedly, and those trees contain hundreds of
thousands of inodes each. If there is a reference count leak and those
deletions aren't ever being finalized, then there would be a whole lot
of work for fsck to do =-) Might also explain why disks have been
unexpectedly filling up on package machines (like mine).

Scott
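
For anyone unfamiliar with the "unlink while open" behaviour Don mentions
above, here is a minimal userland sketch of it. It only illustrates the
general POSIX semantics (an inode stays allocated until the last reference
to it goes away); it does not reproduce the suspected nullfs/vnode leak,
and the function name is made up for this example.

```python
import os
import tempfile


def unlink_while_open_demo():
    """Unlink a file that is still open and show its data stays reachable.

    Hypothetical helper for illustration only; not taken from the thread.
    Returns (link_count_after_unlink, data_read_back, path_still_exists).
    """
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"still here")
        os.unlink(path)                    # remove the only directory entry
        nlink = os.fstat(fd).st_nlink      # typically 0: no name refers to the inode
        exists = os.path.exists(path)      # the name is gone from the directory
        os.lseek(fd, 0, os.SEEK_SET)
        data = os.read(fd, 64)             # contents still readable via the open fd
    finally:
        os.close(fd)                       # last reference dropped; space can be freed
    return nlink, data, exists


if __name__ == "__main__":
    nlink, data, exists = unlink_while_open_demo()
    print("links after unlink:", nlink)
    print("data still readable:", data)
    print("path still exists:", exists)
```

The open descriptor plays the role of the outstanding reference here. If a
reference is never released (as with the suspected refcnt leak), the inode
is never freed, the disk space stays in use, and after an unclean reboot
fsck is left to find and clear each such inode as an UNREF FILE.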