From: Peter Wemm
To: freebsd-current@freebsd.org
cc: Don Lewis, current@freebsd.org, kris@obsecurity.org
Date: Fri, 15 Apr 2005 10:05:25 -0700
Subject: Re: Softupdates not preventing lengthy fsck

On Tuesday 12 April 2005 05:01 am, Don Lewis wrote:
> On 11 Apr, Kris Kennaway wrote:
> > On Mon, Apr 11, 2005 at 06:43:17PM -0700, Don Lewis wrote:
> >> On 11 Apr, Kris Kennaway wrote:
> >> > I'm seeing the following problem: on 6.0 machines which have had a lot
> >> > of FS activity in the past but are currently quiet, an unclean reboot
> >> > will require an hour or more of fscking and will end up clearing
> >> > thousands of inodes:
> >> >
> >> > [...]
> >> > /dev/da0s1e: UNREF FILE I=269731 OWNER=root MODE=100644
> >> > /dev/da0s1e: SIZE=8555 MTIME=Apr 18 02:29 2002 (CLEARED)
> >> >
> >> > /dev/da0s1e: UNREF FILE I=269741 OWNER=root MODE=100644
> >> > [...]
> >> >
> >> > It's as if dirty buffers aren't being written out properly, or
> >> > something. Has anyone else seen this?
> >>
> >> This looks a lot like it could be a vnode refcnt leak. Files won't get
> >> removed from the disk while they are still in use (the old unlink while
> >> open trick). Could nullfs be a factor?
> >
> > Yes, I make extensive use of read-only nullfs.
> >
> > Kris (fsck still running)
>
> It would also be interesting to find out why fsck is taking so long to
> run. I don't see anything obvious in the code.

One HUGE time factor in a fsck run is serial consoles. Printing tens or
hundreds of thousands of inode corrections at 9600 baud takes forever.
At work, we found that some fsck runs that would take 20+ hours could
be reduced to 15-20 minutes by simply redirecting fsck output
to /dev/null instead of the serial console.

We also experimented with a memory-based logging process that
buffered up its stdin and waited until the fs was writable.
For example:

    fsck -p 2>&1 | memlogger /var/log/fsck.log

Memlogger would malloc memory to hold fsck's output and periodically
poll for /var/log to become writable.

(There was more to it than that, and I'm not sure that we figured out
all the quirks to make it usable in the /etc/rc environment.)

-Peter
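
For readers who want to try the memlogger idea described above, here is a minimal sketch. It is not the original tool, whose source was never posted and which evidently used malloc (so was presumably not written in Python). The function names, the 5-second poll interval, and the `max_polls` limit are all assumptions made for illustration. It also simplifies by reading all of stdin before its first write attempt, and it does not deal with the /etc/rc quirks mentioned above.

```python
import sys
import time


def buffer_input(stream):
    """Read the whole binary stream into memory and return it as bytes."""
    chunks = []
    while True:
        chunk = stream.read(65536)
        if not chunk:
            break
        chunks.append(chunk)
    return b"".join(chunks)


def try_flush(data, path):
    """Try to append data to path. Return True on success, False on OSError."""
    try:
        with open(path, "ab") as log:
            log.write(data)
        return True
    except OSError:
        # Typically a read-only or not-yet-mounted filesystem; retry later.
        # Note: a write that fails partway could leave a partial copy behind.
        return False


def memlogger(stream, path, poll_seconds=5.0, max_polls=None):
    """Buffer stream in memory, then poll until path can be written.

    max_polls (hypothetical knob) bounds the number of failed attempts;
    None means keep polling indefinitely.
    """
    data = buffer_input(stream)
    polls = 0
    while not try_flush(data, path):
        polls += 1
        if max_polls is not None and polls >= max_polls:
            return False
        time.sleep(poll_seconds)
    return True


def main(argv):
    """Command-line entry point: memlogger LOGFILE, reading from stdin."""
    if len(argv) != 2:
        print("usage: memlogger LOGFILE", file=sys.stderr)
        return 2
    return 0 if memlogger(sys.stdin.buffer, argv[1]) else 1
```

To use it in a pipeline like the one above, a script would end with `sys.exit(main(sys.argv))` and be invoked as `fsck -p 2>&1 | memlogger /var/log/fsck.log`. Holding everything in memory is the point of the design: nothing touches the slow serial console, and nothing is written to disk until the filesystem accepts writes.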