From: Terry Lambert
Date: Mon, 01 Sep 2003 19:51:57 -0700
To: Pawel Worach
cc: freebsd-current@freebsd.org
Subject: Re: swapon vs savecore dilemma
Message-ID: <3F5405CD.C5534CBF@mindspring.com>
References: <20030901165035.D58395@carver.gumbysoft.com> <3F53E2CA.9020101@freebsd.org> <3F53FD88.1050005@telia.com>

Pawel Worach wrote:
> Is fsck really that memory heavy so that it needs swap?

Yes, if you have a huge FS.
The problem is that checking the CG bitmaps during an fsck requires that you have all the bitmaps in core, and then linearly traverse the entire directory structure to identify which bits need to be cleared (to indicate that the block was deallocated prior to the crash, without the bitmap being successfully written out).

Because a block allocation for any file can be written (effectively) anywhere on the disk, and there is no guarantee of cylinder group locality, this basically means that you have to hold all the bitmaps in memory simultaneously... or you have to make multiple passes over the directory structure, with the number of passes being equal to the total number of CGs divided by the number of bitmaps you can keep in core simultaneously (rounded up).

The multi-pass approach is hideously expensive, and it's never been implemented: you are assumed to have enough memory (or memory and swap) available to hold all the bitmaps in memory simultaneously. If you have a multiterabyte FS, making 9 passes over it, instead of one pass with swapping, would be extremely dissatisfying: presumably you have all that data for a reason, and need it back online as fast as possible.

My suggestion (which has been my suggestion all along) is to add two date-stamped CG bitmap bitmaps (bitmaps recording which CG bitmaps are dirty) somewhere. My favorite place for this is to steal space at the front of inode 1, which is used only rarely, since people don't use the whiteout feature, and which can be made compatible with whiteouts in any case. Then you don't let more than some small number of CG bitmaps be dirty simultaneously without flushing some of them out (you would need an additional soft dependency to implement this).

If you did this, then you could guarantee a smaller set of data to be simultaneously dirty, even for an arbitrarily large FS. You just load only those bitmaps that are marked dirty in the bitmap logs, and do a single pass through the full directory structure.

> Wouldn't fsck -> mount -> savecore -> swapon be a more appropriate order?
If you had small enough disks, large enough RAM, or could limit the number of CG bitmaps you had to simultaneously examine, then yes. Otherwise, no.

-- Terry
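
For anyone who wants the arithmetic spelled out, here is a rough sketch of the pass count described above and of how a dirty-bitmap log would shrink it. This is not anything fsck implements; the function names (passes_needed, dirty_cgs), the use of a plain integer as the dirty log, and the example numbers are all made up for illustration.

```python
from typing import List


def passes_needed(total_cgs: int, bitmaps_in_core: int) -> int:
    """Passes over the directory structure when only `bitmaps_in_core`
    CG bitmaps fit in memory at once: total CGs / bitmaps in core,
    rounded up."""
    if total_cgs < 0 or bitmaps_in_core <= 0:
        raise ValueError("need total_cgs >= 0 and bitmaps_in_core > 0")
    # Integer ceiling division, avoids floating point on large counts.
    return (total_cgs + bitmaps_in_core - 1) // bitmaps_in_core


def dirty_cgs(dirty_log: int, total_cgs: int) -> List[int]:
    """Indices of the CGs whose bit is set in a hypothetical dirty log,
    modelled here as one integer with one bit per cylinder group."""
    return [cg for cg in range(total_cgs) if (dirty_log >> cg) & 1]


# Hypothetical numbers: 9000 CGs on disk, room for 1000 bitmaps in core.
without_log = passes_needed(9000, 1000)

# With a log saying only CGs 0, 2 and 5 were dirty at crash time,
# only those three bitmaps have to be loaded.
to_load = dirty_cgs(0b100101, 9000)
with_log = passes_needed(len(to_load), 1000)
```

With these made-up numbers, the no-log case needs 9 passes, while the logged case loads three bitmaps and needs a single pass, which is the whole point of bounding how many CG bitmaps may be dirty at once.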