From: Scott Long <scottl@freebsd.org>
To: "Loren M. Lang"
Cc: freebsd-fs@freebsd.org
Date: Wed, 09 Feb 2005 20:23:19 -0700
Subject: Re: Journalling FS and Soft Updates comparision
Message-ID: <420AD3A7.3000102@freebsd.org>
In-Reply-To: <20050210030119.GD29396@alzatex.com>

Loren M. Lang wrote:
> Traditionally, filesystems have been designed with the idea that the
> data will always be written to disk safely, and not much effort was put
> into making them
>
> Journalling filesystems and soft updates are two different techniques
> designed to solve the problem of keeping data consistent even in the
> case of a major interruption like a power blackout.  Both work solely
> on the metadata, not the real data.

This isn't always true.  There are journalling implementations that
journal the data as well as the metadata.

> This means increasing a file's size is protected, but not necessarily
> the data that's being written.  (Does this also mean that the data will
> be written to free space before the file size is increased, so
> extraneous data won't be left in the file?)  Journalling works by
> recording every metadata change it is about to execute in a special
> place on the hard drive called the journal before it does it; then it
> updates all the metadata and finally marks the journal entry completed.
> Soft updates are simply a way to order metadata writes so that they
> happen in a safe order.  An example: moving file a from directory x to
> directory y would first delete file a from dir x, then add it to dir y.
> If a crash happens in the middle, then the data becomes lost, right?

Part of the reordering of metadata in softupdates involves generating
dependency graphs that prevent data loss like this.

> Now this shouldn't be a big deal since it's harmless to anything else;
> just some free space is eaten up.  Since all metadata updates have this
> same kind of harmless behavior, that's why fsck can now be done in the
> background instead of the foreground.

The theory of softupdates is that whatever metadata made it to disk
before shutdown/crash is consistent enough to be trusted after just a
quick preen.  The rest of the background checking is just to clean up
blocks that became unallocated but weren't committed.
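As an aside, the move-a-file example above can be sketched in a few lines of Python.  This is a toy model invented for this post, nothing like the real UFS code: the "safe" ordering writes the new directory entry before removing the old one, so a mid-operation crash leaves an extra link (harmless; fsck can clean it up) rather than a lost file.

```python
# Toy illustration of why update ordering matters when moving a file
# between directories.  Directories are plain dicts mapping name -> inode
# number; the "crash" flag simulates losing power between the two steps.

def move_unsafe(dirs, src, dst, name, crash_after_first_step=False):
    """Delete-then-add: a crash in the middle loses the file entirely."""
    inode = dirs[src].pop(name)          # step 1: remove the old entry
    if crash_after_first_step:
        return                           # simulated crash: file is orphaned
    dirs[dst][name] = inode              # step 2: add the new entry

def move_safe(dirs, src, dst, name, crash_after_first_step=False):
    """Add-then-delete: a crash leaves an extra link, never a lost file."""
    dirs[dst][name] = dirs[src][name]    # step 1: add the new entry first
    if crash_after_first_step:
        return                           # simulated crash: file still reachable
    del dirs[src][name]                  # step 2: remove the old entry

dirs = {"x": {"a": 1001}, "y": {}}
move_unsafe(dirs, "x", "y", "a", crash_after_first_step=True)
print("a" in dirs["x"] or "a" in dirs["y"])   # False: the file is lost

dirs = {"x": {"a": 1001}, "y": {}}
move_safe(dirs, "x", "y", "a", crash_after_first_step=True)
print("a" in dirs["x"] or "a" in dirs["y"])   # True: the file survives
```

The dependency graphs in softupdates exist to enforce exactly this kind of ordering across many pending metadata writes at once.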
> Now comparing the two: performance-wise, journalling has an advantage,
> since every group of metadata updates written to the journal at the
> same time can be reordered to optimize disk performance.  The disk
> head just has to move across the disk in order instead of seeking back
> and forth.  In practice this gain is usually lost because the journal
> constantly needs to be updated, and it probably lies in one small area
> of the disk.  The other benefit of the journal is very quick fsck
> times, since all it has to do is see what the journal was updating and
> make sure it all completed.  Soft updates still require a full fsck,
> but since it can be done in the background, unlike journalling, that
> means even faster startup time, at the cost of more CPU and I/O time
> spent on it.  Now, if the journal of a journalling fs could be kept
> somewhere else, say in some kind of NVRAM, then journalling might be
> more efficient overall, in disk I/O and CPU time, than soft updates.

Performance between softupdates and journalling is still hotly debated,
and your statements border on the 'flamebait' side of the argument.

> I'm mainly just trying to get an understanding of these two
> techniques, not necessarily saying one is better.  In the real world
> it's probably very dependent on many other things: lots of random
> access vs. sequential, many files and file ops per second, vs. mostly
> read-only with noatime set, etc.

Softupdates really aren't a whole lot different from journalling.  Both
turn metadata operations into a sequence of ordered atomic updates.  The
only difference is that journalling writes these updates to the on-disk
journal right away and then commits them later on, while softupdates
keeps (most of) them in RAM and then commits them later on.  You are
correct that journalling has a key advantage in that a fsck, either
foreground or background, is not strictly required after an unexpected
shutdown.
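To make the journal-replay point concrete, here is a toy write-ahead journal in Python (all names and structures invented for illustration, not drawn from any real filesystem): every intended metadata change is appended to the journal before the "real" metadata is touched, so post-crash recovery only has to replay the logged-but-uncommitted entries instead of scanning the whole filesystem.

```python
# Toy sketch of metadata journalling and the quick recovery it enables.
# self.meta stands in for on-disk metadata; self.journal for the journal
# region.  A "crash" leaves a journal record written but not applied.

class JournalledMeta:
    def __init__(self):
        self.meta = {}       # the "on-disk" metadata
        self.journal = []    # list of {key, value, done} records

    def apply(self, key, value, crash_before_commit=False):
        # Write the intent to the journal BEFORE touching the metadata.
        self.journal.append({"key": key, "value": value, "done": False})
        if crash_before_commit:
            return                        # crash: journal written, meta not
        self.meta[key] = value            # update the real metadata
        self.journal[-1]["done"] = True   # mark the journal entry complete

    def recover(self):
        """Replay logged-but-uncommitted entries; fast and bounded."""
        replayed = 0
        for rec in self.journal:
            if not rec["done"]:
                self.meta[rec["key"]] = rec["value"]
                rec["done"] = True
                replayed += 1
        return replayed

fs = JournalledMeta()
fs.apply("size:/var/log/messages", 4096)
fs.apply("size:/var/log/messages", 8192, crash_before_commit=True)
print(fs.meta["size:/var/log/messages"])   # 4096: crash hit before commit
print(fs.recover())                        # 1: one entry replayed
print(fs.meta["size:/var/log/messages"])   # 8192: consistent after replay
```

Note that recovery cost here scales with the number of in-flight journal entries, not with filesystem size, which is exactly why journalled filesystems avoid a full fsck after a crash.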
For further information, I'd suggest reading:

http://www.mckusick.com/softdep/index.html

Scott