From owner-freebsd-fs Thu Sep 21 12:58:59 2000
Date: Thu, 21 Sep 2000 12:58:19 -0700
From: Alfred Perlstein
To: Christopher Stein
Cc: Craig A Soules, freebsd-fs
Subject: Re: Journaling Filesystems in bsd? (LFS, anyone?)
Message-ID: <20000921125819.X9141@fw.wintelcom.net>
In-Reply-To: ; from stein@eecs.harvard.edu on Thu, Sep 21, 2000 at 11:49:23AM -0400

* Christopher Stein [000921 12:35] wrote:
>
> On Thu, 21 Sep 2000, Craig A Soules wrote:
>
> > Excerpts from internet.computing.freebsd.fs: 21-Sep-100 Re: Journaling
> > > Log-structured file systems offer different semantics than
> > > synchronous journaling file systems. Synchronous journaling can
> > > offer the traditional durability of create. Nothing is durable
> >
> > Wouldn't it be possible to offer the same semantics as FFS in an LFS
> > implementation if the segment was (over)written after each operation?
> > Partial segment writes?
>
> A partial segment write solution as was done in de Jonge & Kaashoek's
> logical disk. This would solve the internal fragmentation problem and
> make the cleaner's life easier, while allowing the system to
> provide traditional UFS create semantics.
>
> However, forgetting about the cleaner for a moment, I think performance
> would be just about the same as the application doing an explicit
> fsync() to force the full segment. As you said, write times are not
> dominated by bandwidth so 64KB and 8KB disk writes are probably pretty
> close.
> If we write just a portion of the segment the cost will be similar
> to writing the full segment. So fsyncing full (with lots of internal free
> space) segments and partial segment writes will be basically the same --
> with the important difference being on-disk internal fragmentation.
>
> Now bringing the cleaner back into the picture (as it always should
> be).. the higher level of on-disk fragmentation would drop into run-time
> performance. The cleaner would be more busy copying and packing segments -
> generally consuming resources and getting in the way.
>
> So I agree that partial segment writes make sense. For the reason that
> it can offer durability without internal fragmentation - making the
> cleaner's life easier.

One trick that can be done is to detect high fsync traffic and
rewrite the blocks several times.

Most simplistic case:

  application creates a file and writes to the first block and then fsync()
    the log is then sync'd to backing store
  application appends another block and fsyncs again
    the log is then sync'd to backing store

Right there is a major fragmentation problem.

Now consider what you can do for this case:

  application creates a file and writes to the first block and then fsync()
    the log is then sync'd to backing store
  application appends another block and fsyncs again
    the log is then sync'd to backing store along with the first block.

Although the log grows much faster, the contents of the previous
segment most likely can be discarded rather than needing compaction,
and you avoid fragmentation at the expense of additional (but
non-seek requiring) data transfer.

Another option is to just rewrite the entire previously written
partial segment and inform the cleaner that the previous one is junk.

This is sort of like adapting ffs_doreallocblks (sp?)
to LFS and would most likely be a gain, especially if it only happens
when the previous data is still cached (ffs_doreallocblks can make an
IO happen if I recall what Kirk explained to me correctly).

-- 
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
"I have the heart of a child; I keep it in a jar on my desk."
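
The two fsync cases described in this message can be compared with a toy model. This is only a sketch under simplifying assumptions, not LFS code: every fsync() is assumed to produce exactly one partial segment, the previously written blocks are assumed to still be cached, and the names `ToyLog`, `append_and_fsync`, `segments_holding`, `dead_segments` and `simulate` are hypothetical.

```python
class PartialSegment:
    """One partial segment in the toy log."""

    def __init__(self, blocks):
        self.blocks = list(blocks)   # everything written into this segment
        self.live = set(blocks)      # blocks not yet superseded elsewhere


class ToyLog:
    """Toy append-only log; each fsync() emits one partial segment (assumption)."""

    def __init__(self, rewrite_on_fsync):
        self.rewrite_on_fsync = rewrite_on_fsync
        self.segments = []
        self.last_segment_of = {}    # file name -> index of its previous segment
        self.blocks_written = 0

    def append_and_fsync(self, name, block_no):
        blocks = [(name, block_no)]
        prev = self.last_segment_of.get(name)
        if self.rewrite_on_fsync and prev is not None:
            # Second case from the message: write the earlier blocks again
            # next to the new one, so the old partial segment holds only junk.
            old = self.segments[prev]
            carried = sorted(b for b in old.live if b[0] == name)
            old.live -= set(carried)
            blocks = carried + blocks
        self.segments.append(PartialSegment(blocks))
        self.last_segment_of[name] = len(self.segments) - 1
        self.blocks_written += len(blocks)

    def segments_holding(self, name):
        """Number of segments with live blocks of `name` (fragmentation)."""
        return sum(1 for s in self.segments
                   if any(b[0] == name for b in s.live))

    def dead_segments(self):
        """Segments with no live data: reclaimable without any copying."""
        return sum(1 for s in self.segments if not s.live)


def simulate(rewrite_on_fsync, appends):
    """Append `appends` blocks to one file, with an fsync after each."""
    log = ToyLog(rewrite_on_fsync)
    for block_no in range(appends):
        log.append_and_fsync("f", block_no)
    return log


if __name__ == "__main__":
    for policy in (False, True):
        log = simulate(policy, 4)
        print(policy, log.blocks_written,
              log.segments_holding("f"), log.dead_segments())
```

In this model the rewrite policy keeps the file's live blocks in a single segment and leaves the older partial segments entirely dead, at the cost of writing more blocks to the log. That is the trade-off described above; it says nothing about real disk timing.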