Date: Fri, 2 Oct 1998 14:22:01 -0700 (PDT)
From: Julian Elischer
To: "Justin T. Gibbs"
cc: Don Lewis, current@FreeBSD.ORG
Subject: Re: Softupdates, filesystem safety and SCSI disk write caching
In-Reply-To: <199810022026.OAA17722@pluto.plutotech.com>

To answer your statement with a tutorial (not because you need it, but
because there has been a need for a quick tutorial on soft updates
somewhere in the mail archives):

No, this is not what softupdates expects. It expects that data blocks
are written before metadata blocks (when writing files).

An example of some of the dependencies for writing a file B:

  directory blocks depend on
  inode blocks, which depend on
  indirect blocks, which depend on
  data blocks

This is a gross simplification (ignoring bitmaps, superblocks, etc.).
Note that directory blocks are themselves data blocks, so the
dependencies in a hierarchy of newly written files go through this cycle
several times. The exception is that you could write all the directory
blocks ahead of time if nothing points to them yet, but you still need
to link them up in the correct order when assembling the hierarchy.
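The ordering rule above can be sketched as a small dependency graph.
This is only an illustration, not how the kernel code is structured:
the block names are made up, and treating each dependency as a
whole-block edge is an assumption made to keep the example short.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical block names. Each key may reach the disk only after
# every block in its value set has been written.
depends_on = {
    "dir_block": {"inode_block"},
    "inode_block": {"indirect_block"},
    "indirect_block": {"data_block_0", "data_block_1"},
    "data_block_0": set(),
    "data_block_1": set(),
}


def safe_write_order(deps):
    """Return an order in which no block precedes a block it points to."""
    return list(TopologicalSorter(deps).static_order())


def violates_ordering(order, deps):
    """True if some block is written before one of its dependencies."""
    position = {block: i for i, block in enumerate(order)}
    return any(
        position[block] < position[dep]
        for block, block_deps in deps.items()
        for dep in block_deps
    )


order = safe_write_order(depends_on)
```

Here `order` puts both data blocks first and `dir_block` last, and
`violates_ordering(order, depends_on)` is false. Reversing the list
(directory first) is exactly the case that would leave a pointer on disk
to something not yet written.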

You must never write a pointer in the filesystem that points to
something that has not yet been written. To achieve this, softupdates
sometimes writes synthetic data (made up on the spot, to keep the FS
consistent).

For example, it may write a directory block with only some of the
created entries in it, because the others point to inodes that have not
yet been written to disk. In this case it will rewrite the block once
the other entries are deemed 'safe' (their dependencies have been
satisfied).

It may also write a block of inodes with some of the inodes invalidated,
even though the user is using those files, because the data blocks that
those files refer to have not yet been written to disk. When (some of)
the data blocks have been written, and it is time to look at that block
of inodes again, the inodes that were invalidated before may be written
out valid, with the data pointers updated to reflect those blocks (and
ONLY those blocks) that have been written to disk.

If you extend a file, and the last few blocks of your extension have not
been written yet, you may see the inode written back with a smaller size
than you would see if you did a 'stat' on the file.

The way it works in practice, however, is that if you do not do an
fsync(), the rewrites will not be needed, because the data blocks are
scheduled to be written in N seconds (usually 5), the inode blocks in
N+M (usually 10), and the directory entries in N+M+X (usually 15).

For deletes, the dependencies are approximately (but not exactly)
reversed.

If you do a delete before a write has been fully committed to media, the
remaining dependencies are cancelled and never happen. Writes that have
already happened are reversed in memory, and the appropriate requests
and dependencies are queued for on-media reversal.

On Fri, 2 Oct 1998, Justin T. Gibbs wrote:
> >Of course..
> >All I'm saying is that it is a prerequisite that we should
> >document somewhere, that "Softupdates assumes that completion signals the
> >arrival of the data into STABLE storage, e.g. magnetic recording."
>
> If we tagged meta-data buffers appropriately, we could specifically inhibit
> write-caching those transactions.  I would expect this to give you the
> semantics soft-updates expects while still allowing the disk to write cache
> data blocks.  So, if you have a power failure, the file-system meta-data
> would be consistent, but some files might have stale data blocks.

This is not acceptable to softupdates (as defined by McKusick, Ganger
and Patt).

>
> --
> Justin