Date:      Fri, 2 Oct 1998 14:22:01 -0700 (PDT)
From:      Julian Elischer <julian@whistle.com>
To:        "Justin T. Gibbs" <gibbs@plutotech.com>
Cc:        Don Lewis <Don.Lewis@tsc.tdk.com>, current@FreeBSD.ORG
Subject:   Re: Softupdates, filesystem safety and SCSI disk write caching 
Message-ID:  <Pine.BSF.3.95.981002135812.15828E-100000@current1.whistle.com>
In-Reply-To: <199810022026.OAA17722@pluto.plutotech.com>

I'll answer your statement with a tutorial (not because you need it, but
because the mail archives have needed a quick tutorial on soft updates
somewhere).

No, this is not what softupdates expects.

When writing files, it expects data blocks to be written before the
metadata blocks that point to them.

An example of some of the dependencies for writing a file B are:

directory blocks depend on
inode blocks, which depend on
indirect blocks, which depend on
data blocks

This is a gross simplification (ignoring bitmaps, superblocks, etc.).
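
To make the ordering concrete, here is a rough sketch in Python (not the
real softupdates code; the names DEPENDS_ON and write_order are made up for
this illustration). It treats the simplified chain above as a dependency
graph and asks for an order in which nothing is written before the things
it points to.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical, simplified dependency map: each kind of block maps to the
# kinds of block that must reach the disk before it does.
DEPENDS_ON = {
    "directory block": {"inode block"},
    "inode block": {"indirect block"},
    "indirect block": {"data block"},
}


def write_order(depends_on):
    """Return one order in which the blocks can safely be written."""
    # static_order() yields every node only after all of its prerequisites.
    return list(TopologicalSorter(depends_on).static_order())


print(write_order(DEPENDS_ON))
```

For this chain the data block comes out first and the directory block
last. The real code tracks dependencies per buffer and per entry rather
than per block type, so take this only as a picture of the ordering rule.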

Note that directory blocks are data blocks, so dependencies in hierarchies
of newly written files go through this cycle several times. The exception
is that you could write all the directory blocks ahead of time if they are
not yet pointed to by anything, but you need to link them up in the
correct order when assembling the hierarchy.

You must never write a pointer in the filesystem that points to something
that has not yet been written. To achieve this, softupdates sometimes
writes synthetic data (made up on the spot, to keep the FS consistent).

For example, it may write a directory block with only some of the created
entries in it, because the others point to inodes that have not yet been
written to disk. In this case it will rewrite the block when the other
entries are deemed 'safe' (their dependencies have been satisfied).

It may also write a block of inodes with some of the inodes invalidated,
even though the user is using those files, because the data blocks that
the files refer to have not yet been written to disk. When (some of) the
data blocks have been written, and it is time again to look at that block
of inodes, the inodes that were invalidated before might be written out
valid, with the data pointers updated to reflect those blocks (and ONLY
those blocks) that have been written to disk. If you extend a file, and
the last few blocks of your extension have not been written yet, you may
see the inode being written back with a smaller size than you would see
if you did a 'stat' on the file.

The way it works, however, is that if you do not do an fsync(), the
rewrites will not be needed, because the data blocks are scheduled to be
written in N seconds (usually 5), the inode blocks in N+M (usually 10) and
the directory entries in N+M+X (usually 15).
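
The "synthetic data" idea can be pictured with another small Python sketch.
Everything in it is an assumption made for illustration: the function names,
the dict and list representations, and in particular the simplification
that an inode keeps only the leading run of blocks already on disk and that
its size is a whole number of blocks. The real code is more careful.

```python
def safe_directory_image(entries, inodes_on_disk):
    """Directory contents that may be written now.

    entries maps a name to an inode number. Entries whose inode has not
    reached the disk yet are left out of this write (rolled back); the
    in-memory directory still has them, and the block is rewritten later.
    """
    return {name: ino for name, ino in entries.items()
            if ino in inodes_on_disk}


def safe_inode_image(block_ptrs, blocks_on_disk, block_size):
    """Inode contents that may be written now (simplified assumption).

    Keep only the leading pointers whose data blocks are on disk, and
    report a size that covers only those blocks.
    """
    safe = []
    for blk in block_ptrs:
        if blk not in blocks_on_disk:
            break
        safe.append(blk)
    return {"pointers": safe, "size": len(safe) * block_size}


# Two new files, but only inode 10 has been written so far.
print(safe_directory_image({"a": 10, "b": 11}, {10}))
# A file extended to three blocks, of which the last is not on disk yet.
print(safe_inode_image([100, 101, 102], {100, 101}, 8192))
```

In both cases what goes to disk is deliberately less than what is in
memory, so no on-disk pointer ever refers to something unwritten.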

For deletes the dependencies are approximately (but not exactly) reversed.

If you do a delete before a write has been fully committed to media, the
remaining dependencies are cancelled and never happen; writes that have
already happened are reversed in memory, and appropriate requests and
dependencies are queued for on-media reversal.
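
As a sketch of that last case, again in Python and again with made-up
names (plan_delete and the (step, written) pairs are not from the real
code): steps that have not reached the disk are simply dropped, and steps
that have are queued for undoing. Undoing them in exactly reverse order is
an assumption here, since the reversal is only approximate.

```python
def plan_delete(steps):
    """Split a file's creation steps when it is deleted early.

    steps is a list of (name, written) pairs in the order the writes were
    due. Returns (cancelled, to_reverse): the steps that never need to
    happen, and the already written steps to undo on the media.
    """
    cancelled = [name for name, written in steps if not written]
    to_reverse = [name for name, written in reversed(steps) if written]
    return cancelled, to_reverse


steps = [
    ("data block", True),        # already on disk
    ("inode block", True),       # already on disk
    ("directory entry", False),  # still waiting in memory
]
print(plan_delete(steps))
```

Here the directory entry write is cancelled outright, while the inode and
data block writes get matching reversal requests.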

On Fri, 2 Oct 1998, Justin T. Gibbs wrote:

> >Of course..  All I'm saying is that it is a prerequisite that we should
> >document somewhere, that "Softupdates assumes that completion signals the
> >arrival of the data into STABLE storage, e.g. magnetic recording."
> 
> If we tagged meta-data buffers appropriately, we could specifically inhibit
> write-caching those transactions.  I would expect this to give you the
> semantics soft-updates expects while still allowing the disk to write cache
> data blocks.  So, if you have a power failure, the file-system meta-data
> would be consistent, but some files might have stale data blocks.

This is not acceptable to softupdates (as defined by McKusick, Ganger and
Patt).

> 
> --
> Justin
> 
> 
> 




