FreeBSD Mail Archives

Date:      Wed, 15 Jul 1998 21:09:54 +0000 (GMT)
From:      Terry Lambert <tlambert@primenet.com>
To:        Matthew.Alton@anheuser-busch.com (Alton, Matthew)
Cc:        FreeBSD-fs@FreeBSD.ORG
Subject:   Re: LFS & soft updates
Message-ID:  <199807152109.OAA19937@usr08.primenet.com>
In-Reply-To: <31B3F0BF1C40D11192A700805FD48BF90177660F@STLABCEXG011> from "Alton, Matthew" at Jul 15, 98 05:30:16 pm

> Just for fun - are lfs and Kirk McKusick's Ganger-Patt soft update
> scheme necessarily mutually exclusive?  I don't know anything
> about soft updates yet.  Just wondered if I should look at the 
> GP papers before designing LFS v.2.

Note: Doing this work would be a good Master's Thesis project;
solving the general problem such that you could stack FS layers
arbitrarily using the same underlying graph code would probably be
worth a PhD Dissertation.


LFS and soft updates are not mutually exclusive, but in order to
optimize the representational geometry (how the structures are laid
out, what they imply, precalculation instead of postcalculation of
some dependency related issues, etc., etc.), the FFS soft dependency
code as implemented is very depndent on the FFS architecture
itself.

It is possible to take a non-traditional view of an FS, and look
at it as a series of events and actors.  Each event/actor relationship
is what the soft updates code represents, and the ordering of the
actors operations as a result of the events is what a dependency
models.

Effectively, the code embodies the result of a Warshall's Algorithm
run on the dependency graph generated from looking at the FS as
events: transitive closure over the dependency graph.


The work Julian, Kirk, and anyone else who has submitted patches
have been engaged in is removing redundant OpenBSD and/or BSDI
dependency nodes, and adding FreeBSD/unified VM dependency nodes.
In addition, they have resolved some problems that are still issues
for OpenBSD and BSDI (hopefully, both will benefit from the efforts).


This work would have been a hell of a lot easier if the FFS and
its interaction with the rest of the kernel and I/O subsystem
had been documented in terms of event/actor relationships.  If
this had been done, one would need only to "diff" the graphs, and
they would be, for all practical purposes, asking "will the code
that needs to be changed please stand up".


I personally would have preferred an implementation that registered
a graph and computed the Warshal's at the time an FS was instanced
(an O(3) algorithm, which is extermely cheap in graph theory terms),
and pre-calculated the Hamiltonian cycles across the graph to allow
people to stick new edges in later with a minimal recalculation cost.


This would have allowed the soft updates mechanism to be extensible,
at the cost of a full linear node traversal -- O(1) -- for each new
nodal relationship (ie: stacking FS layers, export of a transactioning
system to user space, etc.).  This would have cost memory, as Kirk
points out, but I don't think it would have cost as much memory as
he seems to believe it would have cost, since you could still encode
the dependency data as Kirk has done; the penalty would be serializing
dependencies at the VFS/VFS boundry.  The memory costs would be for
inter-VFSOP dependency resoloution, seperate from the integration of
this code into the FFS code within the ops themselves.


If you want soft updates in the LFS code, which might make *some*
sense, in terms of reducing the work for the cleaner, and adding
implicit write gathering, and similar benefits to LFS, then you
should do this:

1)	Model the current FFS code in terms of event/actor
	relationships.  Now that the soft updates code seems
	to be working fairly completely, this should be rather
	easy; just go through the code, and everywhere there
	is an incusrion of the soft udates code, that is an
	event/actor dependency relationship.

2)	Now do the same for LFS.  For the UFS code, which both
	LFS and FFS depend upon, this should be rather trivial,
	since it is already done.  For the LFS code, you will
	have to specifically identify the points in the model
	where each event/actor relationship occurs.

3)	Implement the dependency resoloution code that will be
	executed off of the slotted soft clock that is the syncd
	to resolve each of these relationships; both queuing of
	dependencies when the events occur, and conflicts on
	queued data changes which may occur between the new
	actors and the existing ones.

This is not a trivial undertaking; mostly it is tedious detail
work requiring a lot of academic rigor.  A failure in rigor will
result in a system that fails to operate as expected.  As with
the FFS soft updates code, a single missed dependency will result
in a system that is, in certain circumstances, no better than async
mounts (the degradation case is a system that is as unreliable as
an async mount, and which fails to make the integrity guarantees
that were the reason you expended the effort in the first place).

I don't think I'm exagerating either the difficulty or the value of
the resulting work.  The original Ganger/Patt work was at this level
of difficulty, and Kirk McKusick is definitely no slouch as a
Computer Scientist, and he did not implement the code over night.


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-fs" in the body of the message

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?199807152109.OAA19937>

Header And Logo

Peripheral Links

Site Navigation

Header And Logo

Peripheral Links

Search

Site Navigation