Date:      Fri, 18 Jun 1999 17:42:06 +0000 (GMT)
From:      Terry Lambert <tlambert@primenet.com>
To:        eivind@FreeBSD.ORG (Eivind Eklund)
Cc:        jepace@pobox.com, freebsd-fs@FreeBSD.ORG
Subject:   Re: nullfs?
Message-ID:  <199906181742.KAA13187@usr01.primenet.com>
In-Reply-To: <19990618190138.F62123@bitbox.follo.net> from "Eivind Eklund" at Jun 18, 99 07:01:38 pm

> > The man page for mount_null(8) warns me:
> > 
> > THIS FILESYSTEM TYPE IS NOT YET FULLY SUPPORTED (READ: IT DOESN'T
> > WORK) AND USING IT MAY, IN FACT, DESTROY DATA ON YOUR SYSTEM.  USE AT
> > YOUR OWN RISK.  BEWARE OF DOG.  SLIPPERY WHEN WET.
> > 
> > What are the known problems with it?
> 
> Incorrect alias handling (read: your data may or may not reach disk)

These are vnode backing object issues.  The "vn" device driver
addresses these issues rather trivially (and successfully) by
implementing copy on write coherency semantics.

The problem boils down to wanting the vnode to be an alias for
a vm_object_t, the existence of non-null default vfsops, and
the lack of a VOP_TERMINAL operation for getting the terminal
backing object that has the vm_object_t that's actually physically
backing the data.
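
As a rough illustration of what a "get the terminal backing object"
operation could look like, here is a minimal userland sketch in C.
Everything in it is hypothetical: layer_vnode, layer_object and
layer_terminal_object() are made-up stand-ins, not the kernel's vnode,
vm_object_t or any existing VOP.  It assumes that a pass-through layer
(like nullfs) owns no object and defers downward, while a layer that
transforms data keeps an object of its own and so terminates the walk.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a vm_object_t; only its identity matters. */
struct layer_object {
    int id;
};

/* Hypothetical stacked vnode: one node per layer, linked top to bottom. */
struct layer_vnode {
    struct layer_vnode *lower;    /* next layer down, NULL at the bottom */
    struct layer_object *object;  /* non-NULL only if this layer owns one */
};

/*
 * Walk down the stack and return the object of the first layer that
 * owns one.  A pass-through layer has object == NULL and is skipped;
 * a layer with its own object ends the walk.  Returns NULL if no
 * layer in the stack owns an object.
 */
static struct layer_object *
layer_terminal_object(struct layer_vnode *vp)
{
    assert(vp != NULL);
    while (vp->object == NULL && vp->lower != NULL)
        vp = vp->lower;
    return vp->object;
}
```

With a pass-through layer stacked over a bottom filesystem, the walk
returns the bottom object; put a layer with its own object in between
and the walk stops there instead, which is the case an explicit alias
cannot express.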

Several suggestions have been put forth by people wanting to
optimize the MFS case at the expense of everything else (e.g.
compression and cryptographic layers which must implement
explicit cache coherency and therefore cannot use an alias)
by creating explicit aliases using a separate alias object
to reference the actual vm_object_t.  There are a number of
good reasons (including the cited examples, above) why this
will break some existing modules written by John Heidemann's
students and others, and will preclude a number of future
research directions that some of us believe are highly
desirable.


> and extreme locking problems, mostly.

He means both VOP_LOCK and VOP_ADVLOCK.  The VOP_LOCK issue has
to do with the fact that VFS's do not own their own vnode free
pools (impossible except for TFS style hacks, due to the lack of
a per VFS VFS_VRELE mechanism).

The VOP_ADVLOCK issue has to do with the fact that advisory locks
are coalesced immediately after assertion, such that if you had
a stack that exported an object that looked like:

	,-----------------------------.
	|           my file           |
	|---------------.-------------|
	|    vfs #1     |   vfs #2    |
	`---------------'-------------'

An advisory lock that was asserted to span the boundaries would
have to be asserted in both underlying objects.  If the lock failed
in the second underlying object, then the lock in the first
underlying object would have to be deasserted.  Because of the
coalesce in the first object, deassertion could incorrectly
remove locks (the coalesce itself could incorrectly promote or
demote a partial region from the start to the spanning boundary).
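
The rollback hazard can be shown with a small userland model in C.
This is only a sketch under stated assumptions, not the kernel's lock
code: lock_list, ll_lock(), ll_unlock(), ll_is_locked() and span_lock()
are hypothetical names, there is a single lock owner, ranges are
half-open byte ranges, and the failure in the second object is simply
simulated by a flag.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_RANGES 16

/* Half-open byte range [start, end) held by the single modeled owner. */
struct range {
    long start, end;
};

/* Hypothetical per-object lock list, always kept coalesced. */
struct lock_list {
    struct range r[MAX_RANGES];
    size_t n;
};

/* Assert a lock, absorbing every range it overlaps or touches. */
static void
ll_lock(struct lock_list *ll, long start, long end)
{
    struct range merged = { start, end };
    size_t i, out = 0;

    for (i = 0; i < ll->n; i++) {
        struct range cur = ll->r[i];
        if (cur.end < merged.start || cur.start > merged.end) {
            ll->r[out++] = cur;            /* disjoint: keep as is */
        } else {                           /* coalesce into one range */
            if (cur.start < merged.start)
                merged.start = cur.start;
            if (cur.end > merged.end)
                merged.end = cur.end;
        }
    }
    assert(out < MAX_RANGES);
    ll->r[out++] = merged;
    ll->n = out;
}

/* Deassert [start, end), trimming or splitting whatever it overlaps. */
static void
ll_unlock(struct lock_list *ll, long start, long end)
{
    struct lock_list res = { .n = 0 };
    size_t i;

    for (i = 0; i < ll->n; i++) {
        struct range cur = ll->r[i];
        assert(res.n + 2 <= MAX_RANGES);
        if (cur.end <= start || cur.start >= end) {
            res.r[res.n++] = cur;          /* untouched by the unlock */
            continue;
        }
        if (cur.start < start)             /* keep the part before */
            res.r[res.n++] = (struct range){ cur.start, start };
        if (cur.end > end)                 /* keep the part after */
            res.r[res.n++] = (struct range){ end, cur.end };
    }
    *ll = res;
}

/* Return 1 if byte offset off is covered by some lock, else 0. */
static int
ll_is_locked(const struct lock_list *ll, long off)
{
    size_t i;

    for (i = 0; i < ll->n; i++)
        if (off >= ll->r[i].start && off < ll->r[i].end)
            return 1;
    return 0;
}

/*
 * Lock [start, end) of the exported file, where the first object
 * backs [0, boundary) and the second backs the rest.  If the second
 * object refuses (simulated by second_fails), roll back the first.
 * Returns 0 on success, -1 on failure.
 */
static int
span_lock(struct lock_list *first, struct lock_list *second,
    long start, long end, long boundary, int second_fails)
{
    ll_lock(first, start, boundary);
    if (second_fails) {
        ll_unlock(first, start, boundary); /* the lossy deassertion */
        return -1;
    }
    ll_lock(second, 0, end - boundary);
    return 0;
}
```

Start with a lock on [0, 50) in the first object, then attempt a
spanning lock on [40, 120) with the boundary at 100 and the second
object failing.  The coalesce turns the first object's list into
[0, 100); the rollback removes [40, 100) and leaves [0, 40), so bytes
40 through 49 of the original lock are gone even though the caller
never asked to release them.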


A secondary problem with the VOP_ADVLOCK that is more germane to
the NULLFS implementation is that locks are hung off the inode
object, instead of the terminal vnode object.  This means that
a lock you assert will not necessarily prevent access if the VFS
you are mounting under the NULLFS is exposed elsewhere in the
filesystem namespace (e.g. I mount "/usr", and then I NULLFS it
to "/home/bob/usr").


A tertiary problem with the NULLFS in this case is reciprocity
in the directory entry lookup traversal; specifically, since
the implementation is not via coroutines, instead of a ladder-lock
for absolute paths, there is a stack lock, which can cause a lock
to root.  In this case, under some circumstances you would see
"locking against myself" panics from mounting "/usr" and then
NULLFS'ing it to "/usr/foo/fee".
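
A toy model in C of why holding locks back to the root matters here.
This is a sketch under loud assumptions, not the real lookup code:
toy_vnode, toy_lock(), toy_unlock() and toy_walk() are hypothetical,
the lock is a plain non-recursive owner field, the covered directory
under the mount point is left out, and the null layer is assumed to
pass its lock straight down to the vnode it aliases.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy vnode with a non-recursive exclusive lock. */
struct toy_vnode {
    const char *name;          /* for readability only */
    int owner;                 /* 0 = unlocked, else owner id */
    struct toy_vnode *lower;   /* null-layer vnodes: the aliased vnode */
};

enum { LOCK_OK = 0, LOCK_SELF = -1, LOCK_BUSY = -2 };

/* Lock vp for self; a null-layer vnode also locks what it aliases. */
static int
toy_lock(struct toy_vnode *vp, int self)
{
    int error;

    if (vp->owner == self)
        return LOCK_SELF;      /* the "locking against myself" case */
    if (vp->owner != 0)
        return LOCK_BUSY;
    vp->owner = self;
    if (vp->lower != NULL) {
        error = toy_lock(vp->lower, self);
        if (error != LOCK_OK) {
            vp->owner = 0;     /* undo the upper half on failure */
            return error;
        }
    }
    return LOCK_OK;
}

/* Unlock vp and, for a null-layer vnode, the vnode it aliases. */
static void
toy_unlock(struct toy_vnode *vp)
{
    vp->owner = 0;
    if (vp->lower != NULL)
        toy_unlock(vp->lower);
}

/*
 * Lock each vnode on the path in order.  With ladder != 0 the parent
 * is released once the child is locked (hand over hand); otherwise
 * every ancestor stays locked (stack).  All locks are dropped before
 * returning; the result is LOCK_OK or the first error.
 */
static int
toy_walk(struct toy_vnode **path, size_t n, int self, int ladder)
{
    size_t i, j, first_held = 0;
    int error = LOCK_OK;

    assert(path != NULL);
    for (i = 0; i < n; i++) {
        error = toy_lock(path[i], self);
        if (error != LOCK_OK)
            break;
        if (ladder && i > 0) {
            toy_unlock(path[i - 1]);
            first_held = i;
        }
    }
    for (j = first_held; j < i; j++)   /* release what is still held */
        toy_unlock(path[j]);
    return error;
}
```

Walking "/", "/usr", "/usr/foo" and then the null-layer root at
"/usr/foo/fee" (whose lower vnode is "/usr") fails with LOCK_SELF in
stack mode, because "/usr" is still held when the null layer tries to
lock it again.  In ladder mode "/usr" has already been released by
then, so the same walk succeeds.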


And don't even get me started on propagation of POSIX namespace
escapes like "//machine-name/" to subsequent path components on
a per component basis...


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.
