Date:      Fri, 12 Sep 1997 22:09:35 +0000 (GMT)
From:      Terry Lambert <tlambert@primenet.com>
To:        current@freebsd.org
Subject:   NFS client locking
Message-ID:  <199709122209.PAA29030@usr08.primenet.com>

I would like to discuss NFS client locking implementation details.

This document constitutes a design rationale.



NFS client locking requires the proxying of locks across the network.

I believe NFS client locking needs to assert the lock locally, first,
then if that is successful, assert it against the server, and if that
is successful, return success.

If the lock conflicts with a local process that also has the file
open, the lock will be denied without generating wire traffic.

If the lock conflicts with a lock held by a process on another
machine, then the remote lock request will be denied.

If the lock is denied by the remote machine, the local lock must
be deasserted.

Deasserting the local lock is fraught with peril.

If the local lock has been coalesced, it may have upgraded or
downgraded the lock during the coalesce.

If the local lock is coalesced, it may have overlapped with other
locks.

If an overlapping or upgraded or downgraded lock region is removed,
then the previous lock, which was legitimately granted, will be
destroyed.
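
The hazard can be illustrated with a small sketch.  This is not the
kernel code; it assumes a lock is a (start, end, owner) tuple over a
half-open byte range, and coalesce() and remove_range() are
hypothetical helpers written only for this example.

```python
def coalesce(locks):
    """Merge overlapping or adjacent ranges held by the same owner."""
    merged = []
    for start, end, owner in sorted(locks, key=lambda l: (l[2], l[0])):
        if merged and merged[-1][2] == owner and start <= merged[-1][1]:
            prev = merged[-1]
            merged[-1] = (prev[0], max(prev[1], end), owner)
        else:
            merged.append((start, end, owner))
    return merged


def remove_range(locks, start, end, owner):
    """Naively deassert [start, end) for owner, splitting records as needed."""
    out = []
    for s, e, o in locks:
        if o != owner or e <= start or s >= end:
            out.append((s, e, o))      # untouched by the removal
            continue
        if s < start:
            out.append((s, start, o))  # keep the part before the hole
        if e > end:
            out.append((end, e, o))    # keep the part after the hole
    return out


granted = [(0, 100, "p1")]     # legitimately granted earlier
tentative = (50, 150, "p1")    # new request, server has not answered yet

# Eager coalescing folds the tentative lock into the granted one.
table = coalesce(granted + [tentative])

# The server denies the request, so the tentative range is removed.
after_rollback = remove_range(table, 50, 150, "p1")
```

After the rollback only bytes 0 through 49 remain locked; the range
50 through 99, which belonged to the earlier grant, is gone as well.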

Therefore, to assert a lock, the client machine must:

	IF local_assert_uncoalesced_lock() == FAIL
		return FAIL
	ELSE
		IF remote_assert_lock() == FAIL
			local_deassert_uncoalesced_lock()
			return FAIL
		ENDIF
		local_coalesce_lock()
		return SUCCESS
	ENDIF

To deassert a lock, it must:

	local_decoalesce_lock()
	IF remote_deassert_lock() == FAIL
		local_coalesce_lock()
		return FAIL
	ENDIF
	local_deassert_uncoalesced_lock()
	return SUCCESS

In other words, delayed coalescing and delayed deletion.
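
A minimal sketch of the two sequences above, under several
assumptions: a single lock owner (so local conflict checking is
omitted), ranges as half-open (start, end) tuples, and a "remote"
callable standing in for the proxied server request.  LockTable,
merge() and subtract() are names invented for the example, and
restoring from a saved snapshot is a simplification of
re-coalescing.

```python
def merge(ranges):
    """Coalesce overlapping or adjacent (start, end) ranges."""
    out = []
    for s, e in sorted(ranges):
        if out and s <= out[-1][1]:
            out[-1] = (out[-1][0], max(out[-1][1], e))
        else:
            out.append((s, e))
    return out


def subtract(ranges, rng):
    """Cut rng out of every range that overlaps it."""
    lo, hi = rng
    out = []
    for s, e in ranges:
        if e <= lo or s >= hi:
            out.append((s, e))
            continue
        if s < lo:
            out.append((s, lo))
        if e > hi:
            out.append((hi, e))
    return out


class LockTable:
    """Hypothetical lock state for one file and one owner."""

    def __init__(self, remote):
        self.remote = remote   # callable(op, rng) -> bool, models the server
        self.coalesced = []    # confirmed, merged ranges
        self.pending = []      # asserted locally, not yet confirmed

    def assert_lock(self, rng):
        self.pending.append(rng)           # local_assert_uncoalesced_lock()
        if not self.remote("assert", rng):
            self.pending.remove(rng)       # local_deassert_uncoalesced_lock()
            return False
        self.pending.remove(rng)
        self.coalesced = merge(self.coalesced + [rng])  # local_coalesce_lock()
        return True

    def deassert_lock(self, rng):
        saved = list(self.coalesced)
        self.coalesced = subtract(self.coalesced, rng)  # local_decoalesce_lock()
        self.pending.append(rng)
        if not self.remote("deassert", rng):
            self.pending.remove(rng)
            self.coalesced = saved         # local_coalesce_lock(), via snapshot
            return False
        self.pending.remove(rng)           # local_deassert_uncoalesced_lock()
        return True
```

Because the tentative range only ever lives in the pending list, a
server denial never touches the ranges that were already coalesced.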

In order to implement this, the common locking code must move out
of the per FS VOP_ADVLOCK() and into the system calls/VFS framework
layer.

In order to move the common locking code up, the access to per FS
data structures must be removed.  Specifically, the lock list must
be removed from the inode and placed into the vnode, which is a
filesystem independent opaque object.

In both cases, the common locking code must respect uncoalesced
locks as if they had been coalesced.  It must examine both.

In the case of a lock demotion or promotion, both the uncoalesced
and coalesced locks must be respected, and the higher restriction
enforced; this is equivalent to the conflicting requestor coming
in either before a lock demotion or after a lock promotion, and it
must be capable of handling either case anyway.
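
As a rough illustration of "examine both", the sketch below checks
a request against the coalesced and uncoalesced lists together, so
the stricter of the two records wins.  The tuple layout
(start, end, mode, owner), the 'R'/'W' modes, and the function names
are assumptions made for the example.

```python
def overlaps(a, b):
    """True if the half-open byte ranges of a and b intersect."""
    return a[0] < b[1] and b[0] < a[1]


def conflicts(req, held):
    """Locks are (start, end, mode, owner); mode is 'R' or 'W'."""
    if req[3] == held[3] or not overlaps(req, held):
        return False
    # Two readers can share a range; any writer conflicts.
    return req[2] == "W" or held[2] == "W"


def local_check(req, coalesced, pending):
    """Grant only if req conflicts with no lock on either list."""
    return not any(conflicts(req, h) for h in coalesced + pending)


coalesced = [(0, 100, "R", "p1")]  # confirmed shared lock
pending = [(0, 100, "W", "p1")]    # promotion awaiting the server
reader = (10, 20, "R", "p2")

during_promotion = local_check(reader, coalesced, pending)
without_promotion = local_check(reader, coalesced, [])
```

While the promotion is pending, the second process's read lock is
refused, exactly as if it had arrived after the promotion completed.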

In order to handle the initial "ELSE" case in the pseudocode,
remote_assert_lock() (implemented by the NFS specific VOP_ADVLOCK()
call) must perform the proxy on behalf of the local system.

For this to work properly, the VOP_ADVLOCK() function must provide
a veto-based interface.  That is, it must support the idea of
returning "request allowed" or "request denied" to the common locking
code.

For NFS, this result will be the server acceptance or refusal of
the proxied lock request.

For all local FS's, this means VOP_ADVLOCK() will simply return "true",
with the exception of multiplexing FS layers.
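
The shape of such a veto interface might look like the sketch
below.  It is deliberately simplified: the delayed coalescing step
is left out, any overlap with an existing local lock counts as a
conflict, and common_setlk(), localfs_advlock() and
make_nfs_advlock() are hypothetical names, not existing kernel
functions.

```python
def overlaps(a, b):
    """True if half-open byte ranges a and b intersect."""
    return a[0] < b[1] and b[0] < a[1]


def localfs_advlock(op, rng):
    """Local filesystem hook: nothing to proxy, so it never vetoes."""
    return True


def make_nfs_advlock(server_ranges):
    """Build an NFS-style hook; server_ranges models locks held elsewhere."""
    def nfs_advlock(op, rng):
        if op == "assert":
            return not any(overlaps(rng, held) for held in server_ranges)
        return True
    return nfs_advlock


def common_setlk(vnode_locks, advlock, rng):
    """Common code: local check first, then give the filesystem its veto."""
    if any(overlaps(rng, held) for held in vnode_locks):
        return False          # denied locally, no wire traffic
    if not advlock("assert", rng):
        return False          # vetoed by the filesystem
    vnode_locks.append(rng)
    return True


local_locks, nfs_locks = [], []
nfs_hook = make_nfs_advlock([(0, 10)])
results = [
    common_setlk(local_locks, localfs_advlock, (0, 10)),
    common_setlk(nfs_locks, nfs_hook, (5, 15)),
    common_setlk(nfs_locks, nfs_hook, (20, 30)),
]
```

The same common routine drives both filesystems; only the hook
differs, and the lock list itself never leaves the common code.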

For multiplexing FS layers (which combine more than one FS), the
multiplexing layer must reconcile failures.

Consider the case of a union mount of two NFS filesystems.

When the lock request is made to the union FS, the union FS must
make a request for each underlying FS in the union.  This is,
effectively:

	for( fsp = fs1; fsp != NULL; fsp = fsp->next) {
		IF fsp->VOP_ADVLOCK(assert) == FAIL
			for( fsp2 = fs1; fsp2 != fsp; fsp2 = fsp2->next) {
				fsp2->VOP_ADVLOCK(deassert)
			}
			return FAIL
		ENDIF
	}
	return SUCCESS

(Yes, union FS's are not a linked list; this is pseudo-code).
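
The same loop as a runnable sketch, assuming each layer is modelled
as a callable taking ("assert" or "deassert", range) and returning
True or False.  make_layer() and the layer names are invented for
the example and only record what was called.

```python
def union_advlock(layers, rng):
    """Assert rng on every layer; undo the earlier grants on any refusal."""
    granted = []
    for layer in layers:
        if layer("assert", rng):
            granted.append(layer)
            continue
        # One layer vetoed: release what the earlier layers granted.
        for done in reversed(granted):
            done("deassert", rng)
        return False
    return True


def make_layer(name, allow, log):
    """Hypothetical layer that logs calls and grants asserts if allow is set."""
    def advlock(op, rng):
        log.append((name, op))
        return allow if op == "assert" else True
    return advlock


log = []
layers = [make_layer("nfs1", True, log), make_layer("nfs2", False, log)]
result = union_advlock(layers, (0, 100))
```

Here the second layer refuses, so the first layer sees an assert
followed by a deassert and the union returns failure.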


It seems that these changes are required for NFS client locking to
operate.


Further benefits:

While the NFS client locking code is under development, locks held
by local clients will be enforced against each other, even if they
are not yet enforced against clients on other machines.

Moving the locking code out of the per FS code means only one locking
implementation needs to be debugged, instead of one per FS type.

Moving the locking code to a common area means less total code,
which means fewer things to potentially go wrong.

Moving to a veto-based interface simplifies the FS code, and the code
necessary to implement an entirely new FS.

The change is in keeping with the spirit of the Heidemann thesis on
which the existing code is modelled.

The code is already implemented, and is in the FreeBSD core team's
(Doug Rabson's) possession.


Comments?


					Regards,
					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.


