Date: Fri, 21 Mar 2008 10:27:08 +0000
From: Doug Rabson <dfr@rabson.org>
To: current@freebsd.org
Subject: CFR: New NFS Lock Manager
Message-ID: <E7E41AFB-FB94-47C1-9169-71992D70C320@rabson.org>

As I mentioned previously, I have been working on a brand new NFS Lock Manager which runs in kernel mode and uses the normal local locking infrastructure for its state. I'm currently trying to tie up the last few loose ends before committing this work to current.

You can find a snapshot of this code at http://people.freebsd.org/~dfr/lockd-RC1-20032008.diff . To try it out, take a recent current (I last merged with current on 20th March) and apply the patch. Build a kernel with the NFSLOCKD option and add '-k' to 'rpc_lockd_flags' in rc.conf. You will need to build and install at least a new libc and rpc.lockd.

At this point, it would be useful to get some extra eyes to look over my changes. In particular the following:

1. Choice of syscall number - I found one spare next to the NFS syscall and took that. The new syscall is listed in the FBSD_1.1 namespace, possibly it should be somewhere else.

2. ABI compatibility - I extended the flock structure by one member (adding l_sysid). I have added new operations to fcntl to support the new extended structure, leaving the old operations in place to work on the old structure. The kernel translates old to new and vice versa. No attempt is made to allow a new userland to work with an old kernel.

3. The local lock manager has had a complete rewrite to support required features. The new local lock manager supports a more flexible model of lock ownership (which can support remote lock owners). I have replaced the inadequate deadlock detection code with a new (and fast) graph based system. Using the deadlock graph, I was able to avoid the 'thundering herd' issues the old lock code had when many processes were contending for the same locked region. Given the extent of the changes, wider testing and review would be extremely welcome.

4. The NFS lock manager itself is brand new code and as such ought to be reviewed.

I have also ported the userland sunrpc code to run in the kernel environment which may prove useful in future.

Highlights include:

* Thread-safe kernel RPC client - many threads can use the same RPC client handle safely with replies being de-multiplexed at the socket upcall (typically driven directly by the NIC interrupt) and handed off to whichever thread matches the reply. For UDP sockets, many RPC clients can share the same socket. This allows the use of a single privileged UDP port number to talk to an arbitrary number of remote hosts.

* Single-threaded kernel RPC server. Adding support for multi-threaded server would be relatively straightforward and would follow approximately the Solaris KPI. A single thread should be sufficient for the NLM since it should rarely block in normal operation.

* Kernel mode NLM server supporting cancel requests and granted callbacks. I've tested the NLM server reasonably extensively - it passes both my own tests and the NFS Connectathon locking tests running on Solaris, Mac OS X and Ubuntu Linux.

* Userland NLM client supported. While the NLM server doesn't have support for the local NFS client's locking needs, it does have to field async replies and granted callbacks from remote NLMs that the local client has contacted. We relay these replies to the userland rpc.lockd over a local domain RPC socket.

* IPv6 should be supported but has not been tested since I've been unable to get IPv6 to work properly with the Parallels virtual machines that I've been using for development.

* Robust deadlock detection for the local lock manager. In particular it will detect deadlocks caused by a lock request that covers more than one blocking request. As required by the NLM protocol, all deadlock detection happens synchronously - a user is guaranteed that if a lock request isn't rejected immediately, the lock will eventually be granted. The old system allowed for a 'deferred deadlock' condition where a blocked lock request could wake up and find that some other deadlock-causing lock owner had beaten them to the lock.

* Since both local and remote locks are managed by the same kernel locking code, local and remote processes can safely use file locks for mutual exclusion. Local processes have no fairness advantage compared to remote processes when contending to lock a region that has just been unlocked - the local lock manager enforces a strict first-come first-served model for both local and remote lockers.
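
The old-to-new flock translation described in point 2 above might look roughly like the following. This is a user-space sketch, not code from the patch: the names old_flock, new_flock, old_to_new and new_to_old are hypothetical, the field types are simplified, and treating an l_sysid of 0 as "local" is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the lock description before the patch. */
struct old_flock {
    int64_t l_start;   /* starting offset */
    int64_t l_len;     /* length; 0 means "to end of file" */
    int32_t l_pid;     /* lock owner */
    int16_t l_type;    /* lock type: read, write or unlock */
    int16_t l_whence;  /* how l_start is interpreted */
};

/* Hypothetical stand-in for the extended structure with l_sysid. */
struct new_flock {
    int64_t l_start;
    int64_t l_len;
    int32_t l_pid;
    int16_t l_type;
    int16_t l_whence;
    int32_t l_sysid;   /* remote system id; 0 is ASSUMED to mean local */
};

/* Widen an old request: copy the shared fields, mark the owner local. */
static void old_to_new(const struct old_flock *ofl, struct new_flock *fl)
{
    assert(ofl != NULL && fl != NULL);
    fl->l_start = ofl->l_start;
    fl->l_len = ofl->l_len;
    fl->l_pid = ofl->l_pid;
    fl->l_type = ofl->l_type;
    fl->l_whence = ofl->l_whence;
    fl->l_sysid = 0;
}

/* Narrow a result for an old caller: l_sysid has nowhere to go. */
static void new_to_old(const struct new_flock *fl, struct old_flock *ofl)
{
    assert(fl != NULL && ofl != NULL);
    ofl->l_start = fl->l_start;
    ofl->l_len = fl->l_len;
    ofl->l_pid = fl->l_pid;
    ofl->l_type = fl->l_type;
    ofl->l_whence = fl->l_whence;
}
```

In this sketch an old binary never sees l_sysid, so a lock held by a remote owner would be reported to it with only the pid filled in.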
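
The synchronous, graph based deadlock check mentioned above can be illustrated with a small wait-for graph. This is only a sketch of the general technique, not the algorithm from the patch: the fixed-size adjacency matrix, the integer owner ids and the names wait_graph, graph_block and graph_unblock are all hypothetical. It does show the case called out in the highlights, where one request is blocked by more than one owner.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_OWNERS 16

/* waits_for[a][b] is true when owner a is blocked on a lock held by b. */
struct wait_graph {
    bool waits_for[MAX_OWNERS][MAX_OWNERS];
};

static void graph_init(struct wait_graph *g)
{
    memset(g, 0, sizeof(*g));
}

/* Depth-first search: can 'from' reach 'target' along wait edges? */
static bool reaches(const struct wait_graph *g, int from, int target,
                    bool *visited)
{
    if (from == target)
        return true;
    visited[from] = true;
    for (int next = 0; next < MAX_OWNERS; next++) {
        if (g->waits_for[from][next] && !visited[next] &&
            reaches(g, next, target, visited))
            return true;
    }
    return false;
}

/*
 * Try to block 'waiter' behind every owner in blockers[]. If any blocker
 * already waits (directly or transitively) for 'waiter', adding the edges
 * would close a cycle, so the request is refused at once and the graph is
 * left unchanged. Otherwise the edges are recorded and true is returned.
 */
static bool graph_block(struct wait_graph *g, int waiter,
                        const int *blockers, int nblockers)
{
    assert(waiter >= 0 && waiter < MAX_OWNERS);
    for (int i = 0; i < nblockers; i++) {
        bool visited[MAX_OWNERS] = { false };
        assert(blockers[i] >= 0 && blockers[i] < MAX_OWNERS);
        if (reaches(g, blockers[i], waiter, visited))
            return false;
    }
    for (int i = 0; i < nblockers; i++)
        g->waits_for[waiter][blockers[i]] = true;
    return true;
}

/* Drop all of waiter's edges once its request is granted or cancelled. */
static void graph_unblock(struct wait_graph *g, int waiter)
{
    assert(waiter >= 0 && waiter < MAX_OWNERS);
    memset(g->waits_for[waiter], 0, sizeof(g->waits_for[waiter]));
}
```

Because the check runs before any edge is added, a request is either refused immediately or queued in a graph that is still acyclic, which is the property the synchronous guarantee relies on.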
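
The reply de-multiplexing described for the kernel RPC client can be pictured as a table of outstanding calls shared by all threads using one client handle. The message does not say how a reply is matched to its thread; matching on the transaction id (xid) that every ONC RPC call and reply carries is the assumption here. The names rpc_client, pending_call and the three functions are hypothetical, and the locking and sleep/wakeup a real kernel client would need are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PENDING 8

/* One outstanding call; in a kernel the calling thread would sleep on it. */
struct pending_call {
    uint32_t xid;       /* transaction id sent with the request */
    int in_use;         /* slot is allocated to a caller */
    int done;           /* a matching reply has arrived */
    const void *reply;  /* reply payload handed to the caller */
};

/* Shared client handle; a real one would protect this with a mutex. */
struct rpc_client {
    uint32_t next_xid;
    struct pending_call pending[MAX_PENDING];
};

/* Called by a thread before sending: reserve a slot and pick an xid. */
static struct pending_call *client_register_call(struct rpc_client *cl)
{
    assert(cl != NULL);
    for (size_t i = 0; i < MAX_PENDING; i++) {
        struct pending_call *pc = &cl->pending[i];
        if (!pc->in_use) {
            pc->in_use = 1;
            pc->done = 0;
            pc->reply = NULL;
            pc->xid = cl->next_xid++;
            return pc;
        }
    }
    return NULL; /* table full */
}

/*
 * Called from the receive path (the "socket upcall" in the text): find the
 * call whose xid matches and hand it the reply. Unknown or duplicate
 * replies match nothing and are dropped by returning NULL.
 */
static struct pending_call *client_deliver_reply(struct rpc_client *cl,
                                                 uint32_t xid,
                                                 const void *reply)
{
    assert(cl != NULL);
    for (size_t i = 0; i < MAX_PENDING; i++) {
        struct pending_call *pc = &cl->pending[i];
        if (pc->in_use && !pc->done && pc->xid == xid) {
            pc->reply = reply;
            pc->done = 1; /* a kernel would wake the waiting thread here */
            return pc;
        }
    }
    return NULL;
}

/* Called by the thread once it has consumed the reply. */
static void client_finish_call(struct pending_call *pc)
{
    assert(pc != NULL);
    pc->in_use = 0;
}
```

Since the table is keyed only by xid, the same structure also fits the shared UDP socket case: replies from different hosts arrive on one socket and are still routed to the right caller.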