From owner-freebsd-current@FreeBSD.ORG  Sat Mar 22 11:51:42 2008
Message-Id: <E392F264-BCBC-4752-8E99-91ADA2228EBF@rabson.org>
From: Doug Rabson <dfr@rabson.org>
To: Doug Rabson <dfr@rabson.org>
In-Reply-To: <E7E41AFB-FB94-47C1-9169-71992D70C320@rabson.org>
Content-Type: multipart/signed; boundary=Apple-Mail-106-246233165; micalg=sha1;
	protocol="application/pkcs7-signature"
Mime-Version: 1.0 (Apple Message framework v919.2)
Date: Sat, 22 Mar 2008 11:51:39 +0000
References: <E7E41AFB-FB94-47C1-9169-71992D70C320@rabson.org>
Cc: current@freebsd.org
Subject: Re: CFR: New NFS Lock Manager
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
	<freebsd-current.freebsd.org>


--Apple-Mail-106-246233165
Content-Type: text/plain;
	charset=US-ASCII;
	format=flowed;
	delsp=yes
Content-Transfer-Encoding: 7bit

I've just uploaded a new patch at
<http://people.freebsd.org/~dfr/lockd-RC2-22032008.diff>.
This fixes a serious problem on kernels not compiled with the
LOCKF_DEBUG option (I misplaced a #endif). It also includes minor
fixes to support 64-bit architectures and RELENG_7. The patch does
not apply cleanly to RELENG_7, but it does work once you resolve the
patch rejects by hand.


On 21 Mar 2008, at 10:27, Doug Rabson wrote:

> As I mentioned previously, I have been working on a brand new NFS  
> Lock Manager which runs in kernel mode and uses the normal local  
> locking infrastructure for its state. I'm currently trying to tie up  
> the last few loose ends before committing this work to current. You  
> can find a snapshot of this code at
> <http://people.freebsd.org/~dfr/lockd-RC1-20032008.diff>.
>
> To try it out, take a recent current (I last merged with current on  
> 20th March) and apply the patch. Build a kernel with the NFSLOCKD  
> option and add '-k' to 'rpc_lockd_flags' in rc.conf. You will need  
> to build and install at least a new libc and rpc.lockd.
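
For anyone trying this out, the setup described above might look roughly
like the fragment below. Only the NFSLOCKD option and the '-k' flag come
from the instructions above. The rpc_lockd_enable and rpc_statd_enable
lines are assumptions about a typical rc.conf and may already be present
on your system. If rpc_lockd_flags already has a value, append '-k' to
it rather than replacing it.

```
# Kernel configuration file: build in the new kernel-mode lock manager.
options 	NFSLOCKD

# /etc/rc.conf: ask rpc.lockd to use the kernel implementation.
rpc_statd_enable="YES"	# assumption: statd normally runs alongside lockd
rpc_lockd_enable="YES"	# assumption: not stated in the instructions above
rpc_lockd_flags="-k"
```
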
>
> At this point, it would be useful to get some extra eyes to look  
> over my changes. In particular the following:
>
> 1. Choice of syscall number - I found one spare next to the NFS  
> syscall and took that. The new syscall is listed in the FBSD_1.1
> namespace; possibly it should be somewhere else.
>
> 2. ABI compatibility - I extended the flock structure by one member  
> (adding l_sysid). I have added new operations to fcntl to support  
> the new extended structure, leaving the old operations in place to  
> work on the old structure. The kernel translates old to new and vice  
> versa. No attempt is made to allow a new userland to work with an  
> old kernel.
>
> 3. The local lock manager has had a complete rewrite to support  
> required features. The new local lock manager supports a more  
> flexible model of lock ownership (which can support remote lock  
> owners). I have replaced the inadequate deadlock detection code with  
> a new (and fast) graph-based system. Using the deadlock graph, I was  
> able to avoid the 'thundering herd' issues the old lock code had  
> when many processes were contending for the same locked region.  
> Given the extent of the changes, wider testing and review would be  
> extremely welcome.
>
> 4. The NFS lock manager itself is brand new code and as such ought  
> to be reviewed. I have also ported the userland sunrpc code to run  
> in the kernel environment, which may prove useful in future.
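
To make point 2 concrete, here is a minimal sketch in C of translating
between an old-style lock description and one extended with l_sysid.
The struct and function names (old_flock_sketch, new_flock_sketch,
old_to_new, new_to_old), the field widths, and the use of sysid 0 to
mean "local" are all assumptions made for illustration. Only the
l_sysid member itself comes from the description above, and the patch
may well lay things out differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the pre-patch lock description (no l_sysid). */
struct old_flock_sketch {
	int64_t l_start;   /* starting offset of the locked region */
	int64_t l_len;     /* length of the region; 0 means "to end of file" */
	int32_t l_pid;     /* process that owns the lock */
	short   l_type;    /* lock type (read, write or unlock) */
	short   l_whence;  /* how l_start is interpreted */
};

/* Hypothetical extended form: the same fields plus l_sysid. */
struct new_flock_sketch {
	int64_t l_start;
	int64_t l_len;
	int32_t l_pid;
	short   l_type;
	short   l_whence;
	int     l_sysid;   /* which system owns the lock */
};

/* Assumption for this sketch: sysid 0 identifies the local system. */
#define SKETCH_LOCAL_SYSID 0

/* Widen an old-style request; old callers are always local owners. */
static void
old_to_new(const struct old_flock_sketch *ofl, struct new_flock_sketch *fl)
{
	assert(ofl != NULL && fl != NULL);
	fl->l_start = ofl->l_start;
	fl->l_len = ofl->l_len;
	fl->l_pid = ofl->l_pid;
	fl->l_type = ofl->l_type;
	fl->l_whence = ofl->l_whence;
	fl->l_sysid = SKETCH_LOCAL_SYSID;
}

/* Narrow a result for an old-style caller; l_sysid is dropped. */
static void
new_to_old(const struct new_flock_sketch *fl, struct old_flock_sketch *ofl)
{
	assert(fl != NULL && ofl != NULL);
	ofl->l_start = fl->l_start;
	ofl->l_len = fl->l_len;
	ofl->l_pid = fl->l_pid;
	ofl->l_type = fl->l_type;
	ofl->l_whence = fl->l_whence;
}
```

In this sketch a caller using the old fcntl operations would have its
structure widened with old_to_new() on the way in and narrowed with
new_to_old() on the way out, which is why an old binary can never see
or set a remote owner.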
>
> Highlights include:
>
> * Thread-safe kernel RPC client - many threads can use the same RPC  
> client handle safely with replies being de-multiplexed at the socket  
> upcall (typically driven directly by the NIC interrupt) and handed  
> off to whichever thread matches the reply. For UDP sockets, many RPC  
> clients can share the same socket. This allows the use of a single  
> privileged UDP port number to talk to an arbitrary number of remote  
> hosts.
>
> * Single-threaded kernel RPC server. Adding support for a
> multi-threaded server would be relatively straightforward and would
> follow approximately the Solaris KPI. A single thread should be
> sufficient for the NLM since it should rarely block in normal
> operation.
>
> * Kernel-mode NLM server supporting cancel requests and granted  
> callbacks. I've tested the NLM server reasonably extensively - it  
> passes both my own tests and the NFS Connectathon locking tests  
> running on Solaris, Mac OS X and Ubuntu Linux.
>
> * Userland NLM client supported. While the NLM server doesn't have  
> support for the local NFS client's locking needs, it does have to  
> field async replies and granted callbacks from remote NLMs that the  
> local client has contacted. We relay these replies to the userland  
> rpc.lockd over a local domain RPC socket.
>
> * IPv6 should be supported but has not been tested since I've been  
> unable to get IPv6 to work properly with the Parallels virtual  
> machines that I've been using for development.
>
> * Robust deadlock detection for the local lock manager. In  
> particular it will detect deadlocks caused by a lock request that  
> covers more than one blocking request. As required by the NLM  
> protocol, all deadlock detection happens synchronously - a user is  
> guaranteed that if a lock request isn't rejected immediately, the  
> lock will eventually be granted. The old system allowed for a  
> 'deferred deadlock' condition where a blocked lock request could  
> wake up and find that some other deadlock-causing lock owner had  
> beaten them to the lock.
>
> * Since both local and remote locks are managed by the same kernel  
> locking code, local and remote processes can safely use file locks  
> for mutual exclusion. Local processes have no fairness advantage  
> compared to remote processes when contending to lock a region that  
> has just been unlocked - the local lock manager enforces a strict  
> first-come, first-served model for both local and remote lockers.
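
The synchronous deadlock check described above can be illustrated with
a generic wait-for graph. This is a textbook sketch, not the algorithm
in the patch: lock owners are small integers, MAX_OWNERS and all of the
function names are made up for the example, and a plain depth-first
search stands in for whatever faster structure the real code uses. It
does show the multi-blocker case: a request is refused up front if any
one of the owners it would wait for is already (transitively) waiting
for the requester.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_OWNERS 16	/* hypothetical limit, for the sketch only */

/* waits_for[u][v] is true when owner u is blocked on a lock held by v. */
struct wait_graph {
	bool waits_for[MAX_OWNERS][MAX_OWNERS];
};

static void
graph_init(struct wait_graph *g)
{
	memset(g, 0, sizeof(*g));
}

/* Depth-first search: can 'from' reach 'target' along wait-for edges? */
static bool
reaches(const struct wait_graph *g, int from, int target, bool *seen)
{
	if (from == target)
		return (true);
	if (seen[from])
		return (false);
	seen[from] = true;
	for (int v = 0; v < MAX_OWNERS; v++) {
		if (g->waits_for[from][v] && reaches(g, v, target, seen))
			return (true);
	}
	return (false);
}

/*
 * Try to block 'waiter' behind every owner in blockers[0..n-1].
 * Returns false, leaving the graph untouched, if any of those edges
 * would close a cycle. Otherwise records the edges and returns true.
 */
static bool
graph_try_block(struct wait_graph *g, int waiter, const int *blockers, int n)
{
	assert(waiter >= 0 && waiter < MAX_OWNERS);
	for (int i = 0; i < n; i++) {
		bool seen[MAX_OWNERS] = { false };

		assert(blockers[i] >= 0 && blockers[i] < MAX_OWNERS);
		if (reaches(g, blockers[i], waiter, seen))
			return (false);
	}
	for (int i = 0; i < n; i++)
		g->waits_for[waiter][blockers[i]] = true;
	return (true);
}

/* Drop the waiter's edges once its lock is granted or cancelled. */
static void
graph_unblock(struct wait_graph *g, int waiter)
{
	assert(waiter >= 0 && waiter < MAX_OWNERS);
	memset(g->waits_for[waiter], 0, sizeof(g->waits_for[waiter]));
}
```

Because the check runs before any edge is added, a request that
graph_try_block() accepts can never be part of a cycle later on, which
mirrors the guarantee that a lock request not rejected immediately will
eventually be granted.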


--Apple-Mail-106-246233165--