Date: Thu, 4 Sep 2003 18:54:59 -0400 (EDT)
From: Jonathan Lennox <lennox@cs.columbia.edu>
To: FreeBSD-gnats-submit@FreeBSD.org
Subject: kern/56461: FreeBSD client rpc.lockd incompatible with Linux server rpc.lockd
Message-ID: <200309042254.h84MsxdA041659@cnr.cs.columbia.edu>
Resent-Message-ID: <200309042300.h84N0OkO054628@freefall.freebsd.org>

>Number:         56461
>Category:       kern
>Synopsis:       FreeBSD client rpc.lockd incompatible with Linux server rpc.lockd
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          change-request
>Submitter-Id:   current-users
>Arrival-Date:   Thu Sep 04 16:00:24 PDT 2003
>Closed-Date:
>Last-Modified:
>Originator:     Jonathan Lennox
>Release:        FreeBSD 5.1-RELEASE-p2 i386
>Organization:
Columbia University Computer Science
>Environment:
System: FreeBSD cnr.cs.columbia.edu 5.1-RELEASE-p2 FreeBSD 5.1-RELEASE-p2 #0: Wed Aug 27 22:24:11 EDT 2003 lennox@cnr.cs.columbia.edu:/usr/obj/usr/src/sys/CNR i386
>Description:
Linux's implementation of NFS NLM locks is buggy: it doesn't support
lock cookies longer than 8 bytes in size.  See the comment in
<http://lxr.linux.no/source/include/linux/lockd/xdr.h?v=2.6.0-test2>
on the definition of 'struct nlm_cookie':
"NLM cookies. Technically they can be 1K, Nobody uses over 8 bytes however."

Unfortunately, this is actually "nobody" except FreeBSD 5.x, which uses
16-byte cookies.

As a result, any attempt by a FreeBSD client to lock an NFS-mounted file
from a Linux server results in the process on the FreeBSD client hanging,
unkillably.

Getting this fixed in Linux will probably be difficult -- after all, it
doesn't inconvenience *Linux* users.  Moreover, since this hasn't been
fixed as of Linux 2.6-test, any server-side fix is going to take a *long*
time to be reliably deployed.

As such, I'm afraid that in order to have successful interoperation with
Linux NFS servers, the FreeBSD NFS lock client code needs to be modified
to send only 8-byte NLM cookies.

The patch I've attached below is a quick-and-dirty fix, as recommended by
Dan Nelson on freebsd-hackers on 29 April 2003.  However, it loses
functionality, since the protection against PID recycling is disabled.
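
The 16-byte versus 8-byte arithmetic can be checked with a small sketch. It assumes the cookie is simply the raw bytes of struct lockd_msg_ident, and it models the i386 layout with fixed-width types so that the sizes do not depend on the machine it is compiled on. The names timeval32, ident_before, ident_after, COOKIE_LIMIT and cookie_fits are hypothetical and exist only for this illustration; they are not taken from the FreeBSD or Linux sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed limit, taken from the Linux header comment quoted above. */
#define COOKIE_LIMIT 8

/* Hypothetical stand-in for struct timeval on i386: two 32-bit fields. */
struct timeval32 {
	int32_t tv_sec;
	int32_t tv_usec;
};

/* Models the unpatched struct lockd_msg_ident on i386. */
struct ident_before {
	int32_t pid;                 /* The process ID. */
	struct timeval32 pid_start;  /* Start time of process id */
	int32_t msg_seq;             /* Sequence number of message */
};

/* Models the same structure once pid_start has been removed. */
struct ident_after {
	int32_t pid;
	int32_t msg_seq;
};

/* Compile-time checks of the sizes discussed in the report. */
static_assert(sizeof(struct ident_before) == 16, "4 + 8 + 4 bytes");
static_assert(sizeof(struct ident_after) == 8, "4 + 4 bytes");

/* Returns 1 if a cookie of len bytes fits within the assumed limit. */
int
cookie_fits(size_t len)
{
	return (len <= COOKIE_LIMIT);
}
```

Under these assumptions the unpatched identifier does not fit in the limit, while the identifier without pid_start fits exactly.
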

A proper fix would be either to somehow compress all three pieces of
information -- pid, pid_start, and msg_seq -- into eight bytes (difficult);
maintain an in-kernel table mapping an eight-byte sequence number to
lockd_msg_ident; or find some other, smaller way of defending against
pid recycling.
>How-To-Repeat:
Make sure rpc.lockd and rpc.statd are running.
NFS-mount a filesystem from a Linux fileserver.
flock() the file.
Observe the flock()ing process hanging.
Notice that not even kill -9 will kill the process.
>Fix:
Apply the following patch, and rebuild rpc.lockd and your kernel.

--- nfs_lock.h.orig Thu Sep 4 18:11:45 2003
+++ nfs_lock.h Thu Sep 4 18:12:17 2003
@@ -49,12 +49,10 @@
 /*
  * This structure is used to uniquely identify the process which originated
  * a particular message to lockd. A sequence number is used to differentiate
- * multiple messages from the same process. A process start time is used to
- * detect the unlikely, but possible, event of the recycling of a pid.
+ * multiple messages from the same process.
  */
 struct lockd_msg_ident {
 	pid_t pid; /* The process ID. */
-	struct timeval pid_start; /* Start time of process id */
 	int msg_seq; /* Sequence number of message */
 };
 
--- nfs_lock.c.orig Thu Sep 4 18:11:50 2003
+++ nfs_lock.c Thu Sep 4 18:14:45 2003
@@ -117,7 +117,6 @@
 		p->p_nlminfo->pid_start = p->p_stats->p_start;
 		timevaladd(&p->p_nlminfo->pid_start, &boottime);
 	}
-	msg.lm_msg_ident.pid_start = p->p_nlminfo->pid_start;
 	msg.lm_msg_ident.msg_seq = ++(p->p_nlminfo->msg_seq);
 
 	msg.lm_fl = *fl;
@@ -257,8 +256,8 @@
 	 */
 	if (targetp->p_nlminfo == NULL ||
 	    ((ansp->la_msg_ident.msg_seq != -1) &&
-	      (timevalcmp(&targetp->p_nlminfo->pid_start,
-			&ansp->la_msg_ident.pid_start, !=) ||
+	      (/*timevalcmp(&targetp->p_nlminfo->pid_start,
+			&ansp->la_msg_ident.pid_start, !=) || */
 	       targetp->p_nlminfo->msg_seq != ansp->la_msg_ident.msg_seq))) {
 		PROC_UNLOCK(targetp);
 		return (EPIPE);
>Release-Note:
>Audit-Trail:
>Unformatted:
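
For the table-based alternative mentioned in the Description, a rough user-space sketch of the idea follows. It is not kernel code and not a proposed patch: it has no locking, a fixed-size table and a linear search, and every name in it (full_ident, cookie_slot, COOKIE_SLOTS, cookie_alloc, cookie_claim) is hypothetical. It only shows how an eight-byte value could go on the wire while pid, pid_start and msg_seq stay on the client.

```c
#include <sys/types.h>
#include <sys/time.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical full identity; same three fields as lockd_msg_ident. */
struct full_ident {
	pid_t pid;
	struct timeval pid_start;
	int msg_seq;
};

#define COOKIE_SLOTS 64	/* hypothetical table size */

struct cookie_slot {
	uint64_t cookie;	/* 0 marks a free slot */
	struct full_ident ident;
};

static struct cookie_slot cookie_table[COOKIE_SLOTS];
static uint64_t next_cookie = 1;	/* 0 is reserved for "no cookie" */

/* Remember id and return an 8-byte cookie, or 0 if the table is full. */
uint64_t
cookie_alloc(const struct full_ident *id)
{
	assert(id != NULL);
	for (size_t i = 0; i < COOKIE_SLOTS; i++) {
		if (cookie_table[i].cookie == 0) {
			cookie_table[i].cookie = next_cookie++;
			cookie_table[i].ident = *id;
			return (cookie_table[i].cookie);
		}
	}
	return (0);
}

/*
 * Look up a cookie from a reply, copy the stored identity to *out and
 * free the slot.  Returns 0 on success, -1 if the cookie is unknown.
 */
int
cookie_claim(uint64_t cookie, struct full_ident *out)
{
	if (cookie == 0)
		return (-1);
	for (size_t i = 0; i < COOKIE_SLOTS; i++) {
		if (cookie_table[i].cookie == cookie) {
			*out = cookie_table[i].ident;
			cookie_table[i].cookie = 0;
			return (0);
		}
	}
	return (-1);
}
```

With a scheme like this, the reply path could still compare the stored pid_start against the target process, so the defence against pid recycling would not have to be dropped.
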