Date: Tue, 13 May 2003 01:28:58 -0700 (PDT)
From: "Andrew P. Lentvorski, Jr." <bsder@allcaps.org>
To: Robert Watson <rwatson@FreeBSD.org>
Cc: current@FreeBSD.org
Subject: Re: rpc.lockd spinning; much breakage
Message-ID: <Pine.LNX.4.44.0305130104010.31214-100000@mail.allcaps.org>
In-Reply-To: <Pine.NEB.3.96L.1030512223339.4858G-100000@fledge.watson.org>
On Mon, 12 May 2003, Robert Watson wrote:

> (3) Sometimes rpc.lockd on 5.x acting as a server gets really confused
>     when you mix local and remote locks.  I haven't quite figured out the
>     circumstances, but occasionally I run into a situation where a client
>     contends against an existing lock on the server, and the client never
>     receives a notification from the server that the lock has been
>     released.  It looks like the server stores state that the lock is
>     contended, but perhaps never properly re-polls the kernel to see if
>     the lock has been locally re-released:

I just looked at the code again.

rpc.lockd does not spawn off extra processes to continuously poll the
kernel.  It assumes that it has control of the underlying file and only
rechecks the blockedlocklist when it receives and grants an NFS file
unlock.

Consequently, contention on the hardware needs to actually cause a *fail*
and not queue up a lock for later.  Currently, it returns a fail but
still executes add_blockingfilelock.

The offending code in lockd_lock.c is:

	if (retval == PFL_NFSDENIED || retval == PFL_HWDENIED) {
		/* Once last chance to check the lock */
		if (fl->blocking == 1) {
			/* Queue the lock */
			debuglog("BLOCKING LOCK RECEIVED\n");
			retval = (retval == PFL_NFSDENIED ?
			    PFL_NFSBLOCKED : PFL_HWBLOCKED);
			add_blockingfilelock(fl);
			dump_filelock(fl);
		} else {

A possible fix should be:

		if (fl->blocking == 1) {
			if (retval == PFL_NFSDENIED) {
				/* Queue the lock */
				debuglog("BLOCKING LOCK RECEIVED\n");
				retval = PFL_NFSBLOCKED;
				add_blockingfilelock(fl);
				dump_filelock(fl);
			} else {
				/* retval is okay as PFL_HWDENIED */
				debuglog("BLOCKING LOCK DENIED IN HARDWARE\n");
				dump_filelock(fl);
			}
		} else {

This should cause the server to return nlm4_denied and the client should
eventually retry the lock rather than waiting on the server.

CAUTION!  I haven't checked or compiled this code.
If folks need me to, I can, but it will be a couple of days as I don't have
two machines handy that I can install -CURRENT on and set up NFS.

-a
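
For anyone who wants to poke at the decision logic without an NFS setup, here is a standalone sketch of the two policies described above. It is not the rpc.lockd code: the enum values, the function names current_policy() and proposed_policy(), and the "queued" flag (standing in for the call to add_blockingfilelock) are all assumptions made for illustration, and the real definitions live in lockd_lock.c.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-ins for the status codes named in the message.
 * The numeric values here are arbitrary, not the ones rpc.lockd uses.
 */
enum pfl_status {
	PFL_NFSDENIED,
	PFL_NFSBLOCKED,
	PFL_HWDENIED,
	PFL_HWBLOCKED
};

/*
 * Models the quoted code: any denied request with blocking set is
 * queued, whether the denial came from NFS or from the hardware.
 * *queued stands in for the call to add_blockingfilelock().
 */
enum pfl_status
current_policy(enum pfl_status retval, bool blocking, bool *queued)
{
	assert(retval == PFL_NFSDENIED || retval == PFL_HWDENIED);
	*queued = false;
	if (blocking) {
		*queued = true;
		return (retval == PFL_NFSDENIED ?
		    PFL_NFSBLOCKED : PFL_HWBLOCKED);
	}
	return (retval);
}

/*
 * Models the proposed fix: only NFS-level contention is queued.
 * A hardware denial is passed through as a plain failure so the
 * client retries instead of waiting for a grant that never comes.
 */
enum pfl_status
proposed_policy(enum pfl_status retval, bool blocking, bool *queued)
{
	assert(retval == PFL_NFSDENIED || retval == PFL_HWDENIED);
	*queued = false;
	if (blocking && retval == PFL_NFSDENIED) {
		*queued = true;
		return (PFL_NFSBLOCKED);
	}
	return (retval);
}
```

The only case where the two differ is a blocking request denied in hardware: the first queues it and reports PFL_HWBLOCKED, the second leaves it unqueued as PFL_HWDENIED.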