From: Konstantin Belousov
Date: Sun, 19 Jun 2016 18:32:36 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: svn commit: r302020 - head/sys/nlm
Message-Id: <201606191832.u5JIWaNa011043@repo.freebsd.org>

Author: kib
Date: Sun Jun 19 18:32:35 2016
New Revision: 302020
URL: https://svnweb.freebsd.org/changeset/base/302020

Log:
  Remote and local adv lock servers might de-synchronize (the added
  comment explains the plausible scenario), resulting in EDEADLK returned
  on the local registration attempt.  Handle this by re-trying the local
  op [1].

  On unmount, local registration abort is indicated as EINTR, abort the
  nlm call as well.

  Reported and tested by:	pho
  Suggested and reviewed by:	dfr (previous version, [1])
  Sponsored by:	The FreeBSD Foundation
  MFC after:	1 week
  Approved by:	re (delphij)

Modified:
  head/sys/nlm/nlm_advlock.c

Modified: head/sys/nlm/nlm_advlock.c
==============================================================================
--- head/sys/nlm/nlm_advlock.c	Sun Jun 19 18:29:43 2016	(r302019)
+++ head/sys/nlm/nlm_advlock.c	Sun Jun 19 18:32:35 2016	(r302020)
@@ -713,7 +713,37 @@ nlm_record_lock(struct vnode *vp, int op
 	newfl.l_pid = svid;
 	newfl.l_sysid = NLM_SYSID_CLIENT | sysid;
 
-	error = lf_advlockasync(&a, &vp->v_lockf, size);
+	for (;;) {
+		error = lf_advlockasync(&a, &vp->v_lockf, size);
+		if (error == EDEADLK) {
+			/*
+			 * Locks are associated with the processes and
+			 * not with threads.  Suppose we have two
+			 * threads A1 A2 in one process, A1 locked
+			 * file f1, A2 is locking file f2, and A1 is
+			 * unlocking f1. Then remote server may
+			 * already unlocked f1, while local still not
+			 * yet scheduled A1 to make the call to local
+			 * advlock manager. The process B owns lock on
+			 * f2 and issued the lock on f1.  Remote would
+			 * grant B the request on f1, but local would
+			 * return EDEADLK.
+			 */
+			pause("nlmdlk", 1);
+			/* XXXKIB allow suspend */
+		} else if (error == EINTR) {
+			/*
+			 * lf_purgelocks() might wake up the lock
+			 * waiter and removed our lock graph edges.
+			 * There is no sense in re-trying recording
+			 * the lock to the local manager after
+			 * reclaim.
+			 */
+			error = 0;
+			break;
+		} else
+			break;
+	}
 	KASSERT(error == 0 || error == ENOENT,
 	    ("Failed to register NFS lock locally - error=%d", error));
 }
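
The control flow added by this change can be exercised outside the kernel with a small userland sketch. This is only an illustration under stated assumptions, not the kernel code: fake_advlock(), fake_pause(), struct fake_lockmgr and record_lock_with_retry() are hypothetical stand-ins invented here for lf_advlockasync(), pause("nlmdlk", 1) and the tail of nlm_record_lock(), and the "local lock manager" is just a scripted list of return values.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the local advisory lock manager: each call
 * to fake_advlock() returns the next scripted value, then 0 forever.
 */
struct fake_lockmgr {
	const int *results;	/* scripted return values */
	size_t nresults;	/* number of scripted values */
	size_t next;		/* index of the next value to return */
	unsigned pauses;	/* how many times the caller backed off */
};

/* Stand-in for lf_advlockasync(); not the real kernel interface. */
static int
fake_advlock(struct fake_lockmgr *m)
{
	if (m->next < m->nresults)
		return (m->results[m->next++]);
	return (0);
}

/* Stand-in for pause("nlmdlk", 1); it only counts the back-offs. */
static void
fake_pause(struct fake_lockmgr *m)
{
	m->pauses++;
}

/* Same loop shape as the diff: retry on EDEADLK, give up on EINTR. */
static int
record_lock_with_retry(struct fake_lockmgr *m)
{
	int error;

	for (;;) {
		error = fake_advlock(m);
		if (error == EDEADLK) {
			/* Local state lags behind the server: wait, retry. */
			fake_pause(m);
		} else if (error == EINTR) {
			/* Local registration was aborted: stop, no error. */
			error = 0;
			break;
		} else
			break;
	}
	/* Mirrors the KASSERT that follows the loop in the diff. */
	assert(error == 0 || error == ENOENT);
	return (error);
}
```

With a script of { EDEADLK, EDEADLK, 0 } the sketch backs off twice and then returns 0, while a script starting with EINTR returns 0 without backing off. Note that the loop has no retry limit, as in the diff: it relies on the transient de-synchronization described in the added comment eventually clearing.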