Date:      Mon, 3 Oct 2011 23:38:11 +0300
From:      Kostik Belousov <kostikbel@gmail.com>
To:        Attilio Rao <attilio@freebsd.org>
Cc:        Kirk McKusick <mckusick@mckusick.com>, Garrett Cooper <yanegomi@gmail.com>, Xin LI <delphij@freebsd.org>, freebsd-fs@freebsd.org
Subject:   Re: Need to force sync(2) before umounting UFS1 filesystems?
Message-ID:  <20111003203811.GA1511@deviant.kiev.zoral.com.ua>
In-Reply-To: <CAJ-FndBw_PCPYcUoDS4WMnpLd=uwDK4b-y9-vT-qignbeqPaSA@mail.gmail.com>
References:  <CAGH67wSYmcxJCbTMVL%2BqWzbLojiCiBmRF98yaNL4b3d3LbvbYw@mail.gmail.com> <201110012137.p91Lb6FI093841@chez.mckusick.com> <CAJ-FndBw_PCPYcUoDS4WMnpLd=uwDK4b-y9-vT-qignbeqPaSA@mail.gmail.com>


On Sun, Oct 02, 2011 at 02:19:32AM +0200, Attilio Rao wrote:
> I'm sorry if it wasn't clear in kib/my latest message, but we don't
> need the coveredvnode unlocking logic because of the tegge's commit.
>
> I just think we should commit the change in policy Kirk initially
> submitted + a comment on top of vfs_busy() explaining why the deadlock
> with coveredvnode cannot happen.

Below is my take on the comment.

commit 3981acdadcf4313dbdf813ec107f7bfbb4057d09
Author: Konstantin Belousov <kostik@pooma.home>
Date:   Mon Oct 3 23:33:06 2011 +0300

    Move parts of the commit log for r166167, where Tor explained the
    interaction between vnode locks and vfs_busy(), into a comment.

diff --git a/sys/kern/vfs_subr.c b/sys/kern/vfs_subr.c
index 7eb619a..3d7735d 100644
--- a/sys/kern/vfs_subr.c
+++ b/sys/kern/vfs_subr.c
@@ -348,6 +348,38 @@ SYSINIT(vfs, SI_SUB_VFS, SI_ORDER_FIRST, vntblinit, NULL);
 /*
  * Mark a mount point as busy. Used to synchronize access and to delay
  * unmounting. Eventually, mountlist_mtx is not released on failure.
+ *
+ * vfs_busy() is a custom lock; it can block the caller.
+ * vfs_busy() only sleeps if an unmount is active on the mount point.
+ * For a mount point mp, the vfs_busy-enforced lock is ordered before the
+ * lock of any vnode belonging to mp.
+ *
+ * Lookup uses vfs_busy() to traverse mount points.
+ * root fs			var fs
+ * / vnode lock		A	/ vnode lock (/var)		D
+ * /var vnode lock	B	/log vnode lock(/var/log)	E
+ * vfs_busy lock	C	vfs_busy lock			F
+ *
+ * Within each file system, the lock order is C->A->B and F->D->E.
+ *
+ * When traversing across mounts, the system follows this lock order:
+ *
+ *        C->A->B
+ *		|
+ *	        +->F->D->E
+ *
+ * The lookup() process for namei("/var") illustrates the process:
+ *  VOP_LOOKUP() obtains B while A is held
+ *  vfs_busy() obtains a shared lock on F while A and B are held
+ *  vput() releases lock on B
+ *  vput() releases lock on A
+ *  VFS_ROOT() obtains lock on D while shared lock on F is held
+ *  vfs_unbusy() releases shared lock on F
+ *  vn_lock() obtains lock on deadfs vnode vp_crossmp instead of A.
+ *    An attempt to lock A (instead of vp_crossmp) while D is held would
+ *    violate the global order, causing deadlocks.
+ *
+ * dounmount() locks B while F is drained.
  */
 int
 vfs_busy(struct mount *mp, int flags)
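
For anyone who wants to play with the ordering outside the kernel, here is a
minimal user-space sketch of the lock order described in the comment. It is
not kernel code and not part of the patch. The enum names, the numeric ranks
and the helper functions are all hypothetical, and ranking vp_crossmp after
every other lock is an assumption of this toy model only. It replays the
namei("/var") steps and reports whether each acquisition respects the
C->A->B->F->D->E order.

```c
#include <assert.h>
#include <stdbool.h>

/* Lock names follow the letters used in the comment (hypothetical enum). */
enum lock_id { LK_C, LK_A, LK_B, LK_F, LK_D, LK_E, LK_CROSSMP, LK_COUNT };

/*
 * Rank in the global order C->A->B->F->D->E.  Placing vp_crossmp last
 * is an assumption of this model, not something the kernel encodes.
 */
static const int lock_rank[LK_COUNT] = {
	[LK_C] = 0, [LK_A] = 1, [LK_B] = 2,
	[LK_F] = 3, [LK_D] = 4, [LK_E] = 5,
	[LK_CROSSMP] = 6,
};

struct held_set {
	bool held[LK_COUNT];
};

/* Take lock l; fail if a held lock has the same or a later rank. */
static bool
order_acquire(struct held_set *hs, enum lock_id l)
{
	int i;

	assert(l < LK_COUNT);
	for (i = 0; i < LK_COUNT; i++) {
		if (hs->held[i] && lock_rank[i] >= lock_rank[l])
			return (false);
	}
	hs->held[l] = true;
	return (true);
}

static void
order_release(struct held_set *hs, enum lock_id l)
{
	assert(l < LK_COUNT);
	hs->held[l] = false;
}

/*
 * Replay the namei("/var") steps from the comment.  'last' is the lock
 * taken in the final step: LK_CROSSMP (what the kernel does) or LK_A
 * (the ordering violation the comment warns about).
 */
static bool
replay_lookup_var(enum lock_id last)
{
	struct held_set hs = { { false } };
	bool ok = true;

	ok = ok && order_acquire(&hs, LK_A);	/* root vnode is locked */
	ok = ok && order_acquire(&hs, LK_B);	/* VOP_LOOKUP() obtains B */
	ok = ok && order_acquire(&hs, LK_F);	/* vfs_busy() on var fs */
	order_release(&hs, LK_B);		/* vput() releases B */
	order_release(&hs, LK_A);		/* vput() releases A */
	ok = ok && order_acquire(&hs, LK_D);	/* VFS_ROOT() obtains D */
	order_release(&hs, LK_F);		/* vfs_unbusy() releases F */
	ok = ok && order_acquire(&hs, last);	/* vn_lock() of the parent */
	return (ok);
}
```

In this model replay_lookup_var(LK_CROSSMP) succeeds, while
replay_lookup_var(LK_A) fails at the last step because A is requested while
D is held, which is the reversal the comment describes.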



