Date: Mon, 3 Oct 2011 23:38:11 +0300
From: Kostik Belousov <kostikbel@gmail.com>
To: Attilio Rao <attilio@freebsd.org>
Cc: Kirk McKusick <mckusick@mckusick.com>, Garrett Cooper <yanegomi@gmail.com>, Xin LI <delphij@freebsd.org>, freebsd-fs@freebsd.org
Subject: Re: Need to force sync(2) before umounting UFS1 filesystems?
Message-ID: <20111003203811.GA1511@deviant.kiev.zoral.com.ua>
In-Reply-To: <CAJ-FndBw_PCPYcUoDS4WMnpLd=uwDK4b-y9-vT-qignbeqPaSA@mail.gmail.com>
References: <CAGH67wSYmcxJCbTMVL%2BqWzbLojiCiBmRF98yaNL4b3d3LbvbYw@mail.gmail.com> <201110012137.p91Lb6FI093841@chez.mckusick.com> <CAJ-FndBw_PCPYcUoDS4WMnpLd=uwDK4b-y9-vT-qignbeqPaSA@mail.gmail.com>

On Sun, Oct 02, 2011 at 02:19:32AM +0200, Attilio Rao wrote:
> I'm sorry if it wasn't clear in kib/my latest message, but we don't
> need the coveredvnode unlocking logic because of the tegge's commit.
>
> I just think we should commit the change in policy Kirk initially
> submitted + a comment on top of vfs_busy() explaining why the deadlock
> with coveredvnode cannot happen.

Below is my take on the comment.

commit 3981acdadcf4313dbdf813ec107f7bfbb4057d09
Author: Konstantin Belousov <kostik@pooma.home>
Date:   Mon Oct 3 23:33:06 2011 +0300

    Move parts of the commit log for r166167, where Tor explained the
    interaction between vnode locks and vfs_busy(), into a comment.

diff --git a/sys/kern/vfs_subr.c b/sys/kern/vfs_subr.c
index 7eb619a..3d7735d 100644
--- a/sys/kern/vfs_subr.c
+++ b/sys/kern/vfs_subr.c
@@ -348,6 +348,38 @@ SYSINIT(vfs, SI_SUB_VFS, SI_ORDER_FIRST, vntblinit, NULL);
 /*
  * Mark a mount point as busy. Used to synchronize access and to delay
  * unmounting. Eventually, mountlist_mtx is not released on failure.
+ *
+ * vfs_busy() is a custom lock; it can block the caller.
+ * vfs_busy() only sleeps if an unmount is active on the mount point.
+ * For a mount point mp, the vfs_busy-enforced lock is ordered before
+ * the lock of any vnode belonging to mp.
+ *
+ * Lookup uses vfs_busy() to traverse mount points.
+ * root fs                      var fs
+ * / vnode lock         A       / vnode lock (/var)             D
+ * /var vnode lock      B       /log vnode lock (/var/log)      E
+ * vfs_busy lock        C       vfs_busy lock                   F
+ *
+ * Within each file system, the lock order is C->A->B and F->D->E.
+ *
+ * When traversing across mounts, the system follows this lock order:
+ *
+ *        C->A->B
+ *              |
+ *              +->F->D->E
+ *
+ * The lookup() path for namei("/var") illustrates the process:
+ *  VOP_LOOKUP() obtains B while A is held
+ *  vfs_busy() obtains a shared lock on F while A and B are held
+ *  vput() releases lock on B
+ *  vput() releases lock on A
+ *  VFS_ROOT() obtains lock on D while shared lock on F is held
+ *  vfs_unbusy() releases shared lock on F
+ *  vn_lock() obtains lock on deadfs vnode vp_crossmp instead of A.
+ *    An attempt to lock A (instead of vp_crossmp) while D is held would
+ *    violate the global order, causing deadlocks.
+ *
+ * dounmount() locks B while F is drained.
  */
 int
 vfs_busy(struct mount *mp, int flags)
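
To make the ordering argument in the comment easier to check, here is a minimal userland sketch that replays the namei("/var") steps against the global order C->A->B->F->D->E. It is not kernel code and not part of the patch. The names (enum lk, lk_acquire(), lk_release(), replay_var_lookup()) are hypothetical, and the model is an assumption-laden simplification: it ignores the shared/exclusive distinction and only checks that no lock is taken while a lock at the same or a later position in the order is held.

```c
#include <assert.h>

/*
 * Userland model of the lock order from the comment: C->A->B->F->D->E.
 * All identifiers are hypothetical; nothing here is a FreeBSD kernel API.
 */
enum lk { LK_C, LK_A, LK_B, LK_F, LK_D, LK_E, LK_COUNT };

static unsigned held;		/* bitmask of currently held locks */

/* Return 1 if taking lk keeps the global order, 0 if it would violate it. */
static int
order_ok(enum lk lk)
{
	/* Any held lock at the same or a later position forbids taking lk. */
	return ((held >> lk) == 0);
}

/* Take lk if the order permits; return 0 on success, -1 on a violation. */
static int
lk_acquire(enum lk lk)
{
	if (!order_ok(lk))
		return (-1);
	held |= 1u << lk;
	return (0);
}

/* Drop lk; the caller must currently hold it. */
static void
lk_release(enum lk lk)
{
	assert(held & (1u << lk));
	held &= ~(1u << lk);
}

/*
 * Replay the namei("/var") steps.  Returns 0 if every step respects the
 * order and the final attempt to lock A while D is held is refused.
 */
static int
replay_var_lookup(void)
{
	held = 0;
	if (lk_acquire(LK_A) != 0)	/* root vnode is locked at the start */
		return (-1);
	if (lk_acquire(LK_B) != 0)	/* VOP_LOOKUP() obtains B, A held */
		return (-1);
	if (lk_acquire(LK_F) != 0)	/* vfs_busy() on F, A and B held */
		return (-1);
	lk_release(LK_B);		/* vput() */
	lk_release(LK_A);		/* vput() */
	if (lk_acquire(LK_D) != 0)	/* VFS_ROOT() obtains D, F held */
		return (-1);
	lk_release(LK_F);		/* vfs_unbusy() */
	/* Locking A now must be refused: A precedes D in the order. */
	if (lk_acquire(LK_A) == 0)
		return (-1);
	return (0);
}
```

Calling replay_var_lookup() from a small main() should return 0 and leave only D held. That matches the point of the comment: relocking A at that stage would go against the global order, which is why the lookup locks vp_crossmp instead.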
