Date: Mon, 3 Oct 2011 23:38:11 +0300
From: Kostik Belousov
To: Attilio Rao
Cc: Kirk McKusick, Garrett Cooper, Xin LI, freebsd-fs@freebsd.org
Subject: Re: Need to force sync(2) before umounting UFS1 filesystems?
Message-ID: <20111003203811.GA1511@deviant.kiev.zoral.com.ua>
References: <201110012137.p91Lb6FI093841@chez.mckusick.com>

On Sun, Oct 02, 2011 at 02:19:32AM +0200, Attilio Rao wrote:
> I'm sorry if it wasn't clear in kib/my latest message, but we don't
> need the coveredvnode unlocking logic because of the tegge's commit.
>
> I just think we should commit the change in policy Kirk initially
> submitted + a comment on top of vfs_busy() explaining why the deadlock
> with coveredvnode cannot happen.

Below is my take on the comment.

commit 3981acdadcf4313dbdf813ec107f7bfbb4057d09
Author: Konstantin Belousov
Date:   Mon Oct 3 23:33:06 2011 +0300

    Move parts of the commit log for r166167, where Tor explained the
    interaction between vnode locks and vfs_busy(), into comment.

diff --git a/sys/kern/vfs_subr.c b/sys/kern/vfs_subr.c
index 7eb619a..3d7735d 100644
--- a/sys/kern/vfs_subr.c
+++ b/sys/kern/vfs_subr.c
@@ -348,6 +348,38 @@ SYSINIT(vfs, SI_SUB_VFS, SI_ORDER_FIRST, vntblinit, NULL);
 /*
  * Mark a mount point as busy. Used to synchronize access and to delay
  * unmounting. Eventually, mountlist_mtx is not released on failure.
+ *
+ * vfs_busy() is a custom lock, it can block the caller.
+ * vfs_busy() only sleeps if the unmount is active on the mount point.
+ * For a mountpoint mp, vfs_busy-enforced lock is before lock of any
+ * vnode belonging to mp.
+ *
+ * Lookup uses vfs_busy() to traverse mount points.
+ * root fs                     var fs
+ * / vnode lock        A       / vnode lock (/var)             D
+ * /var vnode lock     B       /log vnode lock(/var/log)       E
+ * vfs_busy lock       C       vfs_busy lock                   F
+ *
+ * Within each file system, the lock order is C->A->B and F->D->E.
+ *
+ * When traversing across mounts, the system follows that lock order:
+ *
+ *        C->A->B
+ *              |
+ *              +->F->D->E
+ *
+ * The lookup() process for namei("/var") illustrates the process:
+ *  VOP_LOOKUP() obtains B while A is held
+ *  vfs_busy() obtains a shared lock on F while A and B are held
+ *  vput() releases lock on B
+ *  vput() releases lock on A
+ *  VFS_ROOT() obtains lock on D while shared lock on F is held
+ *  vfs_unbusy() releases shared lock on F
+ *  vn_lock() obtains lock on deadfs vnode vp_crossmp instead of A.
+ *    Attempt to lock A (instead of vp_crossmp) while D is held would
+ *    violate the global order, causing deadlocks.
+ *
+ * dounmount() locks B while F is drained.
  */
 int
 vfs_busy(struct mount *mp, int flags)
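
For readers who want to check the ordering argument mechanically, here is a
minimal user-space sketch, not kernel code. It assumes the diagram can be read
as a single total order C->A->B->F->D->E and replays the namei("/var") steps
against it. All names (lock_id, held_set, order_acquire, order_release,
replay_lookup_var) are hypothetical and exist only for this illustration.
vp_crossmp is treated as sitting outside the order, so locking it is not
modelled at all.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed ranks: the global order C->A->B->F->D->E from the comment. */
enum lock_id { LK_C, LK_A, LK_B, LK_F, LK_D, LK_E, LK_COUNT };

/* Which of the six locks the (single) simulated thread holds. */
struct held_set {
	bool held[LK_COUNT];
};

/*
 * An acquire is in order only if no held lock has an equal or later
 * rank.  On a violation the lock is not marked as held.
 */
static bool
order_acquire(struct held_set *hs, enum lock_id id)
{
	int i;

	for (i = (int)id; i < LK_COUNT; i++)
		if (hs->held[i])
			return (false);
	hs->held[id] = true;
	return (true);
}

static void
order_release(struct held_set *hs, enum lock_id id)
{
	hs->held[id] = false;
}

/*
 * Replay the namei("/var") steps.  Returns true if every acquire
 * respected the assumed order.  With relock_a set, the last step
 * tries to lock A while D is held, instead of using vp_crossmp.
 */
bool
replay_lookup_var(bool relock_a)
{
	struct held_set hs = { { false } };
	bool ok = true;

	ok = order_acquire(&hs, LK_A) && ok;	/* lookup holds / */
	ok = order_acquire(&hs, LK_B) && ok;	/* VOP_LOOKUP() */
	ok = order_acquire(&hs, LK_F) && ok;	/* vfs_busy(), shared */
	order_release(&hs, LK_B);		/* vput() */
	order_release(&hs, LK_A);		/* vput() */
	ok = order_acquire(&hs, LK_D) && ok;	/* VFS_ROOT() */
	order_release(&hs, LK_F);		/* vfs_unbusy() */
	if (relock_a)
		ok = order_acquire(&hs, LK_A) && ok;	/* A after D */
	return (ok);
}
```

Called from a main(), replay_lookup_var(false) reports that the sequence is in
order, while replay_lookup_var(true) reports a violation at the final step.
The sketch models neither the shared/exclusive distinction on F nor the
dounmount() side; it only checks the acquisition order of the lookup path.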