From owner-freebsd-fs@FreeBSD.ORG Fri Aug 30 13:00:52 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id B97914CB for ; Fri, 30 Aug 2013 13:00:52 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca [131.104.91.44]) by mx1.freebsd.org (Postfix) with ESMTP id 7CEDA22A3 for ; Fri, 30 Aug 2013 13:00:51 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ArkEAB6XIFKDaFve/2dsb2JhbABaFoMmUYMnvBmBD4E3dIIkAQEFIwRSGw4KAgINGQJZBi6HZgynDZIWgSmOFzQHgmiBNAOZIpA3gzwggW4 X-IronPort-AV: E=Sophos;i="4.89,990,1367985600"; d="scan'208";a="48260454" Received: from muskoka.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.222]) by esa-jnhn.mail.uoguelph.ca with ESMTP; 30 Aug 2013 09:00:50 -0400 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 46E6DB3F12; Fri, 30 Aug 2013 09:00:50 -0400 (EDT) Date: Fri, 30 Aug 2013 09:00:50 -0400 (EDT) From: Rick Macklem To: Konstantin Belousov Message-ID: <1251021093.15594833.1377867650267.JavaMail.root@uoguelph.ca> In-Reply-To: <20130830085336.GU4972@kib.kiev.ua> Subject: Re: fixing "umount -f" for the NFS client MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.201] X-Mailer: Zimbra 7.2.1_GA_2790 (ZimbraWebClient - FF3.0 (Win)/7.2.1_GA_2790) Cc: freebsd-fs X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 30 Aug 2013 13:00:52 -0000 Kostik wrote: > On Thu, Aug 29, 2013 at 07:43:34PM -0400, Rick Macklem wrote: > > Kostik wrote: > > > On Thu, Aug 29, 2013 at 06:21:41PM -0400, Rick Macklem wrote: > > > > Kostik wrote: > > > > > On Wed, Aug 28, 2013 at 08:15:27PM -0400, Rick Macklem wrote: > > > > > > I've been doing a little more testing of "umount -f" for > > > > > > NFS > > > > > > mounts and they seem to be working unless some other > > > > > > process/thread > > > > > > has busied the file system via vfs_busy(). > > > > > > > > > > > > Unfortunately, it is pretty easy to vfs_busy() the file > > > > > > system > > > > > > by using a command like "df" that is stuck on the > > > > > > unresponsive > > > > > > NFS server. > > > > > > > > > > > > The problem seems to be that dounmount() msleep()s while > > > > > > mnt_lockref != 0 before calling VFS_UNMOUNT(). > > > > > > > > > > > > If some call into the NFS client was done before this > > > > > > while (mp->mnt_lockref) loop with msleep() in it, it > > > > > > can easily kill off RPCs in progress. (It currently > > > > > > does this in nfs_unmount() using the newnfs_nmcancelreqs() > > > > > > call. > > > > > > > > > > > > In summary: > > > > > > - Would it be appropriate to add a new vfs_XXX method that > > > > > > dounmount() would call before the while() loop for the > > > > > > forced dismount case? > > > > > > (The default would be a no-op and I have no idea if any > > > > > > file system other than NFS would have a use for it?) > > > > > > Alternately, there could be a function pointer set > > > > > > non-NULL > > > > > > that would specifically be used by the NFS client for > > > > > > this. > > > > > > This would avoid adding a vfs_XXX() method, but would > > > > > > mean > > > > > > an NFS specific call ends up in the generic dounmount() > > > > > > code. > > > > > > > > > > > > Anyone have comments on this? > > > > > > > > > > > Yes, I do. I agree with adding the pre-unmount vfs method. > > > > > This seems to be the cleanest solution possible. > > > > > > > > > I've attached a patch. It is also at > > > > http://people.freebsd.org/~rmacklem/forced-dism.patch > > > > in case the attachment gets lost. > > > > I don't really like doing the MNT_IUNLOCK(), MNT_ILOCK() > > > > before/after > > > > the VFS_KILLIO() call, but I couldn't see any better way to do > > > > it > > > > and > > > > it looks safe to do so, at least for the forced case. > > > Might be, call it VFS_PURGE() ? > > > > > Sure, any name is fine with me. > > > > > I suggest to move the call to the VFS_KILLIO after the > > > MNTK_DRAINING > > > is > > > set, to avoid getting new references after the current i/o > > > transactions > > > are stopped. You would need to set MNTK_DRAINING unconditionally. > > > Also, > > > it probably makes sense to replace the if (mnt_lockref) with > > > while > > > (). > > > > > Hmm. When I look at the code, the only use of MNTK_DRAINING seems > > to > > be to tell vfs_unbusy() to do a wakeup() if mnt_lockref == 0. I > > don't > > see why setting it before VFS_PURGE() would matter? > > > > Let me explain what the NFS client does: > > If is sees MNTK_UNMOUNTF set, it fails VOP/VFS calls without > > attempting > > any RPCs. That's why I needed MNTK_UNMOUNTF set before the > > VFS_PURGE() > > call. The VFS_PURGE() call causes any RPC that is already in > > progress to > > fail (by closing the connection to the server). If there is a case > > where > > an RPC attempt can get stuck after this point, it's a bug in the > > NFS client > > I will need to find;-) > > --> Once MNTK_UNMOUNTF is set and VFS_PURGE() is called, all > > VOP/VFS ops > > should return failure without attempting to do RPCs against the > > server. > > If some thread does vfs_busy() while VFS_PURGE() is in progress or > > before > > (any time MNT_ILOCK() isn't held) it should end up doing a > > vfs_unbusy() at > > some point without getting stuck trying to do an RPC against the > > server. > > If this happens before dounmount() re-acquires the MNT_ILOCK(), it > > should be ok, > > since mnt_lockref has been decremented. > > If it does this after the dounmount() thread re-acquires > > MNT_ILOCK(), dounmount() > > should be in the msleep() with MNTK_DRAINING set, so it will get > > the wakeup > > once mnt_lockref has decremented to 0. > > > > Setting MNTK_DRAINING sooner would just result in the odd > > unnecessary wakeup(), > > from what I can see? > > > Hm, I mis-remembered the vfs_busy() code. Yes, I agree with you. > > > > > > > > > I assume I would also need to bump __FreeBSD_version (and maybe > > > > VFS_VERSION?). > > > I think you could avoid it. > > > > > Do you mean I don't need to bump __FreeBSD_version or VFS_VERSION > > or both? > I do not see much sense in bumping either of them. > You might want to bump __FreeBSD_version when merging to stable. > Ok, thanks. I'll consider the code reviewed by you unless I here otherwise from you. rick