Date: Fri, 30 Sep 2011 06:25:25 -0700 From: Kirk McKusick <mckusick@mckusick.com> To: Attilio Rao <attilio@freebsd.org> Cc: Garrett Cooper <yanegomi@gmail.com>, freebsd-fs@freebsd.org, Konstantin Belousov <kib@freebsd.org>, Xin LI <delphij@freebsd.org> Subject: Re: Need to force sync(2) before umounting UFS1 filesystems? Message-ID: <201109301325.p8UDPPG2072859@chez.mckusick.com> In-Reply-To: <CAJ-FndBUhzBH3KaaDBvZk3OnNifLd83NedYLPvmLHTL7PTCmCA@mail.gmail.com>
next in thread | previous in thread | raw e-mail | index | archive | help
> Date: Thu, 29 Sep 2011 18:13:34 +0200 > Subject: Re: Need to force sync(2) before umounting UFS1 filesystems? > From: Attilio Rao <attilio@freebsd.org> > To: Kirk McKusick <mckusick@mckusick.com>, > Konstantin Belousov <kib@freebsd.org> > Cc: Garrett Cooper <yanegomi@gmail.com>, freebsd-fs@freebsd.org, > Xin LI <delphij@freebsd.org> > > 2011/9/29 Kirk McKusick <mckusick@mckusick.com>: > > > I am definitely not in a rush on this, so by all means take some time > > to look it over. The EBUSY unmount has been in its current state > > for several years, so I am fine with taking a few weeks to sort out > > the correct solution. Indeed, I am glad that Garrett has volunteered > > to do some more serious testing. > > > > If this general approach is not correct, I can put a hook in for just > > UFS so that it can have its historic behavior. As you have noted, the > > SU code has a lot of activity that gets done under the protection of > > vfs_busy. So it may be the only filesystem for which draining the > > vfs_busy lock during unmount is needed. > > Honestly, my first thought was exactly that -- an option that was > forcing the unmount sleep if SU is compiled in the kernel. The above would solve the problem at hand (NanoBSD builds). But I believe that this issue can arise with any filesystem that has background behavior such as ZFS and NFS. It is just most evident with UFS using SU. > When you mention 'historical behaviour' you mean the behaviour UFS had > even prior the introduction of the 'complete' VFS layer or it was the > behaviour unmount(2) was expected to implemented since the beginning? Synchronous unmount has exisited since at least the UNIX versions that I used in the 1970's. > My guess is that recent SU improvement by you and Jeff may have lead > to higher vfs_busy() contention, thus making this behaviour just more > visible. You are correct. SU has a lot of background activity. And as memories have grown bigger, the backlog has been able to get bigger and hence more noticable. On a busy system I have measured over fourty calls to "sync; sleep 1; umount" before the filesystem would finally unmount successfully. > BTW, I'm afraid the forced unmount case may have a possible deadlock. > > thread1 is doing whatever codepath it wants and thread2 is doing > unmount (forced right now): > > thread1::vfs_busy() > thread2::lock coveredvnode > thread1::contests coveredvnode > thread2::sleep because of thread1 vfs_busy I agree that the above deadlock is possible. See suggested solution below. > I think this deadlock was actually possible even with the old code, it > was just a LOR between a vnode lock and mount lock. Yes, this is a long-standing issue. > I'm not sure if there was any invariant I discussed with Kostik in the > past, preventing one way or another I'm forgetting about, but it seems > a possible deadlock to me. > > If you see this issue I'll make a patch for it. > > Attilio Here is my proposed fix. It does the unroll originally found in the non-FORCE case before sleeping waiting for the vfs_busy to clear. Is it acceptable to hold the mount mutex while calling VOP_UNLOCK? If not, then it needs to be released before the unlock, reacquired afterwards, and the check to see if the sleep is needed redone. Index: vfs_mount.c =================================================================== --- vfs_mount.c (revision 225881) +++ vfs_mount.c (working copy) @@ -1187,6 +1187,7 @@ mtx_assert(&Giant, MA_OWNED); +top: if ((coveredvp = mp->mnt_vnodecovered) != NULL) { mnt_gen_r = mp->mnt_gen; VI_LOCK(coveredvp); @@ -1227,21 +1228,19 @@ mp->mnt_kern_flag |= MNTK_UNMOUNTF; error = 0; if (mp->mnt_lockref) { - if ((flags & MNT_FORCE) == 0) { - mp->mnt_kern_flag &= ~(MNTK_UNMOUNT | MNTK_NOINSMNTQ | - MNTK_UNMOUNTF); - if (mp->mnt_kern_flag & MNTK_MWAIT) { - mp->mnt_kern_flag &= ~MNTK_MWAIT; - wakeup(mp); - } - MNT_IUNLOCK(mp); - if (coveredvp) - VOP_UNLOCK(coveredvp, 0); - return (EBUSY); + mp->mnt_kern_flag &= ~(MNTK_UNMOUNT | MNTK_NOINSMNTQ | + MNTK_UNMOUNTF); + if (mp->mnt_kern_flag & MNTK_MWAIT) { + mp->mnt_kern_flag &= ~MNTK_MWAIT; + wakeup(mp); } + if (coveredvp) + VOP_UNLOCK(coveredvp, 0); mp->mnt_kern_flag |= MNTK_DRAINING; error = msleep(&mp->mnt_lockref, MNT_MTX(mp), PVFS, "mount drain", 0); + MNT_IUNLOCK(mp); + goto top; } MNT_IUNLOCK(mp); KASSERT(mp->mnt_lockref == 0, Does this seem like the correct fix to you? Kirk McKusick
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?201109301325.p8UDPPG2072859>