From owner-freebsd-fs@freebsd.org Sat Mar 12 03:44:16 2016 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id DD634ACD25C for ; Sat, 12 Mar 2016 03:44:16 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 89017C7C for ; Sat, 12 Mar 2016 03:44:16 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from tom.home (kostik@localhost [127.0.0.1]) by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id u2C3iB4g075678 (version=TLSv1 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Sat, 12 Mar 2016 05:44:11 +0200 (EET) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua u2C3iB4g075678 Received: (from kostik@localhost) by tom.home (8.15.2/8.15.2/Submit) id u2C3iA9t075677; Sat, 12 Mar 2016 05:44:10 +0200 (EET) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Sat, 12 Mar 2016 05:44:10 +0200 From: Konstantin Belousov To: Chris Torek Cc: freebsd-fs@freebsd.org Subject: Re: quotactl bug: vfs_busy never unbusy-es Message-ID: <20160312034410.GF1741@kib.kiev.ua> References: <20160310162916.GB1741@kib.kiev.ua> <201603112315.u2BNFsc0059323@elf.torek.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <201603112315.u2BNFsc0059323@elf.torek.net> User-Agent: Mutt/1.5.24 (2015-08-30) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.1 X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on tom.home X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 12 Mar 2016 03:44:17 -0000 On Fri, Mar 11, 2016 at 03:15:54PM -0800, Chris Torek wrote: > >Lets fix nullfs too. > > And unionfs? :-) > > >The uncomfortable issue with the interface is that > >it is not always possible to busy mount point after unbusy. > > Yes. > > >I propose to assume that for ENOENT error the busy ref is lost > >(this is the only possible error from vfs_busy()). > > This seems pretty scary. Also: > > >I also verified that UFS quota code > >does not return this error for !quotaon case. > > OK, but it certainly can for the quotaon case (though we're > protected by the fact that quotaon already had to do the unbusy > so it doesn't matter now). Anyway, we are getting kind of > complicated here. > > I think we can fix this with a fairly tiny KPI change. Note that > the code currently only works for UFS. Suppose we move the vfs_busy > to happen later (with potential failure, i.e., quotactl() call may > now return ENOENT because it can't busy the file system because > it's being unmounted -- this seems pretty harmless as it only > occurs in a race between quotactl() and unmount, and the caller > could have lost that race anyway). > > I'm not 100% sure this is correct, but it seems a bit less > scary to me :-) But note that it is COMPLETELY untested (not > even compiled), I'm just seeing if you think this is a reasonable > approach. Basically, we're just moving the vfs_busy/unbusy into > UFS itself, and permitting it to fail at that point (for all > ops, not just Q_QUOTAON). Yes, if it work out to pass only mnt_ref-referenced mount point down to the VFS_QUOTACTL method, I am fine with it. Also, I do not see why would it not work, in the sense that the failure modes due to parallel unmount are exactly the same for current code and what you propose. Please finish this. > > (Later we can refactor all this to be usable from, e.g., tmpfs.) > > Chris > > diff --git a/sys/kern/vfs_syscalls.c b/sys/kern/vfs_syscalls.c > index 11813fc..e1ab670 100644 > --- a/sys/kern/vfs_syscalls.c > +++ b/sys/kern/vfs_syscalls.c > @@ -173,6 +173,11 @@ sys_quotactl(td, uap) > AUDIT_ARG_UID(uap->uid); > if (!prison_allow(td->td_ucred, PR_ALLOW_QUOTAS)) > return (EPERM); > + /* > + * Reference the mount point so that it will remain in core > + * across the VFS_QUOTACTL call. We used to vfs_busy() it > + * here, but now we leave that to the underlying file system. > + */ > NDINIT(&nd, LOOKUP, FOLLOW | LOCKLEAF | AUDITVNODE1, UIO_USERSPACE, > uap->path, td); > if ((error = namei(&nd)) != 0) > @@ -181,25 +186,8 @@ sys_quotactl(td, uap) > mp = nd.ni_vp->v_mount; > vfs_ref(mp); > vput(nd.ni_vp); > - error = vfs_busy(mp, 0); > - vfs_rel(mp); > - if (error != 0) > - return (error); > error = VFS_QUOTACTL(mp, uap->cmd, uap->uid, uap->arg); > - > - /* > - * Since quota on operation typically needs to open quota > - * file, the Q_QUOTAON handler needs to unbusy the mount point > - * before calling into namei. Otherwise, unmount might be > - * started between two vfs_busy() invocations (first is our, > - * second is from mount point cross-walk code in lookup()), > - * causing deadlock. > - * > - * Require that Q_QUOTAON handles the vfs_busy() reference on > - * its own, always returning with ubusied mount point. > - */ > - if ((uap->cmd >> SUBCMDSHIFT) != Q_QUOTAON) > - vfs_unbusy(mp); > + vfs_rel(mp); > return (error); > } > > diff --git a/sys/ufs/ufs/ufs_vfsops.c b/sys/ufs/ufs/ufs_vfsops.c > index 5bb73ea..fc5f5bb 100644 > --- a/sys/ufs/ufs/ufs_vfsops.c > +++ b/sys/ufs/ufs/ufs_vfsops.c > @@ -92,9 +92,6 @@ ufs_quotactl(mp, cmds, id, arg) > void *arg; > { > #ifndef QUOTA > - if ((cmds >> SUBCMDSHIFT) == Q_QUOTAON) > - vfs_unbusy(mp); > - > return (EOPNOTSUPP); > #else > struct thread *td; > @@ -115,21 +112,24 @@ ufs_quotactl(mp, cmds, id, arg) > break; > > default: > - if (cmd == Q_QUOTAON) > - vfs_unbusy(mp); > return (EINVAL); > } > } > if ((u_int)type >= MAXQUOTAS) { > - if (cmd == Q_QUOTAON) > - vfs_unbusy(mp); > return (EINVAL); > } > > + /* > + * Make sure we're not unmounting. > + */ > + error = vfs_busy(mp); > + if (error) > + return (error); > + > switch (cmd) { > case Q_QUOTAON: > error = quotaon(td, mp, type, arg); > - break; > + goto done; /* quotaon does vfs_unbusy itself */ > > case Q_QUOTAOFF: > error = quotaoff(td, mp, type); > @@ -171,6 +171,8 @@ ufs_quotactl(mp, cmds, id, arg) > error = EINVAL; > break; > } > + vfs_unbusy(mp); > +done: > return (error); > #endif > }