Date: Sat, 12 Jun 2010 11:15:49 -0400 (EDT) From: Rick Macklem <rmacklem@uoguelph.ca> To: Kostik Belousov <kostikbel@gmail.com> Cc: Leon Me??ner <l.messner@physik.tu-berlin.de>, freebsd-stable@freebsd.org, Mikolaj Golub <to.my.trociny@gmail.com> Subject: Re: Re: freeBSD nullfs together with nfs and "silly rename" Message-ID: <Pine.GSO.4.63.1006121112150.4460@muncher.cs.uoguelph.ca> In-Reply-To: <20100612141549.GM13238@deviant.kiev.zoral.com.ua> References: <20100606144443.GA50876@emmi.physik-pool.tu-berlin.de> <8639wsk4t1.fsf@kopusha.home.net> <20100612141549.GM13238@deviant.kiev.zoral.com.ua>
next in thread | previous in thread | raw e-mail | index | archive | help
On Sat, 12 Jun 2010, Kostik Belousov wrote: > On Sat, Jun 12, 2010 at 11:56:10AM +0300, Mikolaj Golub wrote: >> >> On Sun, 6 Jun 2010 16:44:43 +0200 Leon Me??ner wrote: >> >> LM> Hi, >> LM> I hope this is not the wrong list to ask. Didn't get any answers on >> LM> -questions. >> >> LM> When you try to do the following inside a nullfs mounted directory, >> LM> where the nullfs origin is itself mounted via nfs you get an error: >> >> LM> # foo >> LM> # tail -f foo& >> LM> # rm -f foo >> LM> tail: foo: Stale NFS file handle >> LM> # fg >> >> LM> This is really a problem when running services inside jails and using >> LM> NFS as storage. As of [2] it looks like this problem is known for a >> LM> while. On a normal NFS mount this does not happen as "silly renaming" >> LM> [1] works there (producing nasty little .nfsXXXX files). >> >> nfs_sillyrename() is called when vnode's usecount is more then 1. It is >> expected that unlink() syscall increases vnode's usecount in namei() and if >> the file has been already opened usecount will be more then 1. >> >> But with nullfs layer present the reference counts are held by the upper node, >> not the lower (nfs) one, so when unlink() is called it increases usecount of >> the upper vnode, not nfs vnode and nfs_sillyrename() is never called. >> >> The strightforward solution looks like to implement null_remove() that will >> increase lower vnode's refcount before calling null_bypass() and then >> decrement it after the call. See the attached patch (it works for me on both >> 8-STABLE and CURRENT). > > The upper vnode holds a reference to the lower vnode, as you noted. > Now, with your patch, I believe that _all_ calls to the nfs_remove() > are happen with refcount > 1. > I'm not familiar with the nullfs so this might be way off, but would this patch be ok by any chance? Index: sys/fs/nullfs/null_vnops.c =================================================================== --- sys/fs/nullfs/null_vnops.c (revision 208960) +++ sys/fs/nullfs/null_vnops.c (working copy) @@ -499,6 +499,23 @@ } /* + * Increasing refcount of lower vnode is needed at least for the case + * when lower FS is NFS to do sillyrename if the file is in use. + */ +static int +null_remove(struct vop_remove_args *ap) +{ + int retval; + struct vnode *lvp; + + if (ap->a_vp->v_usecount > 1) { + lvp = NULLVPTOLOWERVP(ap->a_vp); + VREF(lvp); + } else + lvp = NULL; + retval = null_bypass(&ap->a_gen); + if (lvp != NULL) + vrele(lvp); + return (retval); +} + +/* * We handle this to eliminate null FS to lower FS * file moving. Don't know why we don't allow this, * possibly we should. @@ -809,6 +826,7 @@ .vop_open = null_open, .vop_print = null_print, .vop_reclaim = null_reclaim, + .vop_remove = null_remove, .vop_rename = null_rename, .vop_setattr = null_setattr, .vop_strategy = VOP_EOPNOTSUPP,
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?Pine.GSO.4.63.1006121112150.4460>