From: Rick Macklem
Date: Sat, 12 Jun 2010 11:15:49 -0400 (EDT)
To: Kostik Belousov
In-Reply-To: <20100612141549.GM13238@deviant.kiev.zoral.com.ua>
References: <20100606144443.GA50876@emmi.physik-pool.tu-berlin.de> <8639wsk4t1.fsf@kopusha.home.net>
 <20100612141549.GM13238@deviant.kiev.zoral.com.ua>
Cc: Leon Meßner, freebsd-stable@freebsd.org, Mikolaj Golub
Subject: Re: Re: freeBSD nullfs together with nfs and "silly rename"

On Sat, 12 Jun 2010, Kostik Belousov wrote:

> On Sat, Jun 12, 2010 at 11:56:10AM +0300, Mikolaj Golub wrote:
>>
>> On Sun, 6 Jun 2010 16:44:43 +0200 Leon Meßner wrote:
>>
>> LM> Hi,
>> LM> I hope this is not the wrong list to ask. Didn't get any answers on
>> LM> -questions.
>>
>> LM> When you try to do the following inside a nullfs mounted directory,
>> LM> where the nullfs origin is itself mounted via nfs you get an error:
>>
>> LM> # foo
>> LM> # tail -f foo&
>> LM> # rm -f foo
>> LM> tail: foo: Stale NFS file handle
>> LM> # fg
>>
>> LM> This is really a problem when running services inside jails and using
>> LM> NFS as storage. As of [2] it looks like this problem has been known for a
>> LM> while. On a normal NFS mount this does not happen as "silly renaming"
>> LM> [1] works there (producing nasty little .nfsXXXX files).
>>
>> nfs_sillyrename() is called when the vnode's usecount is more than 1. It is
>> expected that the unlink() syscall increases the vnode's usecount in namei(),
>> so if the file has already been opened the usecount will be more than 1.
>>
>> But with the nullfs layer present the reference counts are held by the upper
>> node, not the lower (nfs) one, so when unlink() is called it increases the
>> usecount of the upper vnode, not the nfs vnode, and nfs_sillyrename() is
>> never called.
>>
>> The straightforward solution seems to be to implement null_remove(), which
>> will increase the lower vnode's refcount before calling null_bypass() and
>> then decrement it after the call.
>> See the attached patch (it works for me on both
>> 8-STABLE and CURRENT).
>
> The upper vnode holds a reference to the lower vnode, as you noted.
> Now, with your patch, I believe that _all_ calls to the nfs_remove()
> happen with refcount > 1.
>

I'm not familiar with nullfs, so this might be way off, but would
this patch be OK by any chance?

Index: sys/fs/nullfs/null_vnops.c
===================================================================
--- sys/fs/nullfs/null_vnops.c	(revision 208960)
+++ sys/fs/nullfs/null_vnops.c	(working copy)
@@ -499,6 +499,23 @@
 }
 
 /*
+ * Increasing refcount of lower vnode is needed at least for the case
+ * when lower FS is NFS to do sillyrename if the file is in use.
+ */
+static int
+null_remove(struct vop_remove_args *ap)
+{
+	int retval;
+	struct vnode *lvp;
+
+	if (ap->a_vp->v_usecount > 1) {
+		lvp = NULLVPTOLOWERVP(ap->a_vp);
+		VREF(lvp);
+	} else
+		lvp = NULL;
+	retval = null_bypass(&ap->a_gen);
+	if (lvp != NULL)
+		vrele(lvp);
+	return (retval);
+}
+
+/*
  * We handle this to eliminate null FS to lower FS
  * file moving. Don't know why we don't allow this,
  * possibly we should.
@@ -809,6 +826,7 @@
 	.vop_open =		null_open,
 	.vop_print =		null_print,
 	.vop_reclaim =		null_reclaim,
+	.vop_remove =		null_remove,
 	.vop_rename =		null_rename,
 	.vop_setattr =		null_setattr,
 	.vop_strategy =		VOP_EOPNOTSUPP,
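
To make the reference-count argument in this thread easier to follow, here
is a small userspace sketch. It is not kernel code, only a toy model. It
assumes what the thread describes: the nfs side does a silly rename only
when the lower vnode's usecount is above 1, the nullfs node holds exactly
one reference on the lower vnode, and unlink() plus each open file hold
references on the upper vnode. All toy_* names are hypothetical and exist
only for this illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a vnode: a use count and, for a nullfs node, its lower vnode. */
struct toy_vnode {
	int usecount;
	struct toy_vnode *lower;	/* NULL for the (toy) nfs vnode itself */
};

/* Assumed nfs behaviour from the thread: silly rename only if usecount > 1. */
static bool
toy_nfs_would_sillyrename(const struct toy_vnode *nfsvp)
{
	return (nfsvp->usecount > 1);
}

enum toy_policy {
	TOY_BYPASS,		/* no null_remove(): plain bypass */
	TOY_ALWAYS_REF,		/* always reference the lower vnode */
	TOY_REF_IF_BUSY		/* reference it only if the upper usecount > 1 */
};

/* Models the remove path; returns whether the toy nfs layer would silly rename. */
static bool
toy_null_remove(struct toy_vnode *uvp, enum toy_policy policy)
{
	struct toy_vnode *lvp = NULL;
	bool silly;

	if (policy == TOY_ALWAYS_REF ||
	    (policy == TOY_REF_IF_BUSY && uvp->usecount > 1)) {
		lvp = uvp->lower;
		lvp->usecount++;	/* stands in for VREF(lvp) */
	}
	silly = toy_nfs_would_sillyrename(uvp->lower);
	if (lvp != NULL)
		lvp->usecount--;	/* stands in for vrele(lvp) */
	return (silly);
}

/*
 * State at unlink() time in this model: one reference on the upper vnode
 * from the lookup, one more per open file, and a single reference on the
 * lower vnode held by the nullfs node.
 */
static bool
toy_unlink(int open_count, enum toy_policy policy)
{
	struct toy_vnode nfsvp = { .usecount = 1, .lower = NULL };
	struct toy_vnode nullvp = { .usecount = 1 + open_count, .lower = &nfsvp };

	return (toy_null_remove(&nullvp, policy));
}
```

In this model, toy_unlink(1, TOY_BYPASS) reports no silly rename even
though the file is open, which is the reported failure. TOY_ALWAYS_REF
reports a silly rename even with no opens, which is the concern raised
about the unconditional VREF(). TOY_REF_IF_BUSY reports a silly rename
only when the file is actually open. Whether the real kernel counts behave
exactly like this is an assumption of the sketch.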