From: Rick Macklem
To: Kostik Belousov
Cc: Jeremiah Lott, freebsd-net@freebsd.org, John Baldwin, rmacklem@freebsd.org
Date: Fri, 22 Jul 2011 09:44:42 -0400 (EDT)
Subject: Re: LOR with nfsclient "sillyrename"
Message-ID: <1600540075.883123.1311342282234.JavaMail.root@erie.cs.uoguelph.ca>
In-Reply-To: <20110722131159.GR17489@deviant.kiev.zoral.com.ua>

Kostik Belousov wrote:
> On Fri, Jul 22, 2011 at 08:55:10AM -0400, John Baldwin wrote:
> > On Thursday, July 21, 2011 4:19:59 pm Jeremiah Lott wrote:
> > > We're
> > > seeing nfsclient deadlocks with what looks like lock order
> > > reversal after removing a "silly rename". It is fairly rare, but
> > > we've seen it happen a few times. I included relevant back traces
> > > from an occurrence. From what I can see, nfs_inactive() is called
> > > with the vnode locked. If there is a silly-rename, it will call
> > > vrele() on its parent directory, which can potentially try to lock
> > > the parent directory. Since this is the opposite order of the lock
> > > acquisition in lookup, it can deadlock. This happened in a FreeBSD7
> > > build, but I looked through freebsd head and didn't see any change
> > > that addressed this. Anyone seen this before?
> >
> > I haven't seen this before, but your analysis looks correct to me.
> >
> > Perhaps the best fix would be to defer the actual freeing of the
> > sillyrename to an asynchronous task? Maybe something like this
> > (untested, uncompiled):
> >
> > Index: nfsclient/nfsnode.h
> > ===================================================================
> > --- nfsclient/nfsnode.h (revision 224254)
> > +++ nfsclient/nfsnode.h (working copy)
> > @@ -36,6 +36,7 @@
> >  #ifndef _NFSCLIENT_NFSNODE_H_
> >  #define _NFSCLIENT_NFSNODE_H_
> >
> > +#include
> >  #if !defined(_NFSCLIENT_NFS_H_) && !defined(_KERNEL)
> >  #include
> >  #endif
> > @@ -45,8 +46,10 @@
> >   * can be removed by nfs_inactive()
> >   */
> >  struct sillyrename {
> > +        struct task s_task;
> >          struct ucred *s_cred;
> >          struct vnode *s_dvp;
> > +        struct vnode *s_vp;
> >          int (*s_removeit)(struct sillyrename *sp);
> >          long s_namlen;
> >          char s_name[32];
> > Index: nfsclient/nfs_vnops.c
> > ===================================================================
> > --- nfsclient/nfs_vnops.c (revision 224254)
> > +++ nfsclient/nfs_vnops.c (working copy)
> > @@ -1757,7 +1757,6 @@
> >  {
> >          /*
> >           * Make sure that the directory vnode is still valid.
> > -         * XXX we should lock sp->s_dvp here.
> >           */
> >          if (sp->s_dvp->v_type == VBAD)
> >                  return (0);
> > @@ -2754,8 +2753,10 @@
> >              M_NFSREQ, M_WAITOK);
> >          sp->s_cred = crhold(cnp->cn_cred);
> >          sp->s_dvp = dvp;
> > +        sp->s_vp = vp;
> >          sp->s_removeit = nfs_removeit;
> >          VREF(dvp);
> > +        vhold(vp);
> >
> >          /*
> >           * Fudge together a funny name.
> > Index: nfsclient/nfs_node.c
> > ===================================================================
> > --- nfsclient/nfs_node.c (revision 224254)
> > +++ nfsclient/nfs_node.c (working copy)
> > @@ -47,6 +47,7 @@
> >  #include
> >  #include
> >  #include
> > +#include
> >  #include
> >
> >  #include
> > @@ -185,6 +186,26 @@
> >          return (0);
> >  }
> >
> > +static void
> > +nfs_freesillyrename(void *arg, int pending)
> > +{
> > +        struct sillyrename *sp;
> > +
> > +        sp = arg;
> > +        vn_lock(sp->s_dvp, LK_SHARED | LK_RETRY);
> I think taking an exclusive lock is somewhat more clean.
>
> > +        vn_lock(sp->s_vp, LK_EXCLUSIVE | LK_RETRY);
> I believe that you have to verify that at least dvp is not doomed.
>
> Due to this, I propose to only move the vrele() call to taskqueue.

Yes. I was thinking that it would be simpler (and I'm a chicken about
changing more than I have to for these kinds of things:-) to just
defer the vrele().

I wasn't sure that holding onto "vp" when it was being recycled was
such a good plan, although I'm not saying it would actually break
anything. (As I understand it, VOP_INACTIVE() sometimes gets delayed
until just before VOP_RECLAIM() and doing a VHOLD(vp) in there just
seems like it's asking for trouble?;-)

I'll post with a patch, once I've tested something.
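
For illustration only, here is a minimal userland sketch in C of the
"defer just the vrele()" idea. It is not the kernel patch and it is
single-threaded, so it only models the ordering rule: the parent is
never locked while the child is locked. Everything in it is a
hypothetical stand-in: struct node for a vnode, node_release() for
vrele(), child_inactive() for nfs_inactive(), and drain_deferred()
for a taskqueue handler.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a vnode: a lock flag and a use count. */
struct node {
        int locked;       /* 1 while the "vnode lock" is held */
        int usecount;     /* references, as with vref()/vrele() */
        int inactivated;  /* set once the last reference is dropped */
};

/* Fixed-size deferred-release queue, standing in for a taskqueue. */
#define MAXDEFER 8
static struct node *deferred[MAXDEFER];
static size_t ndeferred;

/* Number of node locks currently held by this (single) context. */
static int locks_held;

static void
node_lock(struct node *n)
{
        assert(!n->locked);
        n->locked = 1;
        locks_held++;
}

static void
node_unlock(struct node *n)
{
        assert(n->locked);
        n->locked = 0;
        locks_held--;
}

/* Like vrele(): dropping the last reference has to lock the node. */
static void
node_release(struct node *n)
{
        if (--n->usecount == 0) {
                node_lock(n);
                n->inactivated = 1;
                node_unlock(n);
        }
}

/*
 * Called with the child locked. Instead of releasing (and possibly
 * locking) the parent here, queue the release for later.
 * Returns 0 on success, -1 if the queue is full.
 */
static int
child_inactive(struct node *child, struct node *parent)
{
        assert(child->locked);
        if (ndeferred == MAXDEFER)
                return (-1);
        deferred[ndeferred++] = parent;
        return (0);
}

/* Runs later, from a context that holds no node locks. */
static void
drain_deferred(void)
{
        assert(locks_held == 0);
        while (ndeferred > 0)
                node_release(deferred[--ndeferred]);
}
```

A caller would lock the child, call child_inactive(), unlock the
child, and only then call drain_deferred(); the parent's lock is
taken inside node_release() with nothing else held. In the kernel,
the drain step would presumably run from a taskqueue thread instead.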
> > +        (void)nfs_vinvalbuf(ap->a_vp, 0, td, 1);
> > +        /*
> > +         * Remove the silly file that was rename'd earlier
> > +         */
> > +        (sp->s_removeit)(sp);
> > +        crfree(sp->s_cred);
> > +        vput(sp->s_dvp);
> > +        VOP_UNLOCK(sp->s_vp, 0);
> > +        vdrop(sp->s_vp);
> > +        free((caddr_t)sp, M_NFSREQ);
> > +}
> > +
> >  int
> >  nfs_inactive(struct vop_inactive_args *ap)
> >  {
> > @@ -200,15 +221,9 @@
> >          } else
> >                  sp = NULL;
> >          if (sp) {
> > +                TASK_INIT(&sp->task, 0, nfs_freesillyrename, sp);
> > +                taskqueue_enqueue(taskqueue_thread, &sp->task);
> >                  mtx_unlock(&np->n_mtx);
> > -                (void)nfs_vinvalbuf(ap->a_vp, 0, td, 1);
> > -                /*
> > -                 * Remove the silly file that was rename'd earlier
> > -                 */
> > -                (sp->s_removeit)(sp);
> > -                crfree(sp->s_cred);
> > -                vrele(sp->s_dvp);
> > -                free((caddr_t)sp, M_NFSREQ);
> >                  mtx_lock(&np->n_mtx);
> >          }
> >          np->n_flag &= NMODIFIED;
> >

Thanks everyone, for the helpful suggestions, rick