From owner-freebsd-current@FreeBSD.ORG Tue Jul 12 11:34:27 2011 Return-Path: Delivered-To: freebsd-current@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id BFAA9106566C for ; Tue, 12 Jul 2011 11:34:27 +0000 (UTC) (envelope-from okuno.kohji@jp.panasonic.com) Received: from smtp.mei.co.jp (smtp.mei.co.jp [133.183.100.20]) by mx1.freebsd.org (Postfix) with ESMTP id 6635F8FC08 for ; Tue, 12 Jul 2011 11:34:27 +0000 (UTC) Received: from mail-gw.jp.panasonic.com ([157.8.1.157]) by smtp.mei.co.jp (8.12.11.20060614/3.7W/kc-maile14) with ESMTP id p6CBYPVI014498; Tue, 12 Jul 2011 20:34:25 +0900 (JST) Received: from epochmail.jp.panasonic.com ([157.8.1.130]) by mail.jp.panasonic.com (8.11.6p2/3.7W/kc-maili16) with ESMTP id p6CBYQ330492; Tue, 12 Jul 2011 20:34:26 +0900 Received: by epochmail.jp.panasonic.com (8.12.11.20060308/3.7W/lomi11) id p6CBYQBY019571; Tue, 12 Jul 2011 20:34:26 +0900 Received: from localhost by lomi11.jp.panasonic.com (8.12.11.20060308/3.7W) with ESMTP id p6CBYPXu019551; Tue, 12 Jul 2011 20:34:25 +0900 Date: Tue, 12 Jul 2011 20:34:25 +0900 (JST) Message-Id: <20110712.203425.702773873874569878.okuno.kohji@jp.panasonic.com> To: kostikbel@gmail.com From: Kohji Okuno In-Reply-To: <20110712111925.GH43872@deviant.kiev.zoral.com.ua> References: <20110712.191028.650619413057975749.okuno.kohji@jp.panasonic.com> <20110712111925.GH43872@deviant.kiev.zoral.com.ua> Organization: Panasonic Corporation X-Mailer: Mew version 6.3 on Emacs 23.3 / Mule 6.0 (HANACHIRUSATO) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit Cc: freebsd-current@freebsd.org, okuno.kohji@jp.panasonic.com Subject: Re: Bug about devfs? X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 12 Jul 2011 11:34:27 -0000 Hello, From: Kostik Belousov Subject: Re: Bug about devfs? Date: Tue, 12 Jul 2011 14:19:25 +0300 Message-ID: <20110712111925.GH43872@deviant.kiev.zoral.com.ua> > Thank you for the report. > > The proposed change would revert r179247, which also caused some issues. > Are you able to reproduce the problem ? > > Could you try the following patch ? I cannot reproduce your situation, > so the patch is untested by me. Thank you for quick your comment. I think that your change is beter than mine. I will test it, and I will report the result. > On Tue, Jul 12, 2011 at 07:10:28PM +0900, Kohji Okuno wrote: >> Hello, >> >> I think that devfs has a problem. >> I encountered the problem that open("/dev/AAA") returned ENOENT. >> Of course, /dev/AAA exists. >> >> ENOENT was created by the point(***) in devfs_allocv(). >> I think that the race condition had occurred between process A and >> vnlru kernel thread. >> >> Please check the following. >> >> If vnlru set VI_DOOMED to vp->v_iflag but vnlru didn't still execute >> VOP_RECLAIM(), process A cat get valid vp from de->de_vnode. >> But, vget() will return ENOENT, because vp->v_iflag has VI_DOOMED. >> >> When I set the break point to (***), I checked that de->de_vnode and >> vp->v_data were NULL. >> >> >> process A: vnlru: >> >> devfs_allocv() { >> vgonel(vp) { >> ... ... >> vp->v_iflag |= VI_DOOMED; >> mtx_lock(&devfs_de_interlock); ... >> vp = de->de_vnode; >> if (vp != NULL) { VI_UNLOCK(vp); >> _____________/ ... >> VI_LOCK(vp); ____________/ if (VOP_RECLAIM(vp, td)) >> mtx_unlock(&devfs_de_interlock); ... >> ... \ devfs_reclaim(ap) { >> error = vget(vp,...); \ >> ... \______ mtx_lock(&devfs_de_interlock); >> if (devfs_allocv_drop_refs(...)) { de = vp->v_data; >> ... if (de != NULL) { >> } de->de_vnode = NULL; >> else if (error) { vp->v_data = NULL; >> ... } >> rturn (error); (***) mtx_unlock(&devfs_de_interlock); >> } ... >> } >> >> >> >> I think that devfs_allocv() should be fixed as below. >> How do you think? >> >> devfs_allocv(struct devfs_dirent *de, struct mount *mp, struct vnode **vpp) >> { >> int error; >> struct vnode *vp; >> struct cdev *dev; >> struct devfs_mount *dmp; >> >> dmp = VFSTODEVFS(mp); >> +#if 1 >> + retry: >> +#endif >> if (de->de_flags & DE_DOOMED) { >> >> ... >> >> mtx_lock(&devfs_de_interlock); >> vp = de->de_vnode; >> if (vp != NULL) { >> VI_LOCK(vp); >> mtx_unlock(&devfs_de_interlock); >> sx_xunlock(&dmp->dm_lock); >> error = vget(vp, LK_EXCLUSIVE | LK_INTERLOCK, curthread); >> sx_xlock(&dmp->dm_lock); >> if (devfs_allocv_drop_refs(0, dmp, de)) { >> if (error == 0) >> vput(vp); >> return (ENOENT); >> } >> else if (error) { >> +#if 1 >> + if (error == ENOENT) >> + goto retry; >> +#endif >> sx_xunlock(&dmp->dm_lock); >> return (error); >> } >> > Thank you for the report. > > The proposed change would revert r179247, which also caused some issues. > Are you able to reproduce the problem ? > > Could you try the following patch ? I cannot reproduce your situation, > so the patch is untested by me. > > diff --git a/sys/fs/devfs/devfs_vnops.c b/sys/fs/devfs/devfs_vnops.c > index bf6dab8..bbbfff4 100644 > --- a/sys/fs/devfs/devfs_vnops.c > +++ b/sys/fs/devfs/devfs_vnops.c > @@ -397,6 +397,7 @@ devfs_allocv(struct devfs_dirent *de, struct mount *mp, int lockmode, > sx_xunlock(&dmp->dm_lock); > return (ENOENT); > } > +loop: > DEVFS_DE_HOLD(de); > DEVFS_DMP_HOLD(dmp); > mtx_lock(&devfs_de_interlock); > @@ -412,7 +413,16 @@ devfs_allocv(struct devfs_dirent *de, struct mount *mp, int lockmode, > vput(vp); > return (ENOENT); > } > - else if (error) { > + else if (error != 0) { > + if (error == ENOENT) { > + mtx_lock(&devfs_de_interlock); > + while (de->de_vnode != NULL) { > + msleep(&de->de_vnode, > + &devfs_de_interlock, 0, "dvall", 0); > + } > + mtx_unlock(&devfs_de_interlock); > + goto loop; > + } > sx_xunlock(&dmp->dm_lock); > return (error); > } > @@ -1259,6 +1269,7 @@ devfs_reclaim(struct vop_reclaim_args *ap) > if (de != NULL) { > de->de_vnode = NULL; > vp->v_data = NULL; > + wakeup(&de->de_vnode); > } > mtx_unlock(&devfs_de_interlock); > > >