From owner-freebsd-current Thu Sep 12 6:50:48 2002 Delivered-To: freebsd-current@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id C08C837B400; Thu, 12 Sep 2002 06:50:43 -0700 (PDT) Received: from gw.catspoiler.org (217-ip-163.nccn.net [209.79.217.163]) by mx1.FreeBSD.org (Postfix) with ESMTP id E0ED443E42; Thu, 12 Sep 2002 06:50:42 -0700 (PDT) (envelope-from dl-freebsd@catspoiler.org) Received: from mousie.catspoiler.org (mousie.catspoiler.org [192.168.101.2]) by gw.catspoiler.org (8.12.5/8.12.5) with ESMTP id g8CDoXwr000721; Thu, 12 Sep 2002 06:50:38 -0700 (PDT) (envelope-from dl-freebsd@catspoiler.org) Message-Id: <200209121350.g8CDoXwr000721@gw.catspoiler.org> Date: Thu, 12 Sep 2002 06:50:33 -0700 (PDT) From: Don Lewis Subject: Re: nfs_inactive() bug? -> panic: lockmgr: locking against myself To: iedowse@maths.tcd.ie Cc: current@FreeBSD.ORG, jeff@FreeBSD.ORG In-Reply-To: <200209111224.aa37675@salmon.maths.tcd.ie> MIME-Version: 1.0 Content-Type: TEXT/plain; charset=us-ascii Sender: owner-freebsd-current@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On 11 Sep, Ian Dowse wrote: > In message <200209110658.g8B6w2wr097059@gw.catspoiler.org>, Don Lewis writes: >> >>A potentially better solution just occurred to me. It looks like it >>would be better if vrele() waited to decrement v_usecount until *after* >>the call to VOP_INACTIVE() (and after the call to VI_LOCK()). If that >>were done, nfs_inactive() wouldn't need to muck with v_usecount at all. > > I've looked at this before; I think some filesystems (ufs anyway) > depend on v_usecount being 0 when VOP_INACTIVE() is called. Yup, though it would easy to tweak ufs_inactive() to expect the v_usecount to be one greater. A bigger problem is the call to vrecycle() in ufs_inactive(). > The > patch I have had lying around for quite a while is below. It adds > a vnode flag to avoid recursion into the last-reference handling > code in vrele/vput, which is the real problem. > > It also guarantees that a vnode will not be recycled during > VOP_INACTIVE(), so the nfs code no longer needs to hold an extra > reference in the first place. The flag manipulation code got a bit > messy after Jeff's vnode flag splitting work, so the patch could > probably be improved. If the need for nfs_inactive() to manipulate the reference count is eliminated, that should be sufficient to eliminate the vrele/vput recursion problem as well. > Whatever way this is done, we should try to avoid adding more hacks > to the nfs_inactive() code anyway. Tweaking nfs_inactive() has the advantage of being the least intrusive fix for this problem, but it isn't architecturally clean. I'd also argue that decrementing the vnode reference count to zero, but holding on to the reference for an extended period of time while doing I/O on the vnode is also a kludge. It also causes problems that require kludgey fixes, like the reference count manipulation in nfs_inactive(). For the most part the VI_INACTIVE flag is just another way of indicating that we are holding a reference to the vnode, so both the flag and the reference count need to be checked to see if the vnode is in use. The advantage of adding this flag versus just bumping the reference count is that the flag allows us to avoid the bugs in the current implementation while not requiring changes to code that expects the reference count to be zero in code called from VOP_INACTIVE(). Here are the three proposed fixes along with their advantages and disadvantages: Inline the v_usecount decrement code in nfs_inactive() + Least intrusive fix for the recursion bug - Adds a kludge to a kludge Call VOP_INACTIVE() before decrementing the reference count and tweak the foo_inactive() methods to expect the reference count to be one instead of zero + Eliminates the kludge from nfs_inactive() + We aren't lying about holding a reference, so hopefully less potential for bugs - It is not obvious to how to adapt ufs_inactive() Add VI_INACTIVE flag to prevent the vnode from being recycled + Eliminates the kludge from nfs_inactive() + Doesn't require changes to the other filesystems - Code complexity - Programmer needs to know when to look at VI_INACTIVE and when looking at just the reference count is ok After looking at ufs_inactive(), I'd like to add a fourth proposal that would involve splitting VOP_INACTIVE() into two separate methods. The first method would be called before the reference count is decremented and would do any I/O. The second method would be called after the reference count is decremented and could only be used for freeing resources, manipulating flags, putting it on the free list with vrecycle(), etc. + Eliminates the kludge from nfs_inactive() + We don't lie about holding a reference while doing I/O - More extensive code changes - Bikeshed arguments about what to call the two methods To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-current" in the body of the message