From owner-freebsd-hackers@FreeBSD.ORG  Sun Jan 15 22:45:21 2006
From: Frank Mayhar <frank@exit.com>
To: hackers@freebsd.org
In-Reply-To: <1137357602.1362.23.camel@realtime.exit.com>
References: <1137357602.1362.23.camel@realtime.exit.com>
Content-Type: text/plain
Content-Transfer-Encoding: 7bit
Organization: Exit Consulting
Date: Sun, 15 Jan 2006 14:45:11 -0800
Message-Id: <1137365111.1942.18.camel@realtime.exit.com>
Mime-Version: 1.0
X-Mailer: Evolution 2.4.1 FreeBSD GNOME Team Port 
Cc: FreeBSD-Current <current@freebsd.org>
Subject: Re: Panic in nfs_putpages() on 6-stable, more info.
Reply-To: frank@exit.com

A bit more data and another question.

On Sun, 2006-01-15 at 12:40 -0800, Frank Mayhar wrote:
> In nfs_reclaim(), just before he calls vnode_destroy_vobject(), he
> zfrees and clears vp->v_data.  When, down in the guts of vm_object.c, he
> tries to flush the associated pages, v_data is already NULL so he goes
> boom.
> 
> Now, why does he do the zfree/clear before vnode_destroy_vobject()?  Is
> he assuming that there are no pages associated with this vnode that need
> to be flushed?  Should there be? I looked at some other file systems and
> they do the same thing.  The obvious fix is to move the zfree/clear to
> after the vnode_destroy_vobject() but if there should be no pages that
> need to be flushed on the vnode at this point, that would just hide the
> problem.

Looking further down, at vlrureclaim(), I see that the commentary for
vlrureclaim() specifically says that a flushed vnode may still have
backing store, so it appears that yes, there may be pages associated
with the vnode when he calls vgonel().  Between vgonel() and
nfs_reclaim() there's just VOP stuff, so the flushing has to be done
lower down.  The nfs_reclaim() routine itself just does some bookkeeping
and then calls vnode_destroy_vobject().  That routine can push pages
out, which means that if the backing store is on NFS, nfs_putpages() can
be called.  But that routine will fault because he'll try to use v_data
as an nfsnode.

The reason for my confusion is that of the filesystems in the tree, the
only one that doesn't zfree and clear v_data before calling
vnode_destroy_vobject() is UFS.  The commentary in ufs_reclaim() is
clear, though:

        /*
         * Destroy the vm object and flush associated pages.
         */
        vnode_destroy_vobject(vp);

Then later he does a VI_LOCK() and clears v_data.  (And [indirectly] does the
zfree only _after_ that, which is interesting but probably not
important.)

I'm going to go slightly out on a limb here and guess that the "flush
associated pages" thing came in relatively recently and the other
filesystems haven't caught up with it.  This implies that the proper fix
is to go through those other xxx_reclaim() routines and reorder the
operations.
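
To make the ordering question concrete, here is a minimal user-space
sketch of the two orderings.  None of it is code from the tree:
mock_vnode, mock_node, mock_putpages(), mock_destroy_vobject() and the
two reclaim functions are hypothetical stand-ins, and the -1 return
marks the spot where I assume the real kernel would fault on a NULL
v_data.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not the kernel structures. */
struct mock_node {
	int n_flags;
};

struct mock_vnode {
	void *v_data;		/* per-filesystem private data */
	int dirty_pages;	/* pages still waiting to be written */
	int flushed;		/* pages successfully pushed out */
};

/*
 * Stand-in for a putpages routine.  It needs v_data to do its work.
 * Returns -1 at the point where the kernel would dereference NULL.
 */
int
mock_putpages(struct mock_vnode *vp)
{
	struct mock_node *np = vp->v_data;

	if (np == NULL)
		return (-1);
	vp->flushed += vp->dirty_pages;
	vp->dirty_pages = 0;
	return (0);
}

/* Stand-in for destroying the VM object: flushes any dirty pages. */
int
mock_destroy_vobject(struct mock_vnode *vp)
{
	if (vp->dirty_pages > 0)
		return (mock_putpages(vp));
	return (0);
}

/* Free and clear v_data first, then destroy the object. */
int
reclaim_free_first(struct mock_vnode *vp)
{
	free(vp->v_data);
	vp->v_data = NULL;
	return (mock_destroy_vobject(vp));
}

/* Destroy the object (flushing pages) first, then free and clear. */
int
reclaim_flush_first(struct mock_vnode *vp)
{
	int error;

	error = mock_destroy_vobject(vp);
	free(vp->v_data);
	vp->v_data = NULL;
	return (error);
}
```

With dirty pages outstanding, reclaim_free_first() reaches
mock_putpages() with v_data already NULL, while reclaim_flush_first()
flushes and only then frees.  With no dirty pages both orderings behave
the same, which would explain why the problem does not show up every
time.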

That's easy enough to do, but I would like to make sure that my
understanding of this (and my guess) is correct and that I'm not wasting
my time.

Thanks!
-- 
Frank Mayhar frank@exit.com     http://www.exit.com/
Exit Consulting                 http://www.gpsclock.com/
                                http://www.exit.com/blog/frank/