From owner-freebsd-current@FreeBSD.ORG  Sun Dec 14 16:16:13 2003
Message-Id: <200312150015.hBF0FPeF066032@gw.catspoiler.org>
Date: Sun, 14 Dec 2003 16:15:25 -0800 (PST)
From: Don Lewis <truckman@FreeBSD.org>
To: jroberson@chesapeake.net
In-Reply-To: <20031214134007.F4201-100000@mail.chesapeake.net>
MIME-Version: 1.0
Content-Type: TEXT/plain; charset=us-ascii
cc: mckusick@mckusick.com
cc: alc@FreeBSD.org
cc: mb@imp.ch
cc: freebsd-current@FreeBSD.org
Subject: Re: HAVE TRACE & DDB Re: FreeBSD 5.2-RC1 released

On 14 Dec, Jeff Roberson wrote:
> 
> 
> On Sun, 14 Dec 2003, Jeff Roberson wrote:
> 
>>
>> On Sun, 14 Dec 2003, Jeff Roberson wrote:
>>
>> > On Sat, 13 Dec 2003, Don Lewis wrote:
>> >
>> > > On 13 Dec, Don Lewis wrote:
>> > > > On 12 Dec, Jeff Roberson wrote:
>> > > >
>> > > >
>> > > >> fsync: giving up on dirty: 0xc4e18000: tag devfs, type VCHR, usecount 44,
>> > > >> writecount 0, refcount 14, flags (VI_XLOCK|VV_OBJBUF), lock type devfs: EXCL
>> > > >> (count 1) by thread 0xc20ff500
>> > > >
>> > > > Why are we trying to reuse a vnode with a usecount of 44 and a refcount
>> > > > of 14?  What is thread 0xc20ff500 doing?
>> > >
>> > > Following up to myself ...
>> > >
>> > > It looks like we're trying to recycle this vnode because of the
>> > > following sysinstall code, in distExtractTarball():
>> > >
>> > >     if (is_base && RunningAsInit && !Fake) {
>> > >         unmounted_dev = 1;
>> > >         unmount("/dev", MNT_FORCE);
>> > >     } else
>> > >         unmounted_dev = 0;
>> > >
>> > > What happens if we forcibly unmount /dev while /dev/whatever holds a
>> > > mounted file system?  It looks like this is handled by vgonechrl().  It
>> > > looks to me like vclean() is going to do some scary stuff to this vnode.
>> > >
>> >
>> > Excellent work!  I think I may know what's wrong.  If you look at rev
>> > 1.461 of vfs_subr.c I changed the semantics of cleaning a VCHR that was
>> > being unmounted.  I now acquire the xlock around the operation.  This may
>> > be the culprit.  I'm too tired to debug this right now, but I can look at
>> > it in the am.
>> >
>>
>> OK, I think I understand what happens.  The syncer runs, and at the same
>> time, we're doing the forced unmount.  This causes the sync of the device
>> vnode to fail.  This isn't really a problem.  After this, while syncing
>> a ffs volume that is mounted on a VCHR from /dev, we bread() and get a
>> buffer for this device and then immediately block.  The forced unmount
>> then proceeds, calling vclean() on the device, which goes into the VM via
>> DESTROYVOBJECT.  The VM frees all of the pages associated with the object
>> etc.  Then, the ffs_update() is allowed to run again with a pointer to a
>> buffer that has pointers to pages that have been freed.  This is where
>> vfs_setdirty() comes in and finds a NULL object.
>>
>> The wired counts on the pages are 1, which is consistent with a page in
>> the bufcache.  Also the object is NULL which is the only indication we
>> have that this is a free page.
>>
>> I think that if we want to allow unmounting of the underlying device for
>> VCHR, we need to not call vclean() from vgonechrl().  We need to just lock,
>> VOP_RECLAIM, cache_purge(), and insmntque to NULL.
>>
>> I've looked through my changes here, and I don't see how I could have
>> introduced this bug.  We were vclean()ing before, and that seems to be the
>> main problem.  There have been some changes to device aliasing that could
>> have impacted this.  I'm trying to get the scoop from phk now.

I'm very suspicious of device aliasing.

>> I'm going to change the way vgonechrl() works, but I'd really like to know
>> what changed that broke this.
>>
> 
> Please test the patch at:
> http://www.chesapeake.net/~jroberson/forcevchr.diff

This looks about right.
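
For anyone following along, the interleaving Jeff describes can be modeled in
user space.  This is only a rough sketch under stated assumptions: the struct
and function names below (struct page, struct buf, model_bread(),
model_destroy_object(), and so on) are hypothetical stand-ins, not the real
kernel types or the contents of the patch.  It shows a buffer that still
points at wired pages after their object has been torn down, and the same
sequence when the teardown skips destroying the object.

```c
#include <assert.h>
#include <stddef.h>

#define NPAGES 4

/* Hypothetical stand-ins for kernel structures; not the real FreeBSD types. */
struct object { int id; };
struct page   { struct object *object; int wire_count; };
struct buf    { struct page *pages[NPAGES]; int npages; };

/* Models bread(): the buffer wires pages that belong to the device's object. */
void model_bread(struct buf *bp, struct page *pool, struct object *obj)
{
    bp->npages = NPAGES;
    for (int i = 0; i < NPAGES; i++) {
        pool[i].object = obj;
        pool[i].wire_count = 1;
        bp->pages[i] = &pool[i];
    }
}

/* Models the vclean() path: destroying the object detaches every page. */
void model_destroy_object(struct page *pool)
{
    for (int i = 0; i < NPAGES; i++)
        pool[i].object = NULL;  /* page looks free, yet is still wired */
}

/* Models the check that trips later: count buffer pages with no object. */
int model_count_orphans(const struct buf *bp)
{
    int orphans = 0;

    for (int i = 0; i < bp->npages; i++) {
        /* Wire count stays 1, consistent with a page in the buffer cache. */
        assert(bp->pages[i]->wire_count == 1);
        if (bp->pages[i]->object == NULL)
            orphans++;
    }
    return orphans;
}

/*
 * Runs the interleaving in a fixed order.  destroy_pages != 0 models the
 * teardown that goes through object destruction; 0 models a teardown that
 * leaves the object and its pages alone.
 */
int simulate_forced_unmount(int destroy_pages)
{
    struct object obj = { 1 };
    struct page pool[NPAGES];
    struct buf b;

    model_bread(&b, pool, &obj);      /* update path gets a buffer, then blocks */
    if (destroy_pages)
        model_destroy_object(pool);   /* forced unmount proceeds meanwhile */
    return model_count_orphans(&b);   /* update path resumes with stale pages */
}
```

Calling simulate_forced_unmount(1) reports every page in the buffer as
orphaned, which matches the "wired count of 1, NULL object" state seen in the
trace; simulate_forced_unmount(0) reports none.  The model is single-threaded
and only fixes the ordering of events, so it says nothing about the locking
the real change needs.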