From owner-freebsd-arch  Sat Feb 23  8:55:38 2002
Delivered-To: freebsd-arch@freebsd.org
Received: from gull.prod.itd.earthlink.net (gull.mail.pas.earthlink.net [207.217.120.84])
	by hub.freebsd.org (Postfix) with ESMTP id 774CC37B417
	for <arch@freebsd.org>; Sat, 23 Feb 2002 08:55:34 -0800 (PST)
Received: from pool0212.cvx22-bradley.dialup.earthlink.net ([209.179.198.212] helo=mindspring.com)
	by gull.prod.itd.earthlink.net with esmtp (Exim 3.33 #1)
	id 16efSI-0005nN-00; Sat, 23 Feb 2002 08:55:26 -0800
Message-ID: <3C77C972.598B4181@mindspring.com>
Date: Sat, 23 Feb 2002 08:55:14 -0800
From: Terry Lambert <tlambert2@mindspring.com>
X-Mailer: Mozilla 4.7 [en]C-CCK-MCD {Sony}  (Win98; U)
X-Accept-Language: en
MIME-Version: 1.0
To: Seigo Tanimura <tanimura@r.dl.itc.u-tokyo.ac.jp>
Cc: arch@FreeBSD.org
Subject: Re: reclaiming v_data of free vnodes
References: <200202231556.g1NFu9N9040749@silver.carrots.uucp.r.dl.itc.u-tokyo.ac.jp>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
List-ID: <freebsd-arch.FreeBSD.ORG>
List-Archive: <http://docs.freebsd.org/mail/> (Web Archive)
List-Help: <mailto:majordomo@FreeBSD.ORG?subject=help> (List Instructions)
List-Subscribe: <mailto:majordomo@FreeBSD.ORG?subject=subscribe%20freebsd-arch>
List-Unsubscribe: <mailto:majordomo@FreeBSD.ORG?subject=unsubscribe%20freebsd-arch>
X-Loop: FreeBSD.ORG

Seigo Tanimura wrote:
> which is almost the same as the number of the total vnodes. (in
> cvsup.jp.FreeBSD.org, almost all in-use vnodes are actually inodes)
> 
> This seems due to vrele() and vput() not calling VOP_RECLAIM(). One
> solution is to always reclaim a vnode in vrele()/vput(), while we can
> also run a kernel thread to scan the free vnodes and reclaim some of
> them. Which one would be better, or are there any other ways?
> 
> Any comments are welcome.

Taking the first quoted sentence, above, into account, it
seems that the SVR4 approach of giving the vnode to the FS
to manage, instead of using a system-wide pool, would be
the best approach.

That way the vnode and in core inode allocation are combined
(in fact, probably the same "malloc").  It also inherently
parallelizes allocation when two or more FS's are in use,
which in turn starts to work around one of the big problems
facing the concurrency push-down in SMPng.
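
As a rough user-space sketch of what combining the two allocations
could look like (not actual FreeBSD or SVR4 code): the struct layouts
are simplified stand-ins, and fs_node, fs_pool, fs_node_alloc(),
fs_vtoi() and fs_node_free() are hypothetical names made up for
illustration. Only v_data and v_usecount mirror real vnode fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-ins for the kernel structures (assumption). */
struct vnode {
	void	*v_data;	/* FS-private data, as in the real vnode */
	int	 v_usecount;
};

struct inode {
	unsigned long	i_number;
};

/* Hypothetical combined object: one allocation yields both halves. */
struct fs_node {
	struct vnode	fn_vnode;	/* must stay the first member */
	struct inode	fn_inode;
};

/* Hypothetical per-filesystem pool; each mounted FS would own one. */
struct fs_pool {
	size_t	fp_live;	/* nodes currently allocated from this pool */
};

/* Allocate the vnode and its in-core inode with a single calloc(). */
struct vnode *
fs_node_alloc(struct fs_pool *pool, unsigned long ino)
{
	struct fs_node *fn;

	assert(pool != NULL);
	fn = calloc(1, sizeof(*fn));
	if (fn == NULL)
		return (NULL);
	fn->fn_inode.i_number = ino;
	fn->fn_vnode.v_data = &fn->fn_inode;
	pool->fp_live++;
	return (&fn->fn_vnode);
}

/* Recover the inode from the vnode through v_data. */
struct inode *
fs_vtoi(struct vnode *vp)
{
	return (vp->v_data);
}

/* Release both halves together; they cannot be freed separately. */
void
fs_node_free(struct fs_pool *pool, struct vnode *vp)
{
	/* fn_vnode is the first member, so the two pointers coincide. */
	struct fs_node *fn = (struct fs_node *)vp;

	assert(pool != NULL && pool->fp_live > 0);
	pool->fp_live--;
	free(fn);
}
```

Because each FS would own its pool, two filesystems allocating at
the same time touch different bookkeeping, which is the
parallelization point above.  A real kernel version would need
locking and a zone allocator rather than calloc().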

The ihash cache really needs to go away -- or at least, it
needs to keep the page associations, so that clean pages
that are in core don't get faulted in again because of a
disassociated vnode/inode pair when the inode is reclaimed.
This is also at the root of your problem, where there is no
explicit reclaim possible because the vnode and inode are
not tightly coupled: you can free resources used by one and
not the other, but have no way to reclaim the relationship
from the vnode side (the inode side can reclaim the relationship,
but loses the cached page associations).

Frankly, it makes sense to me, even if the system retains
ownership of the vnode, to find a way to pass the vput/vrele
notification to the VFS via a VOP_VRELE (or something similar),
so that there is explicit notification.  This is pretty much
a requirement for more than trivial stacking, at some point,
unless you want to greatly complicate each of the stacking
layers.
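
A minimal user-space sketch of the kind of notification meant here,
under stated assumptions: vop_vrele is a hypothetical hook (no such
VOP exists in the tree), vrele_sketch() stands in for the real
vrele()/vput(), and firing the hook only when the last reference is
dropped is one possible policy, not the only one.

```c
#include <assert.h>
#include <stddef.h>

struct vnode;

/* Simplified ops vector; vop_vrele is a hypothetical, optional hook. */
struct vnodeops {
	void	(*vop_vrele)(struct vnode *vp);
};

/* Simplified stand-in for the kernel vnode (assumption). */
struct vnode {
	const struct vnodeops	*v_op;
	int			 v_usecount;
	void			*v_data;
};

/*
 * Drop one reference.  When the count reaches zero, tell the owning
 * FS explicitly so it can decide what to do with its private data.
 */
void
vrele_sketch(struct vnode *vp)
{
	assert(vp != NULL && vp->v_usecount > 0);
	if (--vp->v_usecount > 0)
		return;
	if (vp->v_op != NULL && vp->v_op->vop_vrele != NULL)
		vp->v_op->vop_vrele(vp);
}

/*
 * Example handler for a hypothetical FS: detach the private data.
 * A stacking layer could instead forward the call to the vnode
 * below it.
 */
void
examplefs_vrele(struct vnode *vp)
{
	vp->v_data = NULL;
}
```

With a hook like this the system can keep ownership of the vnode
while the FS still learns about the release, and each stacking layer
only has to forward one call.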

Combining would also help in the nfsnode cases, I think.

-- Terry
