Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 15 Mar 2005 03:57:47 -0500 (EST)
From:      Jeff Roberson <jroberson@chesapeake.net>
To:        Matthew Dillon <dillon@apollo.backplane.com>
Cc:        arch@freebsd.org
Subject:   Re: Freeing vnodes.
Message-ID:  <20050315035032.T20708@mail.chesapeake.net>
In-Reply-To: <200503150729.j2F7Tdxl052412@apollo.backplane.com>
References:  <20050314213038.V20708@mail.chesapeake.net> <200503150729.j2F7Tdxl052412@apollo.backplane.com>

next in thread | previous in thread | raw e-mail | index | archive | help

On Mon, 14 Mar 2005, Matthew Dillon wrote:

> :I have a patch at http://www.chesapeake.net/~jroberson/freevnodes.diff
> :that allows us to start reclaiming vnodes from the free list and release
> :their memory.  It also changes the semantics of wantfreevnodes, and makes
> :getnewvnode() much prettier.
>
>     Hmm.   I'm not sure your logic is correct in this bit:
>
>     +	/*
>     +	 * We try to get half way to wantfreevnodes each time we run.  This
>     +	 * slows down the process slightly, giving vnodes a greater chance
>     +	 * of being lru'd to the back of the list.
>     +	 */
>     +	count = (freevnodes - wantfreevnodes) / 2;
>     +	for (; count > 0; count--) {
> 	    ...
>
>     This seems to be indicating that you are going to try to destroy
>     *MOST* of the vnodes on the free list.  freevnodes is typically
>     a fairly high number what wantfreevnodes is typically a very low
>     number, so this calculation is typically going to be, well, a big
>     number.
>
>     I think you did not intend this.  Didn't you just want to destroy
>     enough vnodes to have 'wantfreevnodes' worth of slop so getnewvnode()
>     could allocate new vnodes?  In that case the calculation would be:

On my system wantfreevnodes is at 2500.  Let's say I have 4500 free
vnodes.  4500 - 2500 = 2000.  Divide by 2 gives you 1000.  I don't think
you read the whole patch.


>
> 	count = numvnodes - desiredvnodes + wantfreevnodes;
>
> 	while (count > 0) {
> 		...
> 		physically free a vnode and reduce the numvnodes count
> 		...
> 		--count;
> 	}
>
> :This also removes the recycling from the getnewvnode() path.  Instead, it
> :is done by a new helper function that is called from vnlru_proc().  This
> :function just frees vnodes from the head of the list until we reach our
> :wantfreevnodes target.
> :
> :I haven't perf tested this yet, but I have a box that is doing a
> :buildworld with a fairly constant freevnodes count which shows that vnodes
> :are actually being uma_zfree'd.
> :
> :Comments?  Anyone willing to do some perf tests for me?
> :
> :Thanks,
> :Jeff
>
>     There is a second issue when you remove the recycling from the
>     getnewvnode() path.  Generally speaking when the system runs out of
>     vnodes you wind up with a LOT of processes sleeping in the getnewvnode()
>     procedure all at once.   You need to make very sure that you are actually
>     freeing up a sufficient number of vnodes to allow *all* the processes
>     waiting for a new vnode to proceed all at once.  This would argue
>     against only going 'half way' to the goal, and would also argue for
>     having a more dynamic goal that is based on the load on getnewvnode().
>
>     Even worse, since processes open and close files all the time, there
>     can be a HUGE load on getnewvnode() (think of cvsupd and find, or
>     a cvs update, etc...).  This load can easily outstrip vnlru_proc()'s
>     new ability to free vnodes and potentially cause a lot of unnecessarily
>     blockages.

We have one buf daemon, one page daemon, one syncer, one vnlru proc, etc.
In all these cases it would be nice if they gained new contexts when they
had a lot of work to do, but they don't, and it doesn't seem to be a huge
problem today.  On my system one vnlruproc easily keeps up with the job of
freeing free vnodes.  Remember these vnodes have no pages associated with
them, so at most you're freeing an inode for a deleted file, and in the
common case the whole operation runs on memory without blocking for io.

We presently single thread the most critical case, where we have no free
vnodes and are not allowed to allocate any more while we wait for
vnlru_proc() to do io on vnodes with cached pages to reclaim some.  I'm
not convinced this is a real problem.

>
>     I love the idea of being able to free vnodes in vnlru_proc() rather
>     then free-and-reuse them in allocvnode(), but I cannot figure out how
>     vnlru_proc() could possibly adapt to the huge load range that
>     getnewvnode() has to deal with.  Plus keep in mind that the vnodes
>     being reused at that point are basically already dead except for
>     the vgonel().
>
>     This brings up the true crux of the problem, where the true overhead
>     of reusing a vnode inline with the getnewvnode() call is... and that
>     is that vgonel() potentially has to update the related inode and could
>     cause an unrelated process to block inside getnewvnode().  But even

Yes, this is kind of gross, and would cause lock order problems except
that we LK_NOWAIT on the vn lock in vtryrecycle().  It'd be better if we
didn't try doing io on unrelated vnodes while this deep in the stack.

>     this case is a hard sell because the buffer cache for the filesystem
>     almost certainly has cached the inode's disk block and a stat() alone
>     does not actually dirty an inode, you usually have to read() from the
>     file too.  I would argue that if anything needs fixing here it would
>     be the 'late' inode sync.  [note: inode syncing can occur in INACTIVE
>     as well as RECLAIM, and I'm not sure which routine it occurs in
>     'most of the time'].
>
> 					-Matt
> 					Matthew Dillon
> 					<dillon@backplane.com>
>



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20050315035032.T20708>