Date:      Thu, 31 Jan 2002 16:59:23 -0800 (PST)
From:      Matthew Dillon <dillon@apollo.backplane.com>
To:        Peter Wemm <peter@wemm.org>
Cc:        cvs-committers@FreeBSD.ORG, cvs-all@FreeBSD.ORG
Subject:   Re: cvs commit: src/sys/kern vfs_subr.c 
Message-ID:  <200202010059.g110xNN79772@apollo.backplane.com>
References:   <20020201004018.857A63809@overcee.wemm.org>

:This is still insufficient FYI.  On one hand, the reclaim should skip
:vnodes with *any* dirty pages and let syncer deal with it.  On the other
:hand, that policy could be exploited.  But so can while(1) fork();
:
:At Yahoo, we were forced to hack around this problem.  We added a
:VNORECYCLE flag for *all* writeable MAP_SHARED file backed mmaps.  We could
:probably have used something with VOBJDIRTY instead.

    I don't see how it can be exploited.  We can't skip vnodes with
    any dirty pages... that is why kern.maxvnodes was being blown out
    on large-memory machines and why we had to implement the recycling
    code in the first place.

    The calculation it does guarantees that it will be able to find
    enough vnodes to recycle.  It's a simple calculation:  If you have
    X pages of memory and maxvnodes is Y, then any vnode with more than
    X/Y pages can be skipped while still guaranteeing that we will
    find enough vnodes to recycle to get us under the maxvnodes limit.

    The calculation I do adds a little '*2' slop, i.e. (X/Y)*2, to reduce
    the recycle code's workload, but it still seems quite reasonable.  If
    you think about it, a 4G machine has 1048576 (4K) pages.  If maxvnodes
    is 30,000, then the calculation result is 68 pages (1048576/30000 is
    34 in integer arithmetic, times 2).  Any vnode we encounter with more
    than 68 pages is skipped, any vnode we encounter with less than 68
    pages is recycled.
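
    As a rough illustration of the arithmetic above, here is a minimal
    C sketch.  The function names are hypothetical and this is not the
    code in vfs_subr.c; it assumes integer division, and it assumes that
    a vnode with exactly the threshold number of pages is not skipped,
    which the description above leaves open.

```c
/*
 * Hypothetical helper, not the actual vfs_subr.c code.
 * Returns the per-vnode page threshold (X / Y) * 2, where X is the
 * number of pages of memory and Y is maxvnodes.  Integer division
 * is assumed.
 */
unsigned long
vnode_skip_threshold(unsigned long total_pages, unsigned long maxvnodes)
{
	if (maxvnodes == 0)
		return (0);	/* avoid dividing by zero in this sketch */
	return ((total_pages / maxvnodes) * 2);
}

/*
 * Hypothetical predicate: nonzero if a vnode holding vnode_pages
 * pages is above the threshold and would be skipped by the recycler.
 * Treating "exactly at the threshold" as recyclable is an assumption.
 */
int
vnode_should_skip(unsigned long vnode_pages, unsigned long threshold)
{
	return (vnode_pages > threshold);
}
```

    With 1048576 pages and maxvnodes set to 30000, vnode_skip_threshold()
    returns 68, matching the figure above.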

    It ought to work great for everyone, including Yahoo.

:I'm still uncomfortable with the fact that this is a clock-hand style vnode
:reclaim, rather than an LRU reclaim.  ie: it keeps track of where it is up
:to as it walks gradually over the entire list rather than purging the oldest
:vnodes each time.  In normal operation it will recycle much newer vnodes than
:the oldest we have lying around.
:
:Cheers,
:-Peter
:--
:Peter Wemm - peter@FreeBSD.org; peter@yahoo-inc.com; peter@netplex.com.au

    I have LRU code in there ready to go (see the vnlruvp() procedure),
    but we can't use it until the filesystem SYNCing code is fixed.
    Right now if I turn on that procedure the filesystem SYNCing code
    goes from O(N) to O(N^2) due to loop restarts.
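
    To show how loop restarts can turn a linear walk into a quadratic
    one, here is a toy model in C.  It is only a sketch under stated
    assumptions, not the filesystem SYNCing code: it assumes that
    processing one entry forces the scan to start over from the head of
    the list, and that entries already processed must be stepped over
    again.  Both function names are hypothetical.

```c
/*
 * Toy model, not the real sync loop.  Counts how many list entries
 * are visited when the scan restarts from the head after each entry
 * is processed.  The result is n * (n + 1) / 2, i.e. O(N^2).
 */
unsigned long
visits_with_restart(unsigned long n)
{
	unsigned long done = 0, visits = 0, i;

	while (done < n) {
		/* Step over the 'done' entries, then reach the next one. */
		for (i = 0; i <= done; i++)
			visits++;
		done++;		/* one more entry processed; restart */
	}
	return (visits);
}

/*
 * The same walk with no restarts visits each entry once: O(N).
 */
unsigned long
visits_without_restart(unsigned long n)
{
	return (n);
}
```

    For 1000 entries the restarting walk makes 500500 visits in this
    model, against 1000 for the plain walk.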

					-Matt
					Matthew Dillon 
					<dillon@backplane.com>

