From owner-freebsd-fs@FreeBSD.ORG  Tue Sep  3 19:08:37 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTP id A0604652;
 Tue,  3 Sep 2013 19:08:37 +0000 (UTC) (envelope-from jhb@freebsd.org)
Received: from bigwig.baldwin.cx (bigwig.baldwin.cx [IPv6:2001:470:1f11:75::1])
 (using TLSv1 with cipher ADH-CAMELLIA256-SHA (256/256 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id 66DA42FC2;
 Tue,  3 Sep 2013 19:08:37 +0000 (UTC)
Received: from jhbbsd.localnet (unknown [209.249.190.124])
 by bigwig.baldwin.cx (Postfix) with ESMTPSA id 21196B990;
 Tue,  3 Sep 2013 15:08:36 -0400 (EDT)
From: John Baldwin <jhb@freebsd.org>
To: freebsd-fs@freebsd.org
Subject: Re: Call fo comments - raising vfs.ufs.dirhash_reclaimage?
Date: Tue, 3 Sep 2013 15:07:32 -0400
User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p28; KDE/4.5.5; amd64; ; )
References: <kvkvi7$iv7$1@ger.gmane.org> <20130828181228.0d3618dd@ernst.home>
 <CAF-QHFU80YC3W-k+TKM=y3JiVYi=1fp5CJjbCCk1y0VKXzcRQg@mail.gmail.com>
In-Reply-To: <CAF-QHFU80YC3W-k+TKM=y3JiVYi=1fp5CJjbCCk1y0VKXzcRQg@mail.gmail.com>
MIME-Version: 1.0
Content-Type: Text/Plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Message-Id: <201309031507.33098.jhb@freebsd.org>
X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7
 (bigwig.baldwin.cx); Tue, 03 Sep 2013 15:08:36 -0400 (EDT)
Cc: freebsd-hackers <freebsd-hackers@freebsd.org>,
 Ivan Voras <ivoras@freebsd.org>
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 03 Sep 2013 19:08:37 -0000

On Wednesday, August 28, 2013 12:40:15 pm Ivan Voras wrote:
> On 28 August 2013 18:12, Gary Jennejohn <gljennjohn@googlemail.com> wrote:
> 
> > So, if I understand this correctly, a normal desktop user won't
> > notice any real change, except that buildworld might get faster,
> > and big servers will benefit?
> 
> Basically, yes, but read on...
> 
> > But could this negatively impact small, embedded systems, which
> > usually have only small memory footprints?  Although I suppose
> > one could argue that they usually don't have large numbers of
> > files cached in memory at any given time.
> 
> Unless I'm wrong, the only pathological case coming from this change
> would be the following sequence of events:
> 
> 1) Memory is scarce [*]
> 2) There's a sudden surge of requests for a huge number of different directories
> 3) There's an urgent lowmem event which is observed by dirhash, which
> attempts to free memory but is prevented in doing so for the next 60
> seconds because all entries are young (the idea behind dirhash being
> that if a directory is accessed, it will probably soon be accessed
> again - think "ls" then "fopen", so we won't evict him until
> reclaimage seconds)
> 4) the kernel runs out of memory, game over.

Just to play devil's advocate, the only way your change can benefit is
if:

1) Memory is scarce thus triggering a lowmem event
2) There are requests for a huge number of directories that haven't been
   accessed in over 5 seconds.

That is to say, what your change does is increase the relative importance
of dirhash memory relative to other memory in the machine when the machine
is under memory pressure.  If the machine is not under memory pressure then
the lowmem handler will not be triggered and your change will never matter.

Keep in mind that if pagedaemon is able to keep up, the lowmem event handler
will not be called.  This handler only triggers when you are really low on
memory and trying to allocate it faster than pagedaemon can reclaim free
pages.  In that sort of environment you generally want caches to return
pages sooner rather than later.

What would perhaps be better than a hardcoded reclaim age would be to use
an LRU-type approach and perhaps set a target percent to reclaim.  That is,
suppose you were to reclaim the oldest 10% of hashes on each lowmem call
(and make the '10%' the tunable value).  Then you will always make some amount
of progress in a low memory situation (and if the situation remains dire you
will eventually empty the entire cache), but the effective maximum age will
be more dynamic.  Right now if you haven't touched UFS in 5 seconds it
throws the entire thing out on the first lowmem event.  The LRU-approach would
only throw the oldest 10% out on the first call, but eventually throw it all out
if the situation remains dire.

-- 
John Baldwin