From: John Baldwin
To: freebsd-fs@freebsd.org
Cc: freebsd-hackers, Ivan Voras
Subject: Re: Call fo comments - raising vfs.ufs.dirhash_reclaimage?
Date: Tue, 3 Sep 2013 15:07:32 -0400
Message-Id: <201309031507.33098.jhb@freebsd.org>
References: <20130828181228.0d3618dd@ernst.home>

On Wednesday, August 28, 2013 12:40:15 pm Ivan Voras wrote:
> On 28 August 2013 18:12, Gary Jennejohn wrote:
>
> > So, if I understand this correctly, a normal desktop user won't
> > notice any real change, except that buildworld might get faster,
> > and big servers will benefit?
>
> Basically, yes, but read on...
>
> > But could this negatively impact small, embedded systems, which
> > usually have only small memory footprints? Although I suppose
> > one could argue that they usually don't have large numbers of
> > files cached in memory at any given time.
>
> Unless I'm wrong, the only pathological case coming from this change
> would be the following sequence of events:
>
> 1) Memory is scarce [*]
> 2) There's a sudden surge of requests for a huge number of different directories
> 3) There's an urgent lowmem event which is observed by dirhash, which
> attempts to free memory but is prevented in doing so for the next 60
> seconds because all entries are young (the idea behind dirhash being
> that if a directory is accessed, it will probably soon be accessed
> again - think "ls" then "fopen", so we won't evict him until
> reclaimage seconds)
> 4) the kernel runs out of memory, game over.

Just to play devil's advocate, the only way your change can benefit is if:

1) Memory is scarce thus triggering a lowmem event
2) There are requests for a huge number of directories that haven't been
   accessed in over 5 seconds.

That is to say, what your change does is increase the relative importance
of dirhash memory relative to other memory in the machine when the machine
is under memory pressure. If the machine is not under memory pressure then
the lowmem handler will not be triggered and your change will never matter.

Keep in mind that if pagedaemon is able to keep up, the lowmem event
handler will not be called. This handler only triggers when you are really
low on memory and trying to allocate it faster than pagedaemon can reclaim
free pages. In that sort of environment you generally want caches to
return pages sooner rather than later.

What would perhaps be better than a hardcoded reclaim age would be to use
an LRU-type approach and perhaps set a target percent to reclaim.
That is, suppose you were to reclaim the oldest 10% of hashes on each
lowmem call (and make the '10%' the tunable value). Then you will always
make some amount of progress in a low memory situation (and if the
situation remains dire you will eventually empty the entire cache), but
the effective maximum age will be more dynamic. Right now if you haven't
touched UFS in 5 seconds it throws the entire thing out on the first
lowmem event. The LRU-approach would only throw the oldest 10% out on the
first call, but eventually throw it all out if the situation remains dire.

-- 
John Baldwin
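
To make the comparison between the two policies concrete, here is a toy model in Python. It is only a sketch under stated assumptions: the dict-based cache, the function names `reclaim_by_age` and `reclaim_oldest_fraction`, and the sample timestamps are all hypothetical and are not taken from the UFS dirhash code. The first function models a fixed reclaim age; the second models the suggested "oldest N percent per lowmem call" approach.

```python
import math


def reclaim_by_age(cache, now, reclaim_age):
    """Evict every entry that has been idle for more than reclaim_age seconds.

    cache maps a (hypothetical) directory name to its last-used time in seconds.
    Returns the list of evicted names.
    """
    victims = [name for name, last_used in cache.items()
               if now - last_used > reclaim_age]
    for name in victims:
        del cache[name]
    return victims


def reclaim_oldest_fraction(cache, percent):
    """Evict the least recently used `percent` of entries.

    At least one entry is evicted whenever the cache is non-empty, so
    repeated calls always make progress and eventually empty the cache.
    Returns the list of evicted names, oldest first.
    """
    if not cache:
        return []
    count = max(1, math.ceil(len(cache) * percent / 100))
    victims = sorted(cache, key=cache.get)[:count]
    for name in victims:
        del cache[name]
    return victims


if __name__ == "__main__":
    now = 100
    # 20 made-up directories, each idle for somewhat more than 5 seconds.
    cache = {f"dir{i}": now - 6 - i for i in range(20)}

    aged = dict(cache)
    evicted = reclaim_by_age(aged, now, reclaim_age=5)
    print(f"age policy: {len(evicted)} evicted on the first lowmem call")

    lru = dict(cache)
    first = reclaim_oldest_fraction(lru, percent=10)
    print(f"LRU policy: {len(first)} evicted on the first lowmem call")
```

In this model the age policy drops every entry at once as soon as all of them are older than the threshold, while the percentage policy gives back a bounded slice per call. Rounding up to at least one entry is a design choice of the sketch, so that a small cache still shrinks on every call.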