From owner-freebsd-hackers Fri Apr 13 04:09:06 2001
Message-Id: <200104131108.f3DB8vZ49897@rina.r.dl.itc.u-tokyo.ac.jp>
Date: Fri, 13 Apr 2001 20:08:57 +0900
From: Seigo Tanimura
To: bright@wintelcom.net
Cc: tanimura@r.dl.itc.u-tokyo.ac.jp, phk@critter.freebsd.dk,
    dillon@earth.backplane.com, riel@conectiva.com.br, bsddiy@21cn.com,
    Tor.Egge@fast.no, freebsd-hackers@FreeBSD.ORG
Subject: Re: vm balance
In-Reply-To: In your message of "Fri, 13 Apr 2001 02:58:07 -0700"
    <20010413025806.A976@fw.wintelcom.net>
References: <200104121757.f3CHvJd20639@earth.backplane.com>
    <59188.987108650@critter>
    <200104130939.f3D9d7Z37169@rina.r.dl.itc.u-tokyo.ac.jp>
    <20010413025806.A976@fw.wintelcom.net>
Organization: Digital Library Research Division, Information Technology
    Centre, The University of Tokyo

On Fri, 13 Apr 2001 02:58:07 -0700,
  Alfred Perlstein said:

Alfred> * Seigo Tanimura [010413 02:39] wrote:
>> On Thu, 12 Apr 2001 22:50:50 +0200,
>>   Poul-Henning Kamp said:
>>
>> Poul-Henning> We keep namecache entries around as long as we can use them, and that
>> Poul-Henning> generally means that recreating them is a rather expensive operation,
>> Poul-Henning> involving creation of vnode and very likely a vm object again.
>>
>> Holding a namecache entry forever until its vnode is reused results in
>> disaster when a huge number of files are accessed concurrently, causing
>> active vnodes to eat up all of memory. This beast killed a box of mine
>> with 3GB of memory and 200GB of a RAID0 disk array serving about
>> 300,000 files by cvsupd and making the world a few months ago, when
>> the number of the vnodes reached around 400,000 to make all of the
>> processes wait for a free vnode.
>>
>> With a help by tegge, the box is now reclaiming directory vnodes when
>> few free vnodes are available. Only directory vnodes holding no child
>> directory vnodes held in v_cache_src are recycled, so that directory
>> vnodes near the root of the filesystem hierarchy remain in namecache
>> and directory vnodes are not reclaimed in cascade. The number of
>> vnodes in the box is now about 135,000, staying quite steadily.
>>
>> Name'cache' is the place to hold vnodes for future use which may *not*
>> come, hence vnodes held in namecache should be reclaimed in case of
>> critical vnode shortage.

Alfred> Are these changes planned for integration?

Yes, but not very soon, as a few things still need to be done. One is
that a directory vnode may be held as the working directory of a
process, in which case we should not reclaim it. Another is to
determine how often namecache should be traversed and how many
directory vnodes should be reclaimed in each pass.

At this moment, namecache is traversed once for every 1,000 calls of
getnewvnode(). If both of the following inequalities hold, an attempt
is made to reclaim up to 3,000 directory vnodes:

    freevnodes < wantfreevnodes + 2 * 1000          (1)
    wantfreevnodes + 2 * 1000 < numvnodes * 2       (2)

(1) means that we reclaim directory vnodes if the number of free
vnodes is smaller than about 2,000. (2) is there so that vnode
reclaiming does not occur in the early stage of boot, until the number
of vnodes reaches around 2,000.
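
For illustration, the trigger described above can be sketched in
userland C roughly as follows. This is not code from the patch: the
function names dirvnode_reclaim_wanted() and reclaim_tick() and the
RECLAIM_* constants are hypothetical, and only the two inequalities and
the figures 1,000, 2 * 1000 and 3,000 come from the description above.

```c
#include <stdbool.h>

/* Hypothetical names; the values are the figures quoted in the text. */
#define RECLAIM_INTERVAL 1000        /* getnewvnode() calls per traversal */
#define RECLAIM_SLACK    (2 * 1000)  /* the "2 * 1000" term in (1) and (2) */
#define RECLAIM_MAX      3000        /* upper bound on vnodes per pass */

/*
 * Returns true when both inequalities hold:
 *   (1) freevnodes < wantfreevnodes + 2 * 1000
 *   (2) wantfreevnodes + 2 * 1000 < numvnodes * 2
 */
static bool
dirvnode_reclaim_wanted(unsigned long freevnodes,
    unsigned long wantfreevnodes, unsigned long numvnodes)
{
	unsigned long threshold = wantfreevnodes + RECLAIM_SLACK;

	return (freevnodes < threshold && threshold < numvnodes * 2);
}

/*
 * Counts calls and returns true on every RECLAIM_INTERVAL-th one,
 * resetting the counter, so the check above runs once per interval.
 */
static bool
reclaim_tick(unsigned long *calls)
{
	if (++*calls < RECLAIM_INTERVAL)
		return (false);
	*calls = 0;
	return (true);
}
```

A caller would invoke reclaim_tick() on each allocation and, when it
fires and dirvnode_reclaim_wanted() is true, try to reclaim at most
RECLAIM_MAX directory vnodes. With, say, 100 free vnodes out of 135,000
and wantfreevnodes at 25 (an example value, not one stated above), both
inequalities hold and a pass would be attempted.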

Although I chose those parameters so that vnode reclaiming does not
degrade the hit ratio of name lookup, they may not be optimal. Those
parameters should be tunable via sysctl(2).

Anyway, the patch can be found at:

http://people.FreeBSD.org/~tanimura/patches/vnrecycle.diff

-- 
Seigo Tanimura

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message