From: Benjamin Kaduk
Date: Sat, 9 Jan 2016 19:23:42 +0000 (UTC)
To: doc-committers@freebsd.org, svn-doc-all@freebsd.org, svn-doc-head@freebsd.org
Subject: svn commit: r47974 - head/en_US.ISO8859-1/htdocs/news/status

Author: bjk
Date: Sat Jan 9 19:23:42 2016
New Revision: 47974
URL: https://svnweb.freebsd.org/changeset/doc/47974

Log:
  Add report on vnode cache tuning from mckusick

Modified:
  head/en_US.ISO8859-1/htdocs/news/status/report-2015-10-2015-12.xml

Modified: head/en_US.ISO8859-1/htdocs/news/status/report-2015-10-2015-12.xml
==============================================================================
--- head/en_US.ISO8859-1/htdocs/news/status/report-2015-10-2015-12.xml Sat Jan 9 19:08:52 2016 (r47973)
+++ head/en_US.ISO8859-1/htdocs/news/status/report-2015-10-2015-12.xml Sat Jan 9 19:23:42 2016 (r47974)
@@ -558,4 +558,112 @@
 portions and committed.

+
+ Kernel Vnode Cache Tuning
+
+ Kirk McKusick mckusick@mckusick.com
+ Bruce Evans bde@FreeBSD.org
+ Konstantin Belousov kib@FreeBSD.org
+ Peter Holm pho@FreeBSD.org
+ Mateusz Guzik mjg@FreeBSD.org
+
+ MFC to stable/10
+

+ This completed project includes changes to better manage
+ the vnode freelist and to streamline the allocation and freeing of
+ vnodes.

+ +

+ Vnode cache recycling was reworked to meet free and unused
+ vnode targets. Free vnodes are rarely completely free; rather,
+ they are just ones that are cheap to recycle. Usually they are
+ for files which have been stat'd but not read; these usually have
+ inode and namecache data attached to them. The free vnode target
+ is the preferred minimum size of a sub-cache consisting mostly of
+ such files. The system balances the size of this sub-cache with
+ its complement to try to prevent either from thrashing while the
+ other is relatively inactive. The targets express a preference
+ for the best balance.

+ +

+ "Above" this target there are two further targets
+ (watermarks) related to the recycling of free vnodes. In the
+ best-operating case, the cache is exactly full, the free list has
+ a size between vlowat and vhiwat above the free target, and
+ recycling from the free list and normal use maintains this state.
+ Sometimes the free list is below vlowat or even empty, but this
+ state is even better for immediate use, provided the cache is not
+ full. Otherwise, vnlru_proc() runs to reclaim enough vnodes
+ (usually non-free ones) to reach one of these states. The
+ watermarks are currently hard-coded as 4% and 9% of the available
+ space. These, and the default of 25% for wantfreevnodes, are too
+ large if the memory size is large. For example, 9% of 75% of
+ MAXVNODES is more than 566000 vnodes to reclaim whenever
+ vnlru_proc() becomes active.
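
The arithmetic in the example above can be checked with a short sketch. It assumes MAXVNODES is 8388608 (512 MB / 64), a value the report does not state and which is used here only because it reproduces the quoted figure. The function name and the plain percentage rounding are also illustrative; the kernel's own integer computation may differ.

```python
# Sketch of the watermark arithmetic described in the report.
# ASSUMPTION: MAXVNODES = 8388608; the report does not give the value.
MAXVNODES = 512 * 1024 * 1024 // 64


def vnode_targets(maxvnodes, free_pct=25, lowat_pct=4, hiwat_pct=9):
    """Return (wantfreevnodes, vlowat, vhiwat) as whole vnode counts.

    The percentages are the ones quoted in the report; vnode_targets()
    itself is a hypothetical helper, not a kernel function.
    """
    wantfree = maxvnodes * free_pct // 100   # 25% free-vnode target
    gap = maxvnodes - wantfree               # the remaining 75%
    vlowat = gap * lowat_pct // 100          # low watermark, 4% of the gap
    vhiwat = gap * hiwat_pct // 100          # high watermark, 9% of the gap
    return wantfree, vlowat, vhiwat


wantfree, vlowat, vhiwat = vnode_targets(MAXVNODES)
print(wantfree, vlowat, vhiwat)
```

Under these assumptions the high watermark comes out just above 566000 vnodes, which matches the figure in the text and shows why fixed percentages scale poorly on large-memory machines.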

+ +

+ The vfs.vlru_alloc_cache_src sysctl is removed.
+ New code frees namecache sources as the last chance to satisfy the
+ highest watermark, instead of selecting source vnodes randomly.
+ This provides good enough behaviour to keep vn_fullpath() working
+ in most situations. Filesystem layouts with deep trees, where the
+ removed knob was required, are thus handled automatically.

+ +

+ As the kernel allocates and frees vnodes, it fully
+ initializes them on every allocation and fully releases them on
+ every free. These are not trivial costs: each allocation starts by
+ zeroing a large structure, then initializes a mutex, a lock manager
+ lock, an rw lock, four lists, and six pointers. The
+ vfs.vnodes_created counter shows that these operations are
+ done millions of times an hour on a busy machine.

+ +

+ As a performance optimization, this code update uses the
+ uma_init and uma_fini routines to do these initializations and
+ cleanups only as the vnodes enter and leave the vnode zone. With
+ this change, the initializations are done kern.maxvnodes
+ times at system startup, and then only rarely again. The frees
+ are done only if the vnode zone shrinks, which never happens in
+ practice. For those curious about the avoided work, look at the
+ vnode_init() and vnode_fini() functions in sys/kern/vfs_subr.c to
+ see the code that has been removed from the main vnode
+ allocation/free path.

+ +
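
The zone-level initialization described in the last paragraph can be illustrated with a toy model. This is a minimal sketch of the general idea (pay the setup cost when an object enters the zone, not on every allocation), not the kernel's UMA code; the Zone and Vnode classes, their fields, and vnode_fini_model() are all hypothetical names introduced here.

```python
# Toy model of a zone allocator with init/fini hooks.
# ASSUMPTION: every name below is illustrative, not a kernel API.

class Vnode:
    """Stand-in for a vnode; construction models the costly one-time setup."""

    def __init__(self):
        self.lock = object()                  # stands in for mutex and locks
        self.lists = [[] for _ in range(4)]   # stands in for the four lists
        self.data = None                      # per-use state, reset by the caller


class Zone:
    def __init__(self, init, fini):
        self.init = init        # runs when an item enters the zone
        self.fini = fini        # runs when an item leaves the zone
        self.cache = []         # initialized items waiting for reuse
        self.init_calls = 0
        self.fini_calls = 0

    def alloc(self):
        if self.cache:
            return self.cache.pop()    # reuse without re-initializing
        self.init_calls += 1
        return self.init()

    def free(self, item):
        self.cache.append(item)        # item stays initialized in the cache

    def shrink(self):
        while self.cache:              # only here do items leave the zone
            self.fini(self.cache.pop())
            self.fini_calls += 1


def vnode_fini_model(vp):
    """Tear down the one-time state when the zone gives the item back."""
    vp.lock = None
    vp.lists = None


zone = Zone(init=Vnode, fini=vnode_fini_model)
for i in range(1000):
    vp = zone.alloc()
    vp.data = i
    vp.data = None       # the caller hands the vnode back in a clean state
    zone.free(vp)

print(zone.init_calls, zone.fini_calls)   # prints: 1 0
```

In this model a thousand allocate/free cycles trigger a single initialization and no teardown, which mirrors the report's point that the costly setup moves out of the main allocation/free path. The trade-off is that callers must return objects in a reusable state.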