From owner-svn-src-head@freebsd.org Wed Mar 28 23:05:49 2018 Return-Path: Delivered-To: svn-src-head@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 4B13AF523B9; Wed, 28 Mar 2018 23:05:49 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id EE18E68184; Wed, 28 Mar 2018 23:05:48 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id E25151174F; Wed, 28 Mar 2018 23:05:48 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w2SN5mPO061276; Wed, 28 Mar 2018 23:05:48 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w2SN5mKn061275; Wed, 28 Mar 2018 23:05:48 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201803282305.w2SN5mKn061275@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 28 Mar 2018 23:05:48 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: svn commit: r331711 - head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: head X-SVN-Commit-Author: mav X-SVN-Commit-Paths: head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Commit-Revision: 331711 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-head@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: SVN commit messages for the src tree for head/-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 28 Mar 2018 23:05:49 -0000 Author: mav Date: Wed Mar 28 23:05:48 2018 New Revision: 331711 URL: https://svnweb.freebsd.org/changeset/base/331711 Log: MFV 331710: 9188 increase size of dbuf cache to reduce indirect block decompression illumos/illumos-gate@268bbb2a2fa79c36d4695d13a595ba50a7754b76 With compressed ARC (6950) we use up to 25% of our CPU to decompress indirect blocks, under a workload of random cached reads. To reduce this decompression cost, we would like to increase the size of the dbuf cache so that more indirect blocks can be stored uncompressed. If we are caching entire large files of recordsize=8K, the indirect blocks use 1/64th as much memory as the data blocks (assuming they have the same compression ratio). We suggest making the dbuf cache be 1/32nd of all memory, so that in this scenario we should be able to keep all the indirect blocks decompressed in the dbuf cache. (We want it to be more than the 1/64th that the indirect blocks would use because we need to cache other stuff in the dbuf cache as well.) In real world workloads, this won't help as dramatically as the example above, but we think it's still worth it because the risk of decreasing performance is low. The potential negative performance impact is that we will be slightly reducing the size of the ARC (by ~3%). Reviewed by: Dan Kimmel Reviewed by: Prashanth Sreenivasa Reviewed by: Paul Dagnelie Reviewed by: Sanjay Nadkarni Reviewed by: Allan Jude Reviewed by: Igor Kozhukhov Approved by: Garrett D'Amore Author: George Wilson Modified: head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Directory Properties: head/sys/cddl/contrib/opensolaris/ (props changed) Modified: head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c ============================================================================== --- head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Wed Mar 28 22:57:02 2018 (r331710) +++ head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Wed Mar 28 23:05:48 2018 (r331711) @@ -85,10 +85,10 @@ static boolean_t dbuf_evict_thread_exit; */ static multilist_t *dbuf_cache; static refcount_t dbuf_cache_size; -uint64_t dbuf_cache_max_bytes = 100 * 1024 * 1024; +uint64_t dbuf_cache_max_bytes = 0; -/* Cap the size of the dbuf cache to log2 fraction of arc size. */ -int dbuf_cache_max_shift = 5; +/* Set the default size of the dbuf cache to log2 fraction of arc size. */ +int dbuf_cache_shift = 5; /* * The dbuf cache uses a three-stage eviction policy: @@ -138,8 +138,8 @@ uint_t dbuf_cache_lowater_pct = 10; SYSCTL_DECL(_vfs_zfs); SYSCTL_QUAD(_vfs_zfs, OID_AUTO, dbuf_cache_max_bytes, CTLFLAG_RWTUN, &dbuf_cache_max_bytes, 0, "dbuf cache size in bytes"); -SYSCTL_INT(_vfs_zfs, OID_AUTO, dbuf_cache_max_shift, CTLFLAG_RDTUN, - &dbuf_cache_max_shift, 0, "dbuf size as log2 fraction of ARC"); +SYSCTL_INT(_vfs_zfs, OID_AUTO, dbuf_cache_shift, CTLFLAG_RDTUN, + &dbuf_cache_shift, 0, "dbuf cache size as log2 fraction of ARC"); SYSCTL_UINT(_vfs_zfs, OID_AUTO, dbuf_cache_hiwater_pct, CTLFLAG_RWTUN, &dbuf_cache_hiwater_pct, 0, "max percents above the dbuf cache size"); SYSCTL_UINT(_vfs_zfs, OID_AUTO, dbuf_cache_lowater_pct, CTLFLAG_RWTUN, @@ -610,11 +610,15 @@ retry: mutex_init(&h->hash_mutexes[i], NULL, MUTEX_DEFAULT, NULL); /* - * Setup the parameters for the dbuf cache. We cap the size of the - * dbuf cache to 1/32nd (default) of the size of the ARC. + * Setup the parameters for the dbuf cache. We set the size of the + * dbuf cache to 1/32nd (default) of the size of the ARC. If the value + * has been set in /etc/system and it's not greater than the size of + * the ARC, then we honor that value. */ - dbuf_cache_max_bytes = MIN(dbuf_cache_max_bytes, - arc_max_bytes() >> dbuf_cache_max_shift); + if (dbuf_cache_max_bytes == 0 || + dbuf_cache_max_bytes >= arc_max_bytes()) { + dbuf_cache_max_bytes = arc_max_bytes() >> dbuf_cache_shift; + } /* * All entries are queued via taskq_dispatch_ent(), so min/maxalloc