From owner-svn-src-head@freebsd.org Tue Nov 1 12:53:16 2016
Date: Tue, 1 Nov 2016 14:53:10 +0200
From: Konstantin Belousov
To: Gleb Smirnoff
Cc: src-committers@freebsd.org, svn-src-all@freebsd.org,
 svn-src-head@freebsd.org
Subject: Re: svn commit: r308026 - in head/sys: kern sys ufs/ffs
Message-ID: <20161101125310.GD54029@kib.kiev.ua>
References: <201610281143.u9SBhxrN008547@repo.freebsd.org>
 <20161101000246.GQ27748@FreeBSD.org>
In-Reply-To: <20161101000246.GQ27748@FreeBSD.org>

On Mon, Oct 31, 2016 at 05:02:46PM -0700, Gleb Smirnoff wrote:
>   Hi,
>
> On Fri, Oct 28, 2016 at 11:43:59AM +0000, Konstantin Belousov wrote:
> K> Author: kib
> K> Date: Fri Oct 28 11:43:59 2016
> K> New Revision: 308026
> K> URL: https://svnweb.freebsd.org/changeset/base/308026
> K>
> K> Log:
> K>   Generalize UFS buffer pager to allow it serving other filesystems
> K>   which also use buffer cache.
> K>
> K>   Most important addition to the code is the handling of filesystems
> K>   where the block size is less than the machine page size, which might
> K>   require reading several buffers to validate single page.
> K>
> K>   Tested by:	pho
> K>   Sponsored by:	The FreeBSD Foundation
> K>   MFC after:	2 weeks
> K>
> K> Modified:
> K>   head/sys/kern/vfs_bio.c
> K>   head/sys/sys/buf.h
> K>   head/sys/ufs/ffs/ffs_vnops.c
> K>
> K> Modified: head/sys/kern/vfs_bio.c
> K> ==============================================================================
> K> --- head/sys/kern/vfs_bio.c	Fri Oct 28 11:35:06 2016	(r308025)
> K> +++ head/sys/kern/vfs_bio.c	Fri Oct 28 11:43:59 2016	(r308026)
> K> @@ -75,9 +75,10 @@ __FBSDID("$FreeBSD$");
> K>  #include
> K>  #include
> K>  #include
> K> -#include
> K> -#include
> K>  #include
> K> +#include
> K> +#include
> K> +#include
> K>  #include
> K>  #include
> K>  #include
> K> @@ -4636,6 +4637,161 @@ bdata2bio(struct buf *bp, struct bio *bi
> K>  	}
> K>  }
> K>
> K> +static int buf_pager_relbuf;
> K> +SYSCTL_INT(_vfs, OID_AUTO, buf_pager_relbuf, CTLFLAG_RWTUN,
> K> +    &buf_pager_relbuf, 0,
> K> +    "Make buffer pager release buffers after reading");
> K> +
> K> +/*
> K> + * The buffer pager.  It uses buffer reads to validate pages.
> K> + *
> K> + * In contrast to the generic local pager from vm/vnode_pager.c, this
> K> + * pager correctly and easily handles volumes where the underlying
> K> + * device block size is greater than the machine page size.  The
> K> + * buffer cache transparently extends the requested page run to be
> K> + * aligned at the block boundary, and does the necessary bogus page
> K> + * replacements in the addends to avoid obliterating already valid
> K> + * pages.
> K> + *
> K> + * The only non-trivial issue is that the exclusive busy state for
> K> + * pages, which is assumed by the vm_pager_getpages() interface, is
> K> + * incompatible with the VMIO buffer cache's desire to share-busy the
> K> + * pages.  This function performs a trivial downgrade of the pages'
> K> + * state before reading buffers, and a less trivial upgrade from the
> K> + * shared-busy to excl-busy state after the read.
>
> IMHO, should be noted that the pager ignores requested rbehind and rahead
> values, and does the rbehind and rahead sizes that he prefers.

The pager interface treats the page-in of the ahead/behind pages as
insignificant, in particular because those pages can be recycled or
invalidated during the pager operation, when the pager drops the object
lock.

More importantly, this pager de facto uses the optimal
filesystem-dependent aligned I/O size due to its structure, compared
with the bmap pager.  For this reason, I consider additional attempts to
follow optional upper-level hints not very useful.  Measurements show no
difference in the real workload times, and marginal improvements for
microbenchmarks (5% scale).

I might do something more aggressive when the upper-level specified
rahead is (significantly) above the natural block size limit, like using
breadn() instead of bread().  Practice suggests that this would not help
or would even be a pessimisation due to higher buf cache thrashing.
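
To illustrate why the block-aligned read already covers the neighbouring
pages, here is a minimal sketch of mapping a requested page run to the
logical blocks that have to be read.  It is not the code from
vfs_bio.c: the names pages_to_blocks() and struct blk_range are
hypothetical, and it assumes byte offsets counted from the start of the
file and a block size that is a fixed value for the whole file.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: inclusive range of logical block numbers. */
struct blk_range {
	int64_t first_lbn;	/* first logical block to read */
	int64_t last_lbn;	/* last logical block to read, inclusive */
};

/*
 * Hypothetical helper: find the logical blocks that cover the pages
 * [first_page, first_page + npages).  When bsize > page_size the range
 * is widened to block boundaries, so pages before and after the
 * request are read as a side effect.  When bsize < page_size one page
 * maps to several blocks.
 */
struct blk_range
pages_to_blocks(int64_t first_page, int npages, int64_t page_size,
    int64_t bsize)
{
	struct blk_range r;
	int64_t start_off, end_off;

	assert(first_page >= 0 && npages > 0);
	assert(page_size > 0 && bsize > 0);

	start_off = first_page * page_size;		/* inclusive */
	end_off = (first_page + npages) * page_size;	/* exclusive */

	r.first_lbn = start_off / bsize;	/* round down to block start */
	r.last_lbn = (end_off - 1) / bsize;	/* block holding the last byte */
	return (r);
}
```

With 4096-byte pages and 32768-byte blocks, a request for page 3 alone
maps to block 0, which spans pages 0 through 7.  With 1024-byte blocks
the same kind of single-page request maps to four consecutive blocks.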