From owner-freebsd-arch@freebsd.org Mon Dec 14 17:50:57 2015
Date: Mon, 14 Dec 2015 20:50:46 +0300
From: Gleb Smirnoff
To: Konstantin Belousov
Cc: jeff@FreeBSD.org, alc@FreeBSD.org, scottl@FreeBSD.org, pho@FreeBSD.org,
 arch@FreeBSD.org
Subject: Re: new vm_pager_get_pages() KPI, round 3
Message-ID: <20151214175046.GR78497@FreeBSD.org>
References: <20151205052940.GJ42565@FreeBSD.org>
 <20151214111335.GB82577@kib.kiev.ua>
In-Reply-To: <20151214111335.GB82577@kib.kiev.ua>

On Mon, Dec 14, 2015 at 01:13:35PM +0200, Konstantin Belousov wrote:
K> I fail to understand how the case of count > 1 and non-contiguous blocks
K> in the non-readahead case is handled by new vnode_pager_generic_getpages().
K>
K> I do not understand how a hole somewhere in the requested range is handled.
K> Code has a comment that a hole must not appear in the range.
K>
K> Both issues mean that vm_pager_has_page() still must be called before
K> pagein, for count > 1 use. E.g. the exec_map_first_page() uses *after
K> value returned from has_pages() to calculate count, which is an advisory
K> and not the contract. Same issue prevents converting GEM and TTM (and
K> probably md) to use the count > 1 KPI.

The *after and *before are now not advisory, but a contract. Consumers
that want to use count > 1 must precede the call to vm_pager_get_pages()
with a call to vm_pager_has_page(). Only a request for the region approved
by vm_pager_has_page(), or for a smaller one, will succeed.

K> Same is true for swap pager, and this prevents the removal of the loop
K> in vm_thread_swapin().
K>
K> Code assumes that the partially valid page may only appear in the last
K> position of the page run for the local pager, which again requires
K> pre-validation of the vm_pager_get_pages() on the caller side.

Yes, asking for a pagein into a valid page risks data corruption.

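To make the has_page-then-get_pages contract above concrete, here is a
minimal userspace model in C. It is only a sketch: the toy_* names, the
block-number array and the return conventions are made up for
illustration and are not the kernel KPI or its signatures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy object: backing block per page index, -1 marks a hole. */
struct toy_object {
	const int *blkno;
	int npages;
};

/*
 * Toy stand-in for a has_page query.  Returns true if page pindex is
 * backed; *before and *after (if non-NULL) receive the number of
 * neighbouring pages that are contiguous with it on the backing store.
 */
bool
toy_has_page(const struct toy_object *obj, int pindex, int *before, int *after)
{
	int a = 0, b = 0;

	if (pindex < 0 || pindex >= obj->npages || obj->blkno[pindex] < 0)
		return (false);
	while (pindex - b - 1 >= 0 && obj->blkno[pindex - b - 1] >= 0 &&
	    obj->blkno[pindex - b - 1] == obj->blkno[pindex] - b - 1)
		b++;
	while (pindex + a + 1 < obj->npages &&
	    obj->blkno[pindex + a + 1] >= 0 &&
	    obj->blkno[pindex + a + 1] == obj->blkno[pindex] + a + 1)
		a++;
	if (before != NULL)
		*before = b;
	if (after != NULL)
		*after = a;
	return (true);
}

/*
 * Toy stand-in for get_pages under the contract: the request succeeds
 * (0) only if [pindex, pindex + count) lies inside the approved region,
 * otherwise it fails (-1).
 */
int
toy_get_pages(const struct toy_object *obj, int pindex, int count)
{
	int after;

	if (count < 1 || !toy_has_page(obj, pindex, NULL, &after))
		return (-1);
	return (count - 1 <= after ? 0 : -1);
}

/*
 * Caller side: query first, clamp the run to what was approved, then
 * page in.  Returns the number of pages requested, or -1 on failure.
 */
int
toy_pagein_run(const struct toy_object *obj, int pindex, int want)
{
	int after, count;

	assert(want >= 1);
	if (!toy_has_page(obj, pindex, NULL, &after))
		return (-1);
	count = want < after + 1 ? want : after + 1;
	if (toy_get_pages(obj, pindex, count) != 0)
		return (-1);
	return (count);
}
```

In this model, with backing blocks {10, 11, 12, hole, 20, 21}, a caller
at page 0 that wants 5 pages ends up with a run of 3, while asking
toy_get_pages() directly for 4 pages at page 0 fails.
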
K> Overall this is not a KPI that was discussed. It seemingly does not
K> change semantic for count == 1 case, but is not what it should be for
K> count > 1. As discussed, new vm_pager_get_pages() was supposed to just
K> work for any count, doing the loop over the non-contig ranges or short
K> reads, and guaranteeing that all existing (or hole-filled) pages are
K> read until EOF is met. This KPI was supposed to:
K> - fix my complaints about short reads

I will not take your complaints about short reads. The get pages KPI is
not a complement to VOP_READ(), nor to the read(2) syscall. If an
underlying filesystem has problems, it must deal with them on its own,
doing multiple I/Os per VOP_GETPAGES().

K> - avoid excessive VOP_BMAP() call from has_pages before get_pages()

It is now avoided for count == 1.

K> - allow to remove the loops from all current get_pages() consumers,
K>   i.e. vm_thread_swapin(), GEM/TTM, image activator

This wasn't discussed at all. I like the idea; it can be done later.

-- 
Totus tuus, Glebius.