From owner-freebsd-arch@freebsd.org Mon Aug 10 14:30:35 2015 Return-Path: Delivered-To: freebsd-arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 49C6099EB84 for ; Mon, 10 Aug 2015 14:30:35 +0000 (UTC) (envelope-from glebius@FreeBSD.org) Received: from mailman.ysv.freebsd.org (mailman.ysv.freebsd.org [IPv6:2001:1900:2254:206a::50:5]) by mx1.freebsd.org (Postfix) with ESMTP id 2DED08C7 for ; Mon, 10 Aug 2015 14:30:35 +0000 (UTC) (envelope-from glebius@FreeBSD.org) Received: by mailman.ysv.freebsd.org (Postfix) id 2CF8199EB83; Mon, 10 Aug 2015 14:30:35 +0000 (UTC) Delivered-To: arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 13B2299EB82 for ; Mon, 10 Aug 2015 14:30:35 +0000 (UTC) (envelope-from glebius@FreeBSD.org) Received: from cell.glebius.int.ru (glebius.int.ru [81.19.69.10]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "cell.glebius.int.ru", Issuer "cell.glebius.int.ru" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id A2A978C4; Mon, 10 Aug 2015 14:30:33 +0000 (UTC) (envelope-from glebius@FreeBSD.org) Received: from cell.glebius.int.ru (localhost [127.0.0.1]) by cell.glebius.int.ru (8.15.2/8.15.2) with ESMTPS id t7AEUVhW077586 (version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO); Mon, 10 Aug 2015 17:30:31 +0300 (MSK) (envelope-from glebius@FreeBSD.org) Received: (from glebius@localhost) by cell.glebius.int.ru (8.15.2/8.15.2/Submit) id t7AEUVtU077585; Mon, 10 Aug 2015 17:30:31 +0300 (MSK) (envelope-from glebius@FreeBSD.org) X-Authentication-Warning: cell.glebius.int.ru: glebius set sender to glebius@FreeBSD.org using -f Date: Mon, 10 Aug 2015 17:30:31 +0300 From: Gleb Smirnoff To: Konstantin Belousov Cc: arch@FreeBSD.org, alc@freebsd.org Subject: Re: more strict KPI for vm_pager_get_pages() Message-ID: <20150810143031.GF75813@glebius.int.ru> References: <20150430142408.GS546@nginx.com> <20150807133844.GS889@FreeBSD.org> <20150808084121.GX2072@kib.kiev.ua> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20150808084121.GX2072@kib.kiev.ua> User-Agent: Mutt/1.5.23 (2014-03-12) X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 10 Aug 2015 14:30:35 -0000 Konstantin, On Sat, Aug 08, 2015 at 11:41:21AM +0300, Konstantin Belousov wrote: K> > 1) There could be a user of KPI who would be fine with partial success. K> > K> > My answer: right now there is none, and if one emerges, the code can K> > be easily adopted to return VM_PAGER_ERROR, but still mark validated K> > pages as valid. The user of KPI then can scan the array and take valid K> > pages. So, the patch doesn't put any obstacles on appearance of such K> > user. K> The vm_fault.c is already fine with the partial success, it only cares K> that the requested page was validated and no error from pager is K> returned. vm_fault.c can be easily adopted to partial success, but I don't see a good reason to do this, since right now no pager does return partial success. K> > 2) Filesystems can do short reads by design, and thus fail to validate K> > the entire array. K> > K> > My answer: yes, that's true. By design NFS, SMBFS and FUSE should be K> > able to return short reads. However, the VOP_GETPAGES methods for all K> > three FSes right now do not have any code that would support that. So, K> > it looks like there is an open issue with these filesystems, not related K> > to my patch. When this issue is addressed in any of aforementioned FSes, K> > the VOP_GETPAGES should be fixed to do several I/Os in case of short K> > reads. K> K> And this is a bug in the networking filesystems (most likely). Rick was K> asked about NFS, but he did not responded. You are proposing to make the K> bug a part of the interface. Yes, this is likely a bug in current filesystems. Note that the bug is already here, and it is not a part of my patch. Also note that the bug has no evidence, we only modeled it in a thought experiment. And I don't propose to make a bug part of interface. I propose that if FS maintainers find time to analyze this issue, then they should fix it, hiding internal peculiarities of their FS into its implementation, instead of escalating it up to the pager interface. K> I object against this change. It is wrong philosophically, and it K> encodes the incomplete or accidental behaviour of several filesystems at K> the interface level. I object to your objection. The current KPI is wrong philosophically. It is wrong that caller passes an array of locked entities (busied pages) and gets partially unlocked array on return. This doesn't allow to write robust code safe from race conditions without introducting workarounds. K> In fact, you are taking some rather secondary feature of the current K> interface, that the non-requested pages are made busy for the duration K> of the paging request, to the level of the fundamental property (would K> the non-mreq pages be not busied, proposed change immediately causes K> applications segfault on parallel file truncation, or makes user data K> corrupted, for example). I really don't get this paragraph. "would the non-mreq pages be not busied" -- but this isn't the case. They are busied and this is intentional. K> All this rototilling is because you do not want to code the proper FSA K> in your reworked sendfile patch. You see, a proper FSA can be built around even more brain damaged interface. Alternatively, I can just say that if the array KPI is so fragile, I will do pages one by one in a busy loop. So, yes, the "problem" is solvable without touching KPI, but the solution isn't right one. -- Totus tuus, Glebius.