From: Jason Harmening <jason.harmening@gmail.com>
To: Konstantin Belousov
Cc: John Baldwin, FreeBSD Arch
Date: Wed, 15 Jul 2015 15:26:12 -0500
Subject: Re: RFC: New KPI for fast temporary single-page KVA mappings
In-Reply-To: <20150715141522.GE2404@kib.kiev.ua>
References: <6021359.Zicubn765k@ralph.baldwin.cx> <20150715141522.GE2404@kib.kiev.ua>
Yeah, I see both this and cursors as being useful for different purposes.
Code that just needs to do a simple, quick operation on a page and doesn't
want to worry about setup/teardown and synchronization (or needs to work
under low-KVA conditions) could use pmap_quick_enter_page().  More complex
code, especially code that needs a lot of pages in flight at the same time,
could use cursors.

As kib mentioned, kva_alloc() + pmap_qenter() seems like a ready-made MI
cursor implementation.  If you want to optimize for direct maps, you could
make MD cursor implementations that bypass those steps; that's very roughly
what physcopy* does.  But I'm not sure that would be worth the trouble.
For the most part, the arches that have comprehensive direct maps are also
64-bit arches, where KVA pageframes are the most plentiful.

Would it make sense to reimplement sf_bufs as a pool of cursors?

On Wed, Jul 15, 2015 at 9:15 AM, Konstantin Belousov wrote:
> On Tue, Jul 14, 2015 at 11:30:23AM -0700, John Baldwin wrote:
> > On Tuesday, July 07, 2015 11:37:55 AM Jason Harmening wrote:
> > > Hi everyone,
> > >
> > > I'd like to propose a couple of new pmap functions:
> > > vm_offset_t pmap_quick_enter_page(vm_page_t m)
> > > void pmap_quick_remove_page(vm_offset_t kva)
> > >
> > > These functions will create and destroy a temporary, usually
> > > CPU-local mapping of the specified page.  Where available, they will
> > > use the direct map.  Otherwise, they will use a per-CPU pageframe
> > > that's allocated at boot.
> > >
> > > Guarantees:
> > > --Will not sleep
> > > --Will not fail
> > > --Safe to call under a non-spin lock or from an ithread
> > >
> > > Restrictions:
> > > --Not safe on all arches to call from an interrupt filter or under a
> > > spin mutex
> > > --Mappings should be held for as little time as possible; don't do
> > > any locking or sleeping while holding a mapping
> > > --The current implementation only guarantees a single page of
> > > mapping space across all arches.  MI code should not make nested
> > > calls to pmap_quick_enter_page().
> > >
> > > My idea is that the first consumer of this would be busdma.  All
> > > non-iommu implementations would use this for bounce buffer copies of
> > > pages that don't have resident mappings.  Currently busdma uses
> > > physcopy[in|out] for unmapped buffers, which on most arches uses
> > > sf_bufs that can sleep, making bus_dmamap_sync() unsafe to call in a
> > > lot of cases.  busdma would also use this for virtually-indexed
> > > cache maintenance on arm and mips.  It currently ignores cache
> > > maintenance for buffers that don't have a KVA or resident UVA
> > > mapping, which may not be correct for buffers that don't belong to
> > > curproc or have cache-resident VAs on other cores.
> > >
> > > I've created 2 Differential reviews:
> > > https://reviews.freebsd.org/D3013: the implementation
> > > https://reviews.freebsd.org/D3014: the kmod I've been using to test
> > > it
> > >
> > > I'd like any and all feedback, both on the general approach and the
> > > implementation details.  Some things to note on the implementation:
> > > --I've intentionally avoided touching existing pmap code for the
> > > time being.  Some of the new code could likely be shared with other
> > > pmap KPIs in a lot of cases.
> > > --I've structured the KPI to make it easy to extend to guarantee
> > > more than one per-CPU page in the future.
> > > I could see that being useful for copying between pages, for
> > > example.
> > > --There's no immediate consumer for the sparc64 implementation,
> > > since busdma there needs neither bounce buffers nor cache
> > > maintenance.
> > > --I would very much like feedback and testing from experts on
> > > non-x86 arches.  I only have hardware to test the i386 and amd64
> > > implementations; I've only cross-compiled it for everything else.
> > > Some of the non-x86 details, like the Book E powerpc TLB
> > > invalidation code, are a bit scary and probably not quite right.
> >
> > I do think something like this would be useful.  What I had wanted to
> > do was to add a 'memory cursor' to go along with memory descriptors.
> > The idea would be that you could use a cursor to iterate over any
> > descriptor, and that one of the options when creating a virtual
> > address cursor would be to ask it to preallocate any resources it
> > needs at creation time (e.g. a page of KVA on platforms without a
> > direct map).  Then if a driver or GEOM module needs to walk over
> > arbitrary I/O buffers that come down via virtual addresses, it could
> > allocate one or more cursors.
> >
> > I have a partial implementation of cursors in a p4 branch, but it is
> > of course missing the hard part of VA mappings without a direct map.
> > However, this would let you have N of these things and also control
> > the lifecycle of the temporary KVA addresses instead of having a
> > fixed set.
>
> I do not quite agree that the proposed KPI and your description of
> cursors have much in common.
>
> From what I read above, the implementation of the temporary VA mappings
> for cursors should be easy.  You need to allocate VA at the time of
> cursor initialization, and then do pmap_qenter() when needed.  In fact,
> it would not be trivial, for the direct map case, to optimize out the
> unneeded VA allocation and qenter.
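[Editorial sketch: the cursor scheme kib describes above can be put in
rough kernel pseudocode against the MI KPIs he names (kva_alloc(),
pmap_qenter(), and their release counterparts).  The struct and function
names here are purely illustrative, not from the p4 branch or the reviews:]

```c
/*
 * Hypothetical MI cursor: a single page of KVA is preallocated when the
 * cursor is created, and pmap_qenter() maps each page on demand.
 */
struct mem_cursor {
	vm_offset_t	mc_kva;		/* preallocated one-page KVA window */
};

static int
cursor_init(struct mem_cursor *mc)
{
	/* May sleep; done once, up front, in a safe context. */
	mc->mc_kva = kva_alloc(PAGE_SIZE);
	return (mc->mc_kva != 0 ? 0 : ENOMEM);
}

static void *
cursor_map(struct mem_cursor *mc, vm_page_t m)
{
	/* Cheap per-page step; safe to repeat as the cursor advances. */
	pmap_qenter(mc->mc_kva, &m, 1);
	return ((void *)mc->mc_kva);
}

static void
cursor_unmap(struct mem_cursor *mc)
{
	pmap_qremove(mc->mc_kva, 1);
}

static void
cursor_fini(struct mem_cursor *mc)
{
	kva_free(mc->mc_kva, PAGE_SIZE);
}
```

[A direct-map-aware MD variant could have cursor_map() return
PHYS_TO_DMAP()-style addresses and skip the qenter, which is the
optimization kib notes is not trivial to do generically.]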
>
> The proposed KPI has rather different goals: it needs no pre-setup for
> use, yet it can still be used, and is guaranteed not to fail, from hard
> contexts like swi or interrupt threads (in particular, it works in the
> busdma callback context).
>
> My opinion is that the KPI and cursors serve different goals.  It might
> be that the KPI could be used as a building block for some of the
> cursor functionality.
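[Editorial sketch: the "no pre-setup, will not sleep, will not fail"
usage pattern of the proposed KPI, as it might look in a busdma bounce
copy.  bounce_copy_out() is a hypothetical helper for illustration, not
code from D3013/D3014:]

```c
/*
 * Hypothetical sketch: copy 'len' bytes out of an unmapped page using
 * the proposed KPI.  No setup/teardown state is needed, the mapping is
 * held only for the duration of the copy, and nothing here sleeps, so
 * this is callable from a busdma callback or ithread context.
 */
static void
bounce_copy_out(vm_page_t m, vm_offset_t off, void *dst, size_t len)
{
	vm_offset_t kva;

	KASSERT(off + len <= PAGE_SIZE, ("copy crosses page boundary"));
	kva = pmap_quick_enter_page(m);		/* will not sleep or fail */
	bcopy((char *)(kva + off), dst, len);
	pmap_quick_remove_page(kva);		/* release promptly */
}
```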