Date:      Wed, 15 Jul 2015 15:26:12 -0500
From:      Jason Harmening <jason.harmening@gmail.com>
To:        Konstantin Belousov <kostikbel@gmail.com>
Cc:        John Baldwin <jhb@freebsd.org>, FreeBSD Arch <freebsd-arch@freebsd.org>
Subject:   Re: RFC: New KPI for fast temporary single-page KVA mappings
Message-ID:  <CAM=8qam6kOA=7rdEhhyttwr-yKP27k9y9Zw53PY93GVBfrC8kw@mail.gmail.com>
In-Reply-To: <20150715141522.GE2404@kib.kiev.ua>
References:  <CAM=8qanB11WEWHZZfxyOT7VeL+OLqZ47bg=1TKp5c-W=VHNZnw@mail.gmail.com> <6021359.Zicubn765k@ralph.baldwin.cx> <20150715141522.GE2404@kib.kiev.ua>

Yeah, I see both this and cursors as being useful for different purposes.
Code that just needs to do a simple, quick operation on a page and doesn't
want to worry about setup/teardown and synchronization (or needs to work
under low-KVA conditions) could use pmap_quick_enter_page().  More complex
code, especially code that needs a lot of pages in-flight at the same time,
could use cursors.
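
To make that concrete, here's a rough sketch of the pattern I have in mind
for the simple case.  The helper and its bounce-buffer arguments are made up
for illustration; only pmap_quick_enter_page()/pmap_quick_remove_page() are
the proposed KPI:

#include <sys/param.h>
#include <sys/systm.h>
#include <vm/vm.h>
#include <vm/pmap.h>
#include <vm/vm_page.h>

/* Copy a bounce buffer into a page that has no resident mapping. */
static void
copy_to_unmapped_page(vm_page_t m, const void *bounce, size_t len)
{
        vm_offset_t kva;

        kva = pmap_quick_enter_page(m);   /* no setup, won't sleep or fail */
        bcopy(bounce, (void *)kva, len);
        pmap_quick_remove_page(kva);      /* drop the mapping immediately */
}

There's nothing to allocate or tear down around that, so it's usable from an
ithread or a busdma callback, as long as nothing sleeps while the mapping is
held.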

As kib mentioned, kva_alloc() + pmap_qenter() seems like a ready-made MI
cursor implementation.  If you want to optimize for direct maps, you could
make MD cursor implementations that bypass those steps; that's very roughly
what physcopy* does.  But I'm not sure that would be worth the trouble.  For
the most part, the arches with comprehensive direct maps are also 64-bit
arches, where KVA pageframes are the most plentiful.
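
For reference, the sort of MI cursor skeleton I'm picturing is something like
the following.  The struct and function names are invented here just to show
the shape of it:

#include <sys/param.h>
#include <sys/systm.h>
#include <vm/vm.h>
#include <vm/pmap.h>
#include <vm/vm_extern.h>

struct mem_cursor {
        vm_offset_t     mc_kva;         /* one-page KVA window, set up once */
};

static bool
mem_cursor_init(struct mem_cursor *mc)
{
        /* kva_alloc() can fail under KVA pressure; caller must check. */
        mc->mc_kva = kva_alloc(PAGE_SIZE);
        return (mc->mc_kva != 0);
}

static void *
mem_cursor_map(struct mem_cursor *mc, vm_page_t m)
{
        pmap_qenter(mc->mc_kva, &m, 1);  /* point the window at the page */
        return ((void *)mc->mc_kva);
}

static void
mem_cursor_unmap(struct mem_cursor *mc)
{
        pmap_qremove(mc->mc_kva, 1);
}

static void
mem_cursor_fini(struct mem_cursor *mc)
{
        kva_free(mc->mc_kva, PAGE_SIZE);
}

A direct-map MD variant could presumably skip the KVA window and hand back
PHYS_TO_DMAP(VM_PAGE_TO_PHYS(m)) on amd64, which is roughly what physcopy*
ends up doing there.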

Would it make sense to reimplement sf_bufs as a pool of cursors?

On Wed, Jul 15, 2015 at 9:15 AM, Konstantin Belousov <kostikbel@gmail.com>
wrote:

> On Tue, Jul 14, 2015 at 11:30:23AM -0700, John Baldwin wrote:
> > On Tuesday, July 07, 2015 11:37:55 AM Jason Harmening wrote:
> > > Hi everyone,
> > >
> > > I'd like to propose a couple of new pmap functions:
> > > vm_offset_t pmap_quick_enter_page(vm_page_t m)
> > > void pmap_quick_remove_page(vm_offset_t kva)
> > >
> > > These functions will create and destroy a temporary, usually CPU-local
> > > mapping of the specified page.  Where available, they will use the direct
> > > map.  Otherwise, they will use a per-CPU pageframe that's allocated at boot.
> > >
> > > Guarantees:
> > > --Will not sleep
> > > --Will not fail
> > > --Safe to call under a non-spin lock or from an ithread
> > >
> > > Restrictions:
> > > --Not safe to call from interrupt filter or under a spin mutex on all arches
> > > --Mappings should be held for as little time as possible; don't do any
> > > locking or sleeping while holding a mapping
> > > --Current implementation only guarantees a single page of mapping space
> > > across all arches.  MI code should not make nested calls to
> > > pmap_quick_enter_page().
> > >
> > > My idea is that the first consumer of this would be busdma.  All non-iommu
> > > implementations would use this for bounce buffer copies of pages that don't
> > > have resident mappings.  Currently busdma uses physcopy[in|out] for
> > > unmapped buffers, which on most arches uses sf_bufs that can sleep, making
> > > bus_dmamap_sync() unsafe to call in a lot of cases.  busdma would also use
> > > this for virtually-indexed cache maintenance on arm and mips.  It currently
> > > ignores cache maintenance for buffers that don't have a KVA or resident UVA
> > > mapping, which may not be correct for buffers that don't belong to curproc
> > > or have cache-resident VAs on other cores.
> > >
> > > I've created 2 Differential reviews:
> > > https://reviews.freebsd.org/D3013: the implementation
> > > https://reviews.freebsd.org/D3014: the kmod I've been using to test it
> > >
> > > I'd like any and all feedback, both on the general approach and the
> > > implementation details.  Some things to note on the implementation:
> > > --I've intentionally avoided touching existing pmap code for the time
> > > being.  Some of the new code could likely be shared with other pmap KPIs in
> > > a lot of cases.
> > > --I've structured the KPI to make it easy to extend to guarantee more than
> > > one per-CPU page in the future.  I could see that being useful for copying
> > > between pages, for example
> > > --There's no immediate consumer for the sparc64 implementation, since
> > > busdma there needs neither bounce buffers nor cache maintenance.
> > > --I would very much like feedback and testing from experts on non-x86
> > > arches.  I only have hardware to test the i386 and amd64 implementations;
> > > I've only cross-compiled it for everything else.  Some of the non-x86
> > > details, like the Book E powerpc TLB invalidation code, are a bit scary and
> > > probably not quite right.
> >
> > I do think something like this would be useful.  What I had wanted to do was
> > to add a 'memory cursor' to go along with memory descriptors.  The idea would
> > be that you can use a cursor to iterate over any descriptor, and that one of
> > the options when creating a virtual address cursor was to ask it to preallocate
> > any resources it needs at creation time (e.g. a page of KVA on platforms without
> > a direct map).  Then if a driver or GEOM module needs to walk over arbitrary
> > I/O buffers that come down via virtual addresses, it could allocate one or more
> > cursors.
> >
> > I have a partial implementation of cursors in a p4 branch, but it of course is
> > missing the hard part of VA mappings without a direct map.  However, this would
> > let you have N of these things and also control the lifecycle of the temporary
> > KVA addresses instead of having a fixed set.
> >
>
> I do not quite agree that the proposed KPI and your description of cursors
> have much in common.
>
> From what I read above, the implementation of the temporary VA mappings
> for cursors should be easy. You need to allocate VA at the time of
> cursor initialization, and then do pmap_qenter() when needed. In fact,
> the only non-trivial part would be the direct map case, where the
> unneeded VA allocation and qenter should be optimized out.
>
> The proposed KPI has rather different goals: it does not need any
> pre-setup for use, yet it can still be used, and is guaranteed not to
> fail, from hard contexts like swi or interrupt threads (in particular,
> it works in the busdma callback context).
>
> My opinion is that the KPI and cursors serve different goals. It might
> be that the KPI could be used as a building block for some of the
> cursors' functionality.
>


