Skip site navigation (1)Skip section navigation (2)
Date:      Sat, 25 Apr 2015 12:41:52 +0300
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        Jason Harmening <jason.harmening@gmail.com>
Cc:        Svatopluk Kraus <onwahe@gmail.com>, FreeBSD Arch <freebsd-arch@freebsd.org>
Subject:   Re: bus_dmamap_sync() for bounced client buffers from user address space
Message-ID:  <20150425094152.GE2390@kib.kiev.ua>
In-Reply-To: <CAM=8qan-4SbKJaddrfkv=HG3n%2BHaOPDL5MEPS9DoaTvnhrJPZQ@mail.gmail.com>
References:  <CAFHCsPXMjge84AR2cR8KXMXWP4kH2YvuV_uqtPKUvn5C3ygknw@mail.gmail.com> <CAM=8qan-4SbKJaddrfkv=HG3n%2BHaOPDL5MEPS9DoaTvnhrJPZQ@mail.gmail.com>

next in thread | previous in thread | raw e-mail | index | archive | help
On Fri, Apr 24, 2015 at 05:50:15PM -0500, Jason Harmening wrote:
> A couple of comments:
> 
> --POSTWRITE and POSTREAD are only asynchronous if you call them from an
> asynchronous context.
> For a driver that's already performing DMA operations on usermode memory,
> it seems likely that there's going to be *some* place where you can call
> bus_dmamap_sync() and be guaranteed to be executing in the context of the
> process that owns the memory.  Then a regular bcopy will be safe and
> inexpensive, assuming the pages have been properly vslock-ed/vm_map_wire-d.
> That's usually whatever read/write/ioctl operation spawned the DMA transfer
> in the first place.  So, in those cases can you not just move the
> POSTREAD/POSTWRITE sync from the "DMA-finished" interrupt to the
> d_read/d_write/d_ioctl that waits on the "DMA-finished" interrupt?
> 
> --physcopyin/physcopyout aren't trivial.  They go through uiomove_fromphys,
> which often uses sfbufs to create temporary KVA mappings for the physical
> pages.  sf_buf_alloc() can sleep (unless SFB_NOWAIT is specified, which
> means it can fail and which uiomove_fromphys does not specify for good
> reason); that makes it unsafe for use in either a threaded interrupt or a
> filter.  Perhaps the physcopyout path could be changed to use pmap_qenter
> directly in this case, but that can still be expensive in terms of TLB
> shootdowns.
> 
> Checking against VM_MIN_KERNEL_ADDRESS seems sketchy; it eliminates the
> chance to use a much-less-expensive bcopy in cases where the sync is
> happening in correct process context.
> 
> Context-switching during bus_dmamap_sync() shouldn't be an issue.  In a
> filter interrupt, curproc will be completely arbitrary but none of this
> stuff should be called in a filter anyway.  Otherwise, if you're in a
> kernel thread (including an ithread), curproc should be whatever proc was
> supplied when the thread was created.  That's usually proc0, which only has
> kernel address space.  IOW, even if a context-switch happens sometime
> during bus_dmamap_sync, any pmap check or copy should have a consistent and
> non-arbitrary process context.
> 
> I think something like your second solution would be workable to make
> UIO_USERSPACE syncs work in non-interrupt kernel threads, but given all the
> restrictions and extra cost of physcopy, I'm not sure how useful that would
> be.
> 
> I do think busdma.9 could at least use a note that bus_dmamap_sync() is
> only safe to call in the context of the owning process for user buffers.

UIO_USERSPACE for busdma is absolutely unsafe and cannot be used without
making kernel panicing.  Even if you wire the userspace bufer, there is
nothing which could prevent other thread in the user process, or other
process sharing the same address space, to call munmap(2) on the range.

The only safe method to work with the userspace regions is to
vm_fault_quick_hold() them to get hold on the pages, and then either
pass pages array down, or remap them in the KVA with pmap_qenter().

> 
> 
> On Fri, Apr 24, 2015 at 8:13 AM, Svatopluk Kraus <onwahe@gmail.com> wrote:
> 
> > DMA can be done on client buffer from user address space. For example,
> > thru bus_dmamap_load_uio() when uio->uio_segflg is UIO_USERSPACE. Such
> > client buffer can bounce and then, it must be copied to and from
> > bounce buffer in bus_dmamap_sync().
> >
> > Current implementations (in all archs) do not take into account that
> > bus_dmamap_sync() is asynchronous for POSTWRITE and POSTREAD in
> > general. It can be asynchronous for PREWRITE and PREREAD too. For
> > example, in some driver implementations where DMA client buffers
> > operations are buffered. In those cases, simple bcopy() is not
> > correct.
> >
> > Demonstration of current implementation (x86) is the following:
> >
> > -----------------------------
> > struct bounce_page {
> >     vm_offset_t vaddr;      /* kva of bounce buffer */
> >     bus_addr_t  busaddr;    /* Physical address */
> >     vm_offset_t datavaddr;  /* kva of client data */
> >     bus_addr_t  dataaddr;   /* client physical address */
> >     bus_size_t  datacount;  /* client data count */
> >     STAILQ_ENTRY(bounce_page) links;
> > };
> >
> >
> > if ((op & BUS_DMASYNC_PREWRITE) != 0) {
> >     while (bpage != NULL) {
> >         if (bpage->datavaddr != 0) {
> >             bcopy((void *)bpage->datavaddr,
> >                 (void *)bpage->vaddr,
> >                 bpage->datacount);
> >         } else {
> >             physcopyout(bpage->dataaddr,
> >                 (void *)bpage->vaddr,
> >                 bpage->datacount);
> >         }
> >         bpage = STAILQ_NEXT(bpage, links);
> >     }
> >     dmat->bounce_zone->total_bounced++;
> > }
> > -----------------------------
> >
> > There are two things:
> >
> > (1) datavaddr is not always kva of client data, but sometimes it can
> > be uva of client data.
> > (2) bcopy() can be used only if datavaddr is kva or when map->pmap is
> > current pmap.
> >
> > Note that there is an implication for bus_dmamap_load_uio() with
> > uio->uio_segflg set to UIO_USERSPACE that used physical pages are
> > in-core and wired. See "man bus_dma".
> >
> > There is not public interface to check that map->pmap is current pmap.
> > So one solution is the following:
> >
> > if (bpage->datavaddr >= VM_MIN_KERNEL_ADDRESS) {
> >         bcopy();
> > } else {
> >         physcopy();
> > }
> >
> > If there will be public pmap_is_current() then another solution is the
> > following:
> >
> > if (bpage->datavaddr != 0) && pmap_is_current(map->pmap)) {
> >         bcopy();
> > } else {
> >         physcopy();
> > }
> >
> > The second solution implies that context switch must not happen during
> > bus_dmamap_sync() called from an interrupt routine. However, IMO, it's
> > granted.
> >
> > Note that map->pmap should be always kernel_pmap for datavaddr >=
> > VM_MIN_KERNEL_ADDRESS.
> >
> > Comments, different solutions, or objections?
> >
> > Svatopluk Kraus
> > _______________________________________________
> > freebsd-arch@freebsd.org mailing list
> > http://lists.freebsd.org/mailman/listinfo/freebsd-arch
> > To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org"
> >
> _______________________________________________
> freebsd-arch@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-arch
> To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org"



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20150425094152.GE2390>