Skip site navigation (1)Skip section navigation (2)
Date:      Sun, 26 Apr 2015 22:00:32 +0200
From:      Svatopluk Kraus <onwahe@gmail.com>
To:        Konstantin Belousov <kostikbel@gmail.com>
Cc:        Jason Harmening <jason.harmening@gmail.com>, FreeBSD Arch <freebsd-arch@freebsd.org>
Subject:   Re: bus_dmamap_sync() for bounced client buffers from user address space
Message-ID:  <CAFHCsPXzU_Cu51bj8vc_5e7to4GqdFDUvHvwWM1mAxJ5LvemXw@mail.gmail.com>
In-Reply-To: <20150425094152.GE2390@kib.kiev.ua>
References:  <CAFHCsPXMjge84AR2cR8KXMXWP4kH2YvuV_uqtPKUvn5C3ygknw@mail.gmail.com> <CAM=8qan-4SbKJaddrfkv=HG3n%2BHaOPDL5MEPS9DoaTvnhrJPZQ@mail.gmail.com> <20150425094152.GE2390@kib.kiev.ua>

next in thread | previous in thread | raw e-mail | index | archive | help
On Sat, Apr 25, 2015 at 11:41 AM, Konstantin Belousov
<kostikbel@gmail.com> wrote:
> On Fri, Apr 24, 2015 at 05:50:15PM -0500, Jason Harmening wrote:
>> A couple of comments:
>>
>> --POSTWRITE and POSTREAD are only asynchronous if you call them from an
>> asynchronous context.
>> For a driver that's already performing DMA operations on usermode memory,
>> it seems likely that there's going to be *some* place where you can call
>> bus_dmamap_sync() and be guaranteed to be executing in the context of the
>> process that owns the memory.  Then a regular bcopy will be safe and
>> inexpensive, assuming the pages have been properly vslock-ed/vm_map_wire-d.
>> That's usually whatever read/write/ioctl operation spawned the DMA transfer
>> in the first place.  So, in those cases can you not just move the
>> POSTREAD/POSTWRITE sync from the "DMA-finished" interrupt to the
>> d_read/d_write/d_ioctl that waits on the "DMA-finished" interrupt?
>>
>> --physcopyin/physcopyout aren't trivial.  They go through uiomove_fromphys,
>> which often uses sfbufs to create temporary KVA mappings for the physical
>> pages.  sf_buf_alloc() can sleep (unless SFB_NOWAIT is specified, which
>> means it can fail and which uiomove_fromphys does not specify for good
>> reason); that makes it unsafe for use in either a threaded interrupt or a
>> filter.  Perhaps the physcopyout path could be changed to use pmap_qenter
>> directly in this case, but that can still be expensive in terms of TLB
>> shootdowns.
>>
>> Checking against VM_MIN_KERNEL_ADDRESS seems sketchy; it eliminates the
>> chance to use a much-less-expensive bcopy in cases where the sync is
>> happening in correct process context.
>>
>> Context-switching during bus_dmamap_sync() shouldn't be an issue.  In a
>> filter interrupt, curproc will be completely arbitrary but none of this
>> stuff should be called in a filter anyway.  Otherwise, if you're in a
>> kernel thread (including an ithread), curproc should be whatever proc was
>> supplied when the thread was created.  That's usually proc0, which only has
>> kernel address space.  IOW, even if a context-switch happens sometime
>> during bus_dmamap_sync, any pmap check or copy should have a consistent and
>> non-arbitrary process context.
>>
>> I think something like your second solution would be workable to make
>> UIO_USERSPACE syncs work in non-interrupt kernel threads, but given all the
>> restrictions and extra cost of physcopy, I'm not sure how useful that would
>> be.
>>
>> I do think busdma.9 could at least use a note that bus_dmamap_sync() is
>> only safe to call in the context of the owning process for user buffers.
>
> UIO_USERSPACE for busdma is absolutely unsafe and cannot be used without
> making kernel panicing.  Even if you wire the userspace bufer, there is
> nothing which could prevent other thread in the user process, or other
> process sharing the same address space, to call munmap(2) on the range.
>


Using of vslock() is proposed method in bus_dma man page. IMO, the
function looks complex and can be a big time eater. However, are you
saying that vslock() does not work for that? Then for what reason does
that function exist?


> The only safe method to work with the userspace regions is to
> vm_fault_quick_hold() them to get hold on the pages, and then either
> pass pages array down, or remap them in the KVA with pmap_qenter().
>


So, even vm_fault_quick_hold() does not keep valid user mapping?


>>
>>
>> On Fri, Apr 24, 2015 at 8:13 AM, Svatopluk Kraus <onwahe@gmail.com> wrote:
>>
>> > DMA can be done on client buffer from user address space. For example,
>> > thru bus_dmamap_load_uio() when uio->uio_segflg is UIO_USERSPACE. Such
>> > client buffer can bounce and then, it must be copied to and from
>> > bounce buffer in bus_dmamap_sync().
>> >
>> > Current implementations (in all archs) do not take into account that
>> > bus_dmamap_sync() is asynchronous for POSTWRITE and POSTREAD in
>> > general. It can be asynchronous for PREWRITE and PREREAD too. For
>> > example, in some driver implementations where DMA client buffers
>> > operations are buffered. In those cases, simple bcopy() is not
>> > correct.
>> >
>> > Demonstration of current implementation (x86) is the following:
>> >
>> > -----------------------------
>> > struct bounce_page {
>> >     vm_offset_t vaddr;      /* kva of bounce buffer */
>> >     bus_addr_t  busaddr;    /* Physical address */
>> >     vm_offset_t datavaddr;  /* kva of client data */
>> >     bus_addr_t  dataaddr;   /* client physical address */
>> >     bus_size_t  datacount;  /* client data count */
>> >     STAILQ_ENTRY(bounce_page) links;
>> > };
>> >
>> >
>> > if ((op & BUS_DMASYNC_PREWRITE) != 0) {
>> >     while (bpage != NULL) {
>> >         if (bpage->datavaddr != 0) {
>> >             bcopy((void *)bpage->datavaddr,
>> >                 (void *)bpage->vaddr,
>> >                 bpage->datacount);
>> >         } else {
>> >             physcopyout(bpage->dataaddr,
>> >                 (void *)bpage->vaddr,
>> >                 bpage->datacount);
>> >         }
>> >         bpage = STAILQ_NEXT(bpage, links);
>> >     }
>> >     dmat->bounce_zone->total_bounced++;
>> > }
>> > -----------------------------
>> >
>> > There are two things:
>> >
>> > (1) datavaddr is not always kva of client data, but sometimes it can
>> > be uva of client data.
>> > (2) bcopy() can be used only if datavaddr is kva or when map->pmap is
>> > current pmap.
>> >
>> > Note that there is an implication for bus_dmamap_load_uio() with
>> > uio->uio_segflg set to UIO_USERSPACE that used physical pages are
>> > in-core and wired. See "man bus_dma".
>> >
>> > There is not public interface to check that map->pmap is current pmap.
>> > So one solution is the following:
>> >
>> > if (bpage->datavaddr >= VM_MIN_KERNEL_ADDRESS) {
>> >         bcopy();
>> > } else {
>> >         physcopy();
>> > }
>> >
>> > If there will be public pmap_is_current() then another solution is the
>> > following:
>> >
>> > if (bpage->datavaddr != 0) && pmap_is_current(map->pmap)) {
>> >         bcopy();
>> > } else {
>> >         physcopy();
>> > }
>> >
>> > The second solution implies that context switch must not happen during
>> > bus_dmamap_sync() called from an interrupt routine. However, IMO, it's
>> > granted.
>> >
>> > Note that map->pmap should be always kernel_pmap for datavaddr >=
>> > VM_MIN_KERNEL_ADDRESS.
>> >
>> > Comments, different solutions, or objections?
>> >
>> > Svatopluk Kraus
>> > _______________________________________________
>> > freebsd-arch@freebsd.org mailing list
>> > http://lists.freebsd.org/mailman/listinfo/freebsd-arch
>> > To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org"
>> >
>> _______________________________________________
>> freebsd-arch@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-arch
>> To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org"



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAFHCsPXzU_Cu51bj8vc_5e7to4GqdFDUvHvwWM1mAxJ5LvemXw>