Date: Sun, 26 Apr 2015 22:00:32 +0200 From: Svatopluk Kraus <onwahe@gmail.com> To: Konstantin Belousov <kostikbel@gmail.com> Cc: Jason Harmening <jason.harmening@gmail.com>, FreeBSD Arch <freebsd-arch@freebsd.org> Subject: Re: bus_dmamap_sync() for bounced client buffers from user address space Message-ID: <CAFHCsPXzU_Cu51bj8vc_5e7to4GqdFDUvHvwWM1mAxJ5LvemXw@mail.gmail.com> In-Reply-To: <20150425094152.GE2390@kib.kiev.ua> References: <CAFHCsPXMjge84AR2cR8KXMXWP4kH2YvuV_uqtPKUvn5C3ygknw@mail.gmail.com> <CAM=8qan-4SbKJaddrfkv=HG3n%2BHaOPDL5MEPS9DoaTvnhrJPZQ@mail.gmail.com> <20150425094152.GE2390@kib.kiev.ua>
next in thread | previous in thread | raw e-mail | index | archive | help
On Sat, Apr 25, 2015 at 11:41 AM, Konstantin Belousov <kostikbel@gmail.com> wrote: > On Fri, Apr 24, 2015 at 05:50:15PM -0500, Jason Harmening wrote: >> A couple of comments: >> >> --POSTWRITE and POSTREAD are only asynchronous if you call them from an >> asynchronous context. >> For a driver that's already performing DMA operations on usermode memory, >> it seems likely that there's going to be *some* place where you can call >> bus_dmamap_sync() and be guaranteed to be executing in the context of the >> process that owns the memory. Then a regular bcopy will be safe and >> inexpensive, assuming the pages have been properly vslock-ed/vm_map_wire-d. >> That's usually whatever read/write/ioctl operation spawned the DMA transfer >> in the first place. So, in those cases can you not just move the >> POSTREAD/POSTWRITE sync from the "DMA-finished" interrupt to the >> d_read/d_write/d_ioctl that waits on the "DMA-finished" interrupt? >> >> --physcopyin/physcopyout aren't trivial. They go through uiomove_fromphys, >> which often uses sfbufs to create temporary KVA mappings for the physical >> pages. sf_buf_alloc() can sleep (unless SFB_NOWAIT is specified, which >> means it can fail and which uiomove_fromphys does not specify for good >> reason); that makes it unsafe for use in either a threaded interrupt or a >> filter. Perhaps the physcopyout path could be changed to use pmap_qenter >> directly in this case, but that can still be expensive in terms of TLB >> shootdowns. >> >> Checking against VM_MIN_KERNEL_ADDRESS seems sketchy; it eliminates the >> chance to use a much-less-expensive bcopy in cases where the sync is >> happening in correct process context. >> >> Context-switching during bus_dmamap_sync() shouldn't be an issue. In a >> filter interrupt, curproc will be completely arbitrary but none of this >> stuff should be called in a filter anyway. Otherwise, if you're in a >> kernel thread (including an ithread), curproc should be whatever proc was >> supplied when the thread was created. That's usually proc0, which only has >> kernel address space. IOW, even if a context-switch happens sometime >> during bus_dmamap_sync, any pmap check or copy should have a consistent and >> non-arbitrary process context. >> >> I think something like your second solution would be workable to make >> UIO_USERSPACE syncs work in non-interrupt kernel threads, but given all the >> restrictions and extra cost of physcopy, I'm not sure how useful that would >> be. >> >> I do think busdma.9 could at least use a note that bus_dmamap_sync() is >> only safe to call in the context of the owning process for user buffers. > > UIO_USERSPACE for busdma is absolutely unsafe and cannot be used without > making kernel panicing. Even if you wire the userspace bufer, there is > nothing which could prevent other thread in the user process, or other > process sharing the same address space, to call munmap(2) on the range. > Using of vslock() is proposed method in bus_dma man page. IMO, the function looks complex and can be a big time eater. However, are you saying that vslock() does not work for that? Then for what reason does that function exist? > The only safe method to work with the userspace regions is to > vm_fault_quick_hold() them to get hold on the pages, and then either > pass pages array down, or remap them in the KVA with pmap_qenter(). > So, even vm_fault_quick_hold() does not keep valid user mapping? >> >> >> On Fri, Apr 24, 2015 at 8:13 AM, Svatopluk Kraus <onwahe@gmail.com> wrote: >> >> > DMA can be done on client buffer from user address space. For example, >> > thru bus_dmamap_load_uio() when uio->uio_segflg is UIO_USERSPACE. Such >> > client buffer can bounce and then, it must be copied to and from >> > bounce buffer in bus_dmamap_sync(). >> > >> > Current implementations (in all archs) do not take into account that >> > bus_dmamap_sync() is asynchronous for POSTWRITE and POSTREAD in >> > general. It can be asynchronous for PREWRITE and PREREAD too. For >> > example, in some driver implementations where DMA client buffers >> > operations are buffered. In those cases, simple bcopy() is not >> > correct. >> > >> > Demonstration of current implementation (x86) is the following: >> > >> > ----------------------------- >> > struct bounce_page { >> > vm_offset_t vaddr; /* kva of bounce buffer */ >> > bus_addr_t busaddr; /* Physical address */ >> > vm_offset_t datavaddr; /* kva of client data */ >> > bus_addr_t dataaddr; /* client physical address */ >> > bus_size_t datacount; /* client data count */ >> > STAILQ_ENTRY(bounce_page) links; >> > }; >> > >> > >> > if ((op & BUS_DMASYNC_PREWRITE) != 0) { >> > while (bpage != NULL) { >> > if (bpage->datavaddr != 0) { >> > bcopy((void *)bpage->datavaddr, >> > (void *)bpage->vaddr, >> > bpage->datacount); >> > } else { >> > physcopyout(bpage->dataaddr, >> > (void *)bpage->vaddr, >> > bpage->datacount); >> > } >> > bpage = STAILQ_NEXT(bpage, links); >> > } >> > dmat->bounce_zone->total_bounced++; >> > } >> > ----------------------------- >> > >> > There are two things: >> > >> > (1) datavaddr is not always kva of client data, but sometimes it can >> > be uva of client data. >> > (2) bcopy() can be used only if datavaddr is kva or when map->pmap is >> > current pmap. >> > >> > Note that there is an implication for bus_dmamap_load_uio() with >> > uio->uio_segflg set to UIO_USERSPACE that used physical pages are >> > in-core and wired. See "man bus_dma". >> > >> > There is not public interface to check that map->pmap is current pmap. >> > So one solution is the following: >> > >> > if (bpage->datavaddr >= VM_MIN_KERNEL_ADDRESS) { >> > bcopy(); >> > } else { >> > physcopy(); >> > } >> > >> > If there will be public pmap_is_current() then another solution is the >> > following: >> > >> > if (bpage->datavaddr != 0) && pmap_is_current(map->pmap)) { >> > bcopy(); >> > } else { >> > physcopy(); >> > } >> > >> > The second solution implies that context switch must not happen during >> > bus_dmamap_sync() called from an interrupt routine. However, IMO, it's >> > granted. >> > >> > Note that map->pmap should be always kernel_pmap for datavaddr >= >> > VM_MIN_KERNEL_ADDRESS. >> > >> > Comments, different solutions, or objections? >> > >> > Svatopluk Kraus >> > _______________________________________________ >> > freebsd-arch@freebsd.org mailing list >> > http://lists.freebsd.org/mailman/listinfo/freebsd-arch >> > To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org" >> > >> _______________________________________________ >> freebsd-arch@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-arch >> To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org"
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAFHCsPXzU_Cu51bj8vc_5e7to4GqdFDUvHvwWM1mAxJ5LvemXw>