From owner-freebsd-arch@FreeBSD.ORG Sat Apr 25 09:41:59 2015 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 8F0C253E for ; Sat, 25 Apr 2015 09:41:59 +0000 (UTC) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 193F91BA0 for ; Sat, 25 Apr 2015 09:41:58 +0000 (UTC) Received: from tom.home (kostik@localhost [127.0.0.1]) by kib.kiev.ua (8.14.9/8.14.9) with ESMTP id t3P9frQR000982 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Sat, 25 Apr 2015 12:41:53 +0300 (EEST) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.9.2 kib.kiev.ua t3P9frQR000982 Received: (from kostik@localhost) by tom.home (8.14.9/8.14.9/Submit) id t3P9fqVn000981; Sat, 25 Apr 2015 12:41:52 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Sat, 25 Apr 2015 12:41:52 +0300 From: Konstantin Belousov To: Jason Harmening Cc: Svatopluk Kraus , FreeBSD Arch Subject: Re: bus_dmamap_sync() for bounced client buffers from user address space Message-ID: <20150425094152.GE2390@kib.kiev.ua> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.23 (2014-03-12) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.0 X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on tom.home X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 25 Apr 2015 09:41:59 -0000 On Fri, Apr 24, 2015 at 05:50:15PM -0500, Jason Harmening wrote: > A couple of comments: > > --POSTWRITE and POSTREAD are only asynchronous if you call them from an > asynchronous context. > For a driver that's already performing DMA operations on usermode memory, > it seems likely that there's going to be *some* place where you can call > bus_dmamap_sync() and be guaranteed to be executing in the context of the > process that owns the memory. Then a regular bcopy will be safe and > inexpensive, assuming the pages have been properly vslock-ed/vm_map_wire-d. > That's usually whatever read/write/ioctl operation spawned the DMA transfer > in the first place. So, in those cases can you not just move the > POSTREAD/POSTWRITE sync from the "DMA-finished" interrupt to the > d_read/d_write/d_ioctl that waits on the "DMA-finished" interrupt? > > --physcopyin/physcopyout aren't trivial. They go through uiomove_fromphys, > which often uses sfbufs to create temporary KVA mappings for the physical > pages. sf_buf_alloc() can sleep (unless SFB_NOWAIT is specified, which > means it can fail and which uiomove_fromphys does not specify for good > reason); that makes it unsafe for use in either a threaded interrupt or a > filter. Perhaps the physcopyout path could be changed to use pmap_qenter > directly in this case, but that can still be expensive in terms of TLB > shootdowns. > > Checking against VM_MIN_KERNEL_ADDRESS seems sketchy; it eliminates the > chance to use a much-less-expensive bcopy in cases where the sync is > happening in correct process context. > > Context-switching during bus_dmamap_sync() shouldn't be an issue. In a > filter interrupt, curproc will be completely arbitrary but none of this > stuff should be called in a filter anyway. Otherwise, if you're in a > kernel thread (including an ithread), curproc should be whatever proc was > supplied when the thread was created. That's usually proc0, which only has > kernel address space. IOW, even if a context-switch happens sometime > during bus_dmamap_sync, any pmap check or copy should have a consistent and > non-arbitrary process context. > > I think something like your second solution would be workable to make > UIO_USERSPACE syncs work in non-interrupt kernel threads, but given all the > restrictions and extra cost of physcopy, I'm not sure how useful that would > be. > > I do think busdma.9 could at least use a note that bus_dmamap_sync() is > only safe to call in the context of the owning process for user buffers. UIO_USERSPACE for busdma is absolutely unsafe and cannot be used without making kernel panicing. Even if you wire the userspace bufer, there is nothing which could prevent other thread in the user process, or other process sharing the same address space, to call munmap(2) on the range. The only safe method to work with the userspace regions is to vm_fault_quick_hold() them to get hold on the pages, and then either pass pages array down, or remap them in the KVA with pmap_qenter(). > > > On Fri, Apr 24, 2015 at 8:13 AM, Svatopluk Kraus wrote: > > > DMA can be done on client buffer from user address space. For example, > > thru bus_dmamap_load_uio() when uio->uio_segflg is UIO_USERSPACE. Such > > client buffer can bounce and then, it must be copied to and from > > bounce buffer in bus_dmamap_sync(). > > > > Current implementations (in all archs) do not take into account that > > bus_dmamap_sync() is asynchronous for POSTWRITE and POSTREAD in > > general. It can be asynchronous for PREWRITE and PREREAD too. For > > example, in some driver implementations where DMA client buffers > > operations are buffered. In those cases, simple bcopy() is not > > correct. > > > > Demonstration of current implementation (x86) is the following: > > > > ----------------------------- > > struct bounce_page { > > vm_offset_t vaddr; /* kva of bounce buffer */ > > bus_addr_t busaddr; /* Physical address */ > > vm_offset_t datavaddr; /* kva of client data */ > > bus_addr_t dataaddr; /* client physical address */ > > bus_size_t datacount; /* client data count */ > > STAILQ_ENTRY(bounce_page) links; > > }; > > > > > > if ((op & BUS_DMASYNC_PREWRITE) != 0) { > > while (bpage != NULL) { > > if (bpage->datavaddr != 0) { > > bcopy((void *)bpage->datavaddr, > > (void *)bpage->vaddr, > > bpage->datacount); > > } else { > > physcopyout(bpage->dataaddr, > > (void *)bpage->vaddr, > > bpage->datacount); > > } > > bpage = STAILQ_NEXT(bpage, links); > > } > > dmat->bounce_zone->total_bounced++; > > } > > ----------------------------- > > > > There are two things: > > > > (1) datavaddr is not always kva of client data, but sometimes it can > > be uva of client data. > > (2) bcopy() can be used only if datavaddr is kva or when map->pmap is > > current pmap. > > > > Note that there is an implication for bus_dmamap_load_uio() with > > uio->uio_segflg set to UIO_USERSPACE that used physical pages are > > in-core and wired. See "man bus_dma". > > > > There is not public interface to check that map->pmap is current pmap. > > So one solution is the following: > > > > if (bpage->datavaddr >= VM_MIN_KERNEL_ADDRESS) { > > bcopy(); > > } else { > > physcopy(); > > } > > > > If there will be public pmap_is_current() then another solution is the > > following: > > > > if (bpage->datavaddr != 0) && pmap_is_current(map->pmap)) { > > bcopy(); > > } else { > > physcopy(); > > } > > > > The second solution implies that context switch must not happen during > > bus_dmamap_sync() called from an interrupt routine. However, IMO, it's > > granted. > > > > Note that map->pmap should be always kernel_pmap for datavaddr >= > > VM_MIN_KERNEL_ADDRESS. > > > > Comments, different solutions, or objections? > > > > Svatopluk Kraus > > _______________________________________________ > > freebsd-arch@freebsd.org mailing list > > http://lists.freebsd.org/mailman/listinfo/freebsd-arch > > To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org" > > > _______________________________________________ > freebsd-arch@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-arch > To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org"