From owner-freebsd-arch@FreeBSD.ORG Fri Apr 24 22:50:17 2015 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 9CFB1E97 for ; Fri, 24 Apr 2015 22:50:17 +0000 (UTC) Received: from mail-ig0-x22a.google.com (mail-ig0-x22a.google.com [IPv6:2607:f8b0:4001:c05::22a]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 6D06E1DB7 for ; Fri, 24 Apr 2015 22:50:17 +0000 (UTC) Received: by igblo3 with SMTP id lo3so26314346igb.1 for ; Fri, 24 Apr 2015 15:50:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=+6oZdQd1ywzWxCvca3iMRpGGsKaLnEXcKMTmIeoLY9o=; b=M8nljuxr0nsgKx96usLmWtRW34WeR0npjC58+NEOpNUOfMuOY/KI5hD8h2684zeTzH TDjRBvACEomuKh6N0x7xqEcCWgs6AWmsxoRxT/A0sPgmW37btAqWb1OpjU2IF6Rda/Ya 3yXE+SvqM23maG9+xMwXPJV/JTpbuBE9CptFC9LHPT2dRq1dvhLNYL8z/HubWkou9K1J 9pZiGgol6V3vYog2iSceK7SatyI+7qMaaSO30dTWqp2RwYnJL68y5mgO9V1HVOM6xEtl AFvcsk7zMSenU0y42YBLzF2ibq4FgHkZObXeM5W8VDhge831cCCCL4e2ZLDFoWxenR6Z NWBg== MIME-Version: 1.0 X-Received: by 10.50.36.66 with SMTP id o2mr354712igj.16.1429915815629; Fri, 24 Apr 2015 15:50:15 -0700 (PDT) Received: by 10.36.106.70 with HTTP; Fri, 24 Apr 2015 15:50:15 -0700 (PDT) In-Reply-To: References: Date: Fri, 24 Apr 2015 17:50:15 -0500 Message-ID: Subject: Re: bus_dmamap_sync() for bounced client buffers from user address space From: Jason Harmening To: Svatopluk Kraus Cc: FreeBSD Arch Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.20 X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 24 Apr 2015 22:50:17 -0000 A couple of comments: --POSTWRITE and POSTREAD are only asynchronous if you call them from an asynchronous context. For a driver that's already performing DMA operations on usermode memory, it seems likely that there's going to be *some* place where you can call bus_dmamap_sync() and be guaranteed to be executing in the context of the process that owns the memory. Then a regular bcopy will be safe and inexpensive, assuming the pages have been properly vslock-ed/vm_map_wire-d. That's usually whatever read/write/ioctl operation spawned the DMA transfer in the first place. So, in those cases can you not just move the POSTREAD/POSTWRITE sync from the "DMA-finished" interrupt to the d_read/d_write/d_ioctl that waits on the "DMA-finished" interrupt? --physcopyin/physcopyout aren't trivial. They go through uiomove_fromphys, which often uses sfbufs to create temporary KVA mappings for the physical pages. sf_buf_alloc() can sleep (unless SFB_NOWAIT is specified, which means it can fail and which uiomove_fromphys does not specify for good reason); that makes it unsafe for use in either a threaded interrupt or a filter. Perhaps the physcopyout path could be changed to use pmap_qenter directly in this case, but that can still be expensive in terms of TLB shootdowns. Checking against VM_MIN_KERNEL_ADDRESS seems sketchy; it eliminates the chance to use a much-less-expensive bcopy in cases where the sync is happening in correct process context. Context-switching during bus_dmamap_sync() shouldn't be an issue. In a filter interrupt, curproc will be completely arbitrary but none of this stuff should be called in a filter anyway. Otherwise, if you're in a kernel thread (including an ithread), curproc should be whatever proc was supplied when the thread was created. That's usually proc0, which only has kernel address space. IOW, even if a context-switch happens sometime during bus_dmamap_sync, any pmap check or copy should have a consistent and non-arbitrary process context. I think something like your second solution would be workable to make UIO_USERSPACE syncs work in non-interrupt kernel threads, but given all the restrictions and extra cost of physcopy, I'm not sure how useful that would be. I do think busdma.9 could at least use a note that bus_dmamap_sync() is only safe to call in the context of the owning process for user buffers. On Fri, Apr 24, 2015 at 8:13 AM, Svatopluk Kraus wrote: > DMA can be done on client buffer from user address space. For example, > thru bus_dmamap_load_uio() when uio->uio_segflg is UIO_USERSPACE. Such > client buffer can bounce and then, it must be copied to and from > bounce buffer in bus_dmamap_sync(). > > Current implementations (in all archs) do not take into account that > bus_dmamap_sync() is asynchronous for POSTWRITE and POSTREAD in > general. It can be asynchronous for PREWRITE and PREREAD too. For > example, in some driver implementations where DMA client buffers > operations are buffered. In those cases, simple bcopy() is not > correct. > > Demonstration of current implementation (x86) is the following: > > ----------------------------- > struct bounce_page { > vm_offset_t vaddr; /* kva of bounce buffer */ > bus_addr_t busaddr; /* Physical address */ > vm_offset_t datavaddr; /* kva of client data */ > bus_addr_t dataaddr; /* client physical address */ > bus_size_t datacount; /* client data count */ > STAILQ_ENTRY(bounce_page) links; > }; > > > if ((op & BUS_DMASYNC_PREWRITE) != 0) { > while (bpage != NULL) { > if (bpage->datavaddr != 0) { > bcopy((void *)bpage->datavaddr, > (void *)bpage->vaddr, > bpage->datacount); > } else { > physcopyout(bpage->dataaddr, > (void *)bpage->vaddr, > bpage->datacount); > } > bpage = STAILQ_NEXT(bpage, links); > } > dmat->bounce_zone->total_bounced++; > } > ----------------------------- > > There are two things: > > (1) datavaddr is not always kva of client data, but sometimes it can > be uva of client data. > (2) bcopy() can be used only if datavaddr is kva or when map->pmap is > current pmap. > > Note that there is an implication for bus_dmamap_load_uio() with > uio->uio_segflg set to UIO_USERSPACE that used physical pages are > in-core and wired. See "man bus_dma". > > There is not public interface to check that map->pmap is current pmap. > So one solution is the following: > > if (bpage->datavaddr >= VM_MIN_KERNEL_ADDRESS) { > bcopy(); > } else { > physcopy(); > } > > If there will be public pmap_is_current() then another solution is the > following: > > if (bpage->datavaddr != 0) && pmap_is_current(map->pmap)) { > bcopy(); > } else { > physcopy(); > } > > The second solution implies that context switch must not happen during > bus_dmamap_sync() called from an interrupt routine. However, IMO, it's > granted. > > Note that map->pmap should be always kernel_pmap for datavaddr >= > VM_MIN_KERNEL_ADDRESS. > > Comments, different solutions, or objections? > > Svatopluk Kraus > _______________________________________________ > freebsd-arch@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-arch > To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org" >