From: John Baldwin
To: Ian Lepore
Cc: scottl@freebsd.org, Peter Jeremy, current@freebsd.org
Date: Thu, 12 Jul 2012 11:36:08 -0400
Subject: Re: Adding support for WC (write-combining) memory to bus_dma
Message-Id: <201207121136.08387.jhb@freebsd.org>
In-Reply-To: <1342105327.1123.66.camel@revolution.hippie.lan>
References: <201207121040.27116.jhb@freebsd.org>
 <1342105327.1123.66.camel@revolution.hippie.lan>
List-Id: Discussions about the use of FreeBSD-current

On Thursday, July 12, 2012 11:02:07 am Ian Lepore wrote:
> On Thu, 2012-07-12 at 10:40 -0400, John Baldwin wrote:
> > I have a need to allocate static DMA memory via bus_dmamem_alloc() that is
> > also WC (for a PCI-e device so it can use "nosnoop" transactions).
> > This is similar to what the nvidia driver needs, but in my case it is much
> > cleaner to allocate the memory via bus dma since the existing code I am
> > extending all uses busdma.
> >
> > I have a patch to implement this on 8.x for amd64 that I can port to HEAD if
> > folks don't object.  What I would really like to do is add a new parameter to
> > bus_dmamem_alloc() to specify the memory attribute to use, but I am hesitant
> > to break that API.  Instead, I added a new flag similar to the existing
> > BUS_DMA_NOCACHE used to allocate UC memory.
> >
> > While doing this, I ran into an old bug, which is that if you were to call
> > bus_dmamem_alloc() with BUS_DMA_NOCACHE but a tag that otherwise fell through
> > to using malloc() instead of contigmalloc(), bus_dmamem_alloc() would actually
> > change the state of the entire page.  This seems wrong.  Instead, I think that
> > any request for a non-default memory attribute should always use
> > contigmalloc().
>
> The problem I have with this (already, even before your proposed
> changes) is that contigmalloc() is only able to allocate pages.  In the
> ARM world we have a need to allocate BUS_DMA_COHERENT memory (same
> effect as BUS_DMA_NOCACHE; we should consolidate these names) that is
> aligned to a 32-byte boundary (cacheline-aligned) but usually the buffer
> is far smaller than a page, often smaller than 1k, and sometimes we need
> lots of them (allocating 128 pages for ethernet buffers, with only half
> of each page used, is unreasonably expensive on a platform with only
> 64 MB to begin with).
>
> I keep thinking what's needed is a busdma allocation helper routine,
> something MI that can be used by the various MD busdma implementations,
> that can manage a pool of pages that are flagged as uncachable and can
> subdivide those pages to provide small blocks of memory that fit various
> alignment and boundary restrictions.
>
> To be clear, I'm not objecting to your proposed changes, I'm more just
> musing that similar problems exist in non-x86 architectures and maybe an
> MI solution is possible (or at least the groundwork could be laid)?

The traditional argument I've heard against this is that the relevant driver
should allocate a big block and manage suballocations on its own rather than
pushing that work into bus_dma.

How are you allocating Ethernet buffers btw?  Are you not using mbuf clusters
to receive packets, but instead allocating mbufs in your RX interrupt handler
and copying data out of static buffers into the mbufs to send up the stack?

Also, I do not think BUS_DMA_COHERENT and BUS_DMA_NOCACHE are quite the same.
I see UC as a way to implement COHERENT semantics, but it also seems to me
that a COHERENT mapping can't use bounce pages either.  OTOH, NOCACHE (and my
new flag) specifically request a certain mapping behavior, not necessarily to
avoid bus_dmamap_sync() operations, but due to a hardware requirement (e.g.
the WC mapping is to enable use of "nosnoop" PCI-e transactions).  That is, I
interpret COHERENT as meaning "this map doesn't require bus_dmamap_sync(), do
whatever it takes to make that true", whereas NOCACHE and WC have other
meanings (though in practice NOCACHE and WC both imply COHERENT).  For
example, on x86 with caches that snoop DMA transactions, COHERENT doesn't
require NOCACHE at all, it simply requires avoiding the use of bounce pages.

-- 
John Baldwin
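
The "allocate a big block and manage suballocations on its own" approach
described above could look roughly like the following. This is only a minimal
user-space sketch under stated assumptions, not code from any driver or from
the patch under discussion: the dma_pool_* names are hypothetical,
aligned_alloc() stands in for what would be a single bus_dmamem_alloc() call
in a real driver, the 32-byte line size is the ARM figure quoted in the
thread, and locking and bus_dmamap_load() handling are left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define CACHE_LINE 32u  /* assumed cache line size (the ARM figure from the thread) */

/* Hypothetical driver-private pool: one big block carved into equal slots. */
struct dma_pool {
    unsigned char *base;      /* start of the single big allocation */
    size_t         slot_size; /* buffer size rounded up to a cache line multiple */
    size_t         nslots;    /* total number of slots in the block */
    size_t        *free_idx;  /* stack of free slot indices, kept outside the block */
    size_t         nfree;     /* number of entries currently on the stack */
};

/* Round n up to a multiple of align (align must be a power of two). */
static size_t round_up(size_t n, size_t align)
{
    return (n + align - 1) & ~(align - 1);
}

/* Allocate the big block once and mark every slot free. Returns 0 on success. */
int dma_pool_init(struct dma_pool *p, size_t buf_size, size_t nbufs)
{
    size_t i;

    if (buf_size == 0 || nbufs == 0)
        return -1;
    p->slot_size = round_up(buf_size, CACHE_LINE);
    if (nbufs > SIZE_MAX / p->slot_size)
        return -1;                          /* total size would overflow */
    p->nslots = nbufs;

    /* Assumption: stands in for one bus_dmamem_alloc() of the whole block. */
    p->base = aligned_alloc(CACHE_LINE, p->slot_size * nbufs);
    p->free_idx = malloc(nbufs * sizeof(*p->free_idx));
    if (p->base == NULL || p->free_idx == NULL) {
        free(p->base);
        free(p->free_idx);
        return -1;
    }
    /* Push indices in reverse so that slot 0 is handed out first. */
    for (i = 0; i < nbufs; i++)
        p->free_idx[i] = nbufs - 1 - i;
    p->nfree = nbufs;
    return 0;
}

/* Hand out one cache-line-aligned buffer, or NULL if the pool is exhausted. */
void *dma_pool_get(struct dma_pool *p)
{
    if (p->nfree == 0)
        return NULL;
    return p->base + p->free_idx[--p->nfree] * p->slot_size;
}

/* Return a buffer previously obtained from dma_pool_get() on the same pool. */
void dma_pool_put(struct dma_pool *p, void *buf)
{
    size_t idx = (size_t)((unsigned char *)buf - p->base) / p->slot_size;

    p->free_idx[p->nfree++] = idx;
}

/* Release the big block and the bookkeeping array. */
void dma_pool_destroy(struct dma_pool *p)
{
    free(p->base);
    free(p->free_idx);
    p->base = NULL;
    p->free_idx = NULL;
    p->nfree = 0;
}
```

With sub-page buffers, several slots share each page of the big block instead
of each buffer consuming a page of its own, which is the waste Ian describes.
The free-slot bookkeeping is kept outside the block here so that nothing but
buffer data lives in the memory a device would see.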