Date: Fri, 20 Mar 2015 09:41:37 -0600
From: Ian Lepore <ian@freebsd.org>
To: Warner Losh <imp@bsdimp.com>
Cc: John Wehle <john@feith.com>, freebsd-arm@freebsd.org
Subject: Re: current meaning of BUS_DMA_COHERENT
Message-ID: <1426866097.24655.33.camel@freebsd.org>
In-Reply-To: <16E52DDC-D2DC-4878-9C43-451278AE7B4E@bsdimp.com>
References: <201503200535.t2K5ZQdo011380@jwlab.FEITH.COM> <1426861647.24655.12.camel@freebsd.org> <16E52DDC-D2DC-4878-9C43-451278AE7B4E@bsdimp.com>

On Fri, 2015-03-20 at 09:12 -0600, Warner Losh wrote:
> > On Mar 20, 2015, at 8:27 AM, Ian Lepore <ian@freebsd.org> wrote:
> > What we really need is a new type of busdma memory (BUS_DMA_DESCRIPTOR)
> > and a special sync call to use in conjunction with it that takes an
> > offset and length, and the sync is a single operation, no pre/post
> > stuff.  Then you could sync each descriptor immediately before reading
> > and writing it, which would translate to a single cacheline flush
> > instead of a loop that does all the lines in the whole ring.
>
> We don't need a special type of memory for this. NetBSD doesn't have
> that. Instead it implements a range on the sync operation. We could easily
> just do that. We already have flags to disable bouncing, which is also
> required for interacting with descriptor rings.
>

IMO, we do need a special type of memory, because of how fine-grained
the busdma API is.  Knowing that the memory is going to be used for
shared descriptors is something that should be indicated in the tag, not
something that is unknown until the sync ops happen.  The fact that the
memory is going to be used for descriptors is essentially another type
of mapping constraint, and the tag is where constraints are listed.  For
example, descriptor memory must be contiguous, and must not bounce, and
if you don't know that at tag and map creation time you end up
preallocating bounce resources that will never be used.

Any given implementation may be able to allocate better resources for
shared descriptor memory, this is similar to the original meaning of
COHERENT.  It may even be required to use a different type of resource
to g'tee the semantics needed for shared descriptor access.  But maybe
that's a limited resource, or a special kind of memory that's not
appropriate for normal bulk DMA transfers, so the implementation needs
to know that up front.
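
To put rough numbers on the "single cacheline flush instead of a loop
over the whole ring" point quoted above, here is a small user-space
sketch.  It is not busdma code: lines_touched() is a hypothetical
helper, and the 64-byte cache line, 16-byte descriptor and 256-entry
ring used in the note below are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of busdma): count how many cache lines
 * a sync over the byte range [offset, offset + len) would have to
 * touch, for a cache with the given line size.
 */
static size_t
lines_touched(size_t offset, size_t len, size_t line_size)
{
	size_t first, last;

	assert(line_size > 0);
	if (len == 0)
		return (0);
	first = offset / line_size;		/* line holding the first byte */
	last = (offset + len - 1) / line_size;	/* line holding the last byte */
	return (last - first + 1);
}
```

Under those assumptions, syncing the whole ring is
lines_touched(0, 256 * 16, 64), which covers 64 lines, while syncing
only descriptor 5 is lines_touched(5 * 16, 16, 64), which covers one.
A descriptor that straddles a line boundary costs two lines.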

Conversely, trying to perform the new descriptor-style sync on memory
that wasn't allocated using the DESCRIPTOR type might be an error, so
that's another reason to have it flagged as something different.

> I'm curious where you need to do both a pre read and a post read before
> you read the ring. Why is that needed?

If you don't perform a POST operation that matches every PRE operation
you invoke, you are making unwarranted assumptions about the busdma
implementation.  The manpage strongly implies (and IMO should just
directly say) that PRE and POST sync ops come in pairs.  (When bounce
buffers are involved you absolutely must pair them up or really bad
things happen.)

Also, on any given platform, what's the right sync op to use before
accessing a descriptor with the CPU that may have been updated by the
hardware?  Is that a PREREAD?  Or a POSTREAD?  Whichever you think it
is, are you 100% certain that your thinking is correct for every
platform and busdma implementation, and that that will remain true for
all time?  We have implementations that do all the work in PREREAD and
POSTREAD is essentially a no-op.  We have other implementations that
split some of the required work between the two ops and if you don't do
both halves you don't get everything in sync.

-- Ian
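
As an illustration of why PRE and POST ops need to be paired when
bounce buffers are involved, here is a toy user-space model.  Every
name in it (toy_map, toy_sync, toy_device_write, the TOY_* ops) is
hypothetical; it is not the busdma API or any real implementation.  It
only mimics one behaviour: data the device wrote into a bounce buffer
reaches the driver's buffer in the POSTREAD step.

```c
#include <assert.h>
#include <string.h>

#define TOY_LEN 16

/* Hypothetical sync ops, loosely mirroring the PRE/POST read/write split. */
enum toy_sync_op { TOY_PREREAD, TOY_POSTREAD, TOY_PREWRITE, TOY_POSTWRITE };

/* Toy mapping: the device only ever sees 'bounce', the driver only 'client'. */
struct toy_map {
	unsigned char client[TOY_LEN];
	unsigned char bounce[TOY_LEN];
	int pending_reads;	/* PREREADs not yet matched by a POSTREAD */
};

static void
toy_sync(struct toy_map *m, enum toy_sync_op op)
{
	switch (op) {
	case TOY_PREWRITE:
		/* Driver data must reach the bounce buffer before the device reads it. */
		memcpy(m->bounce, m->client, TOY_LEN);
		break;
	case TOY_PREREAD:
		m->pending_reads++;
		break;
	case TOY_POSTREAD:
		/* An unpaired POSTREAD is a driver bug in this model. */
		assert(m->pending_reads > 0);
		m->pending_reads--;
		/* Device data only becomes visible to the driver here. */
		memcpy(m->client, m->bounce, TOY_LEN);
		break;
	case TOY_POSTWRITE:
		break;
	}
}

/* Stand-in for a device DMA write; it lands in the bounce buffer. */
static void
toy_device_write(struct toy_map *m, unsigned char value)
{
	memset(m->bounce, value, TOY_LEN);
}
```

In this model a driver that issues TOY_PREREAD, lets the device write,
and then reads client[] without TOY_POSTREAD sees stale data.  Other
implementations may split the work between the two ops differently,
which is the point made above about not assuming which half does what.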