Date:      Tue, 10 Sep 2013 13:19:29 +0200
From:      Svatopluk Kraus <onwahe@gmail.com>
To:        freebsd-arm@freebsd.org
Subject:   Architecture vs. bus vs. device DMA cache coherency
Message-ID:  <CAFHCsPXe97B3YKURXZUqiNtWzopJZ2e00qaOBBU16nQYMM1pdg@mail.gmail.com>

Hi,

Michal Meloun and I have discussed the bus DMA framework in FreeBSD at
length at work, and these are our conclusions on implementing DMA cache
coherency for the ARM architecture. We offer them to the ARM community.

---------------------------------------------------
Architecture vs. bus vs. device DMA cache coherency
---------------------------------------------------

Even on DMA cache coherent architectures there can be not-coherent DMA
buses and/or devices. Thus, each bus and/or device should be described by
its own bus_dma_tag, and the tag should carry information about DMA cache
coherency.

A DMA device can be either cache coherent or not-coherent. Its coherency
often depends on the properties of the buses along its path to memory, but
not always. A DMA device or a bus can supply extra information outside its
standard bus path and so be cache coherent even on not-coherent buses. Thus
DMA device or bus coherency should be inherited from (defaulted to) the
parent bus, but a DMA device or bus can override it.

The information about DMA cache coherency should be carried by bus_dma_tag.
In a driver, each DMA device should have its own bus_dma_tag, which records
the type of the DMA device: cache coherent or not-coherent.

------------------------------------------
int bus_dma_tag_create(parent, flags, ...)
------------------------------------------

Used by DMA devices and buses.

The following flags, if set, mean that the bus or device is either
BUS_DMA_COHERENT    ... coherent or
BUS_DMA_NOTCOHERENT ... not coherent.

The coherency flag is either given explicitly or inherited from (defaulted
to) the parent. Although either BUS_DMA_COHERENT or BUS_DMA_NOTCOHERENT can
be given explicitly, only the BUS_DMA_COHERENT flag is stored in bus_dma_tag
internally.

Note that a bus can be coherent even if its parent bus is not, and vice
versa.

#define BUS_DMA_TAG_INHERITANCE_MASK \
    (BUS_DMA_COHERENT | BUS_DMA_COULD_BOUNCE)

tag->flags = parent->flags & BUS_DMA_TAG_INHERITANCE_MASK;
if (flags & BUS_DMA_COHERENT)
        tag->flags |= BUS_DMA_COHERENT;
else if (flags & BUS_DMA_NOTCOHERENT)
        tag->flags &= ~BUS_DMA_COHERENT;
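
The inheritance rule above can be tried out in isolation. What follows is
only a sketch under stated assumptions, not the kernel implementation: the
bit values are placeholders, struct sketch_dma_tag and sketch_tag_init()
are hypothetical names, and a NULL parent (a root tag) is assumed to
contribute no inherited flags.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder bit values for this sketch, not the kernel's definitions. */
#define BUS_DMA_COHERENT        0x01
#define BUS_DMA_NOTCOHERENT     0x02
#define BUS_DMA_COULD_BOUNCE    0x04

#define BUS_DMA_TAG_INHERITANCE_MASK \
    (BUS_DMA_COHERENT | BUS_DMA_COULD_BOUNCE)

/* Hypothetical stand-in for a tag; only the flags word is modelled. */
struct sketch_dma_tag {
        int flags;
};

/*
 * Fill in the flags of a new tag. A NULL parent (a root tag) is assumed
 * to contribute no inherited flags.
 */
void
sketch_tag_init(struct sketch_dma_tag *tag,
    const struct sketch_dma_tag *parent, int flags)
{
        assert(tag != NULL);

        /* Start from what the parent allows to be inherited. */
        tag->flags = (parent != NULL) ?
            (parent->flags & BUS_DMA_TAG_INHERITANCE_MASK) : 0;

        /* An explicit request overrides the inherited coherency. */
        if (flags & BUS_DMA_COHERENT)
                tag->flags |= BUS_DMA_COHERENT;
        else if (flags & BUS_DMA_NOTCOHERENT)
                tag->flags &= ~BUS_DMA_COHERENT;
}
```

With this sketch, a bus tag created with BUS_DMA_COHERENT under a
not-coherent root ends up coherent, and a device tag created under that
bus with no coherency flag inherits the coherent setting.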


In a real OS, cache coherency must always be ensured when DMA is used. If a
DMA device is not cache coherent, we must ensure coherency in software.
There are two basic approaches:
(1) to create some kind of synchronization list to keep the cache coherent,
(2) to make the memory un-cacheable so DMA cache coherency is not an issue.

Let's work with only these two approaches for now. In general, however,
across the world of various architectures and hardware, there are and could
be other ways to ensure cache coherency in software.

For performance reasons, or out of necessity (a hardware bug), we want to
choose how the memory (buffers) used by a DMA device will be accessed.

--------------------------------------
int bus_dmamap_create(tag, flags, ...)
--------------------------------------

Used by DMA devices for dedicated external buffer(s). It could be used for
internal buffers too; however, bus_dmamem_alloc() is the preferred method
for them.

The following flags, if set, mean that the dedicated buffers will be
accessed either
BUS_DMA_COHERENT    ... as coherent or
BUS_DMA_NOTCOHERENT ... as not-coherent.
Otherwise, the default behaviour is taken from bus_dma_tag.

--------------|-------------------------|
       \ flag |          |              |
   tag  \     | coherent | not-coherent |
--------------|----------|--------------|
   coherent   |    A1    |      A2      |
--------------|----------|--------------|
 not-coherent |    B1    |      B2      |
--------------|----------|--------------|

(A1) This is the default for coherent DMA devices. We need to do nothing.
(A2) This means that buffer coherency is not ensured on a coherent DMA
     device for some reason. We need to use a synchronization list.
(B1) This means that buffer coherency is ensured externally even on a
     not-coherent DMA device. We need to do nothing.
(B2) This is the default for not-coherent DMA devices. We need to use a
     synchronization list.

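/*
 * A2, B2: not-coherent access was requested explicitly.
 * Otherwise, with no explicit coherent request (B1), a not-coherent
 * tag defaults to a synchronization list.
 */
if ((flags & BUS_DMA_NOTCOHERENT) ||
    (((flags & BUS_DMA_COHERENT) == 0) &&
    ((tag->flags & BUS_DMA_COHERENT) == 0)))
        map->flags |= DMAMAP_SYNCLIST;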
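
To make the four table cases and the default easy to check one by one,
here is a small sketch derived from the table for bus_dmamap_create(). It
is an illustration under assumptions only: the bit values are
placeholders, and struct sketch_dma_tag, struct sketch_dmamap and
sketch_dmamap_init() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder bit values for this sketch, not the kernel's definitions. */
#define BUS_DMA_COHERENT        0x01
#define BUS_DMA_NOTCOHERENT     0x02
#define DMAMAP_SYNCLIST         0x01

/* Hypothetical stand-ins; only the flags words are modelled. */
struct sketch_dma_tag {
        int flags;
};

struct sketch_dmamap {
        int flags;
};

/* Decide whether a new map needs a synchronization list. */
void
sketch_dmamap_init(struct sketch_dmamap *map,
    const struct sketch_dma_tag *tag, int flags)
{
        bool coherent;

        assert(map != NULL && tag != NULL);

        if (flags & BUS_DMA_NOTCOHERENT)
                coherent = false;       /* cases A2 and B2 */
        else if (flags & BUS_DMA_COHERENT)
                coherent = true;        /* cases A1 and B1 */
        else                            /* no request: default from tag */
                coherent = (tag->flags & BUS_DMA_COHERENT) != 0;

        map->flags = coherent ? 0 : DMAMAP_SYNCLIST;
}
```

In this sketch only an explicit not-coherent request, or no request on a
not-coherent tag, results in DMAMAP_SYNCLIST being set.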

-------------------------------------
int bus_dmamem_alloc(tag, flags, ...)
-------------------------------------

Used by DMA devices for a specific internal buffer.

The following flags, if set, mean that the specific buffer must be accessed
either
BUS_DMA_COHERENT    ... as coherent or
BUS_DMA_NOTCOHERENT ... as not-coherent.
Otherwise, the default behaviour is taken from bus_dma_tag.

--------------|-------------------------|
       \ flag |          |              |
   tag  \     | coherent | not-coherent |
--------------|----------|--------------|
   coherent   |    A1    |      A2      |
--------------|----------|--------------|
 not-coherent |    B1    |      B2      |
--------------|----------|--------------|

(A1) This is the default for coherent DMA devices. The buffer is allocated
     with default attributes. We need to do nothing more.
(A2) This means that buffer coherency must be ensured another way. This is
     not standard, so the best way to ensure it is to allocate the buffer
     with uncacheable attributes. We need to do nothing more.
(B1) This means that buffer coherency must not be ensured by a
     synchronization list (the default). The buffer is allocated with
     uncacheable attributes. We need to do nothing more.
(B2) This is the default for not-coherent DMA devices. The buffer is
     allocated with default attributes. We need to use a synchronization
     list.

The flags are a request for allocation and must be fulfilled (mainly
because of the B1 case), or an error should be returned. A bus_dmamap is
created together with the buffer.

bus_dmamap_t map;

if (((flags & BUS_DMA_COHERENT) == 0) &&
    ((tag->flags & BUS_DMA_COHERENT) == 0))
        map->flags |= DMAMAP_SYNCLIST;
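
For bus_dmamem_alloc() the table decides two things at once: the memory
attributes of the buffer and whether a synchronization list is needed.
The sketch below spells that out. It is an illustration under assumptions
only: the bit values are placeholders, and the sketch_* types and
sketch_dmamem_plan() are hypothetical names, not an existing API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder bit values for this sketch, not the kernel's definitions. */
#define BUS_DMA_COHERENT        0x01
#define BUS_DMA_NOTCOHERENT     0x02

/* Hypothetical stand-in for a tag; only the flags word is modelled. */
struct sketch_dma_tag {
        int flags;
};

enum sketch_memattr {
        SKETCH_MEMATTR_DEFAULT,
        SKETCH_MEMATTR_UNCACHEABLE
};

/* What the allocation has to do for a given tag and request. */
struct sketch_alloc_plan {
        enum sketch_memattr attr;
        bool synclist;
};

struct sketch_alloc_plan
sketch_dmamem_plan(const struct sketch_dma_tag *tag, int flags)
{
        struct sketch_alloc_plan plan = { SKETCH_MEMATTR_DEFAULT, false };

        assert(tag != NULL);

        if (tag->flags & BUS_DMA_COHERENT) {
                /* A2: uncacheable buffer. A1: nothing more to do. */
                if (flags & BUS_DMA_NOTCOHERENT)
                        plan.attr = SKETCH_MEMATTR_UNCACHEABLE;
        } else {
                if (flags & BUS_DMA_COHERENT)   /* B1 */
                        plan.attr = SKETCH_MEMATTR_UNCACHEABLE;
                else                            /* B2 */
                        plan.synclist = true;
        }
        return (plan);
}
```

Only the B2 case ends up with a synchronization list here, which matches
the condition above; a real allocator would return an error if it cannot
provide the uncacheable mapping asked for in A2 or B1.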

-------------------------------------------------
int _bus_dmamap_load_buffer(..., map, flags, ...)
-------------------------------------------------

Used by DMA devices for a given buffer.

All needed information about coherency should already be provided by the
bus_dmamap. A synchronization list is created only for bus_dmamaps labeled
with the DMAMAP_SYNCLIST flag.
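
A rough sketch of that load-time behaviour follows. It is an assumption
for illustration, not the real _bus_dmamap_load_buffer(): the
synchronization list is modelled as a small fixed array, and
SKETCH_MAX_SYNC, struct sketch_sync_entry, struct sketch_dmamap and
sketch_load_buffer() are all hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DMAMAP_SYNCLIST         0x01    /* placeholder bit value */
#define SKETCH_MAX_SYNC         8       /* arbitrary sketch limit */

/* One buffer range that needs cache maintenance at sync time. */
struct sketch_sync_entry {
        uintptr_t vaddr;
        size_t len;
};

/* Hypothetical stand-in for a map with a fixed-size sync list. */
struct sketch_dmamap {
        int flags;
        int sync_count;
        struct sketch_sync_entry slist[SKETCH_MAX_SYNC];
};

/* Record the buffer in the sync list only if the map asks for one. */
int
sketch_load_buffer(struct sketch_dmamap *map, void *buf, size_t len)
{
        assert(map != NULL);

        if ((map->flags & DMAMAP_SYNCLIST) == 0)
                return (0);             /* coherent: nothing to record */
        if (map->sync_count == SKETCH_MAX_SYNC)
                return (ENOMEM);        /* sketch list is full */

        map->slist[map->sync_count].vaddr = (uintptr_t)buf;
        map->slist[map->sync_count].len = len;
        map->sync_count++;
        return (0);
}
```

A map without DMAMAP_SYNCLIST therefore always keeps an empty list, no
matter how many buffers are loaded into it.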

-----------------------
void _bus_dmamap_sync()
-----------------------

As we create a synchronization list only for bus_dmamaps with the
DMAMAP_SYNCLIST flag set, we don't need to check the flag here.
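
In other words, the sync routine can simply walk whatever list the map
has. The sketch below illustrates this under assumptions: the list is a
fixed array, the cache maintenance is passed in as a callback, and
sketch_count_op() is only a counting stand-in for a real cache operation.
All sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_SYNC         8       /* arbitrary sketch limit */

struct sketch_sync_entry {
        uintptr_t vaddr;
        size_t len;
};

/* Hypothetical stand-in for a map; an unflagged map has sync_count 0. */
struct sketch_dmamap {
        int sync_count;
        struct sketch_sync_entry slist[SKETCH_MAX_SYNC];
};

typedef void (*sketch_cache_op_t)(uintptr_t vaddr, size_t len);

/* Counting stand-in for a real cache write-back or invalidate. */
size_t sketch_synced_bytes;

void
sketch_count_op(uintptr_t vaddr, size_t len)
{
        (void)vaddr;
        sketch_synced_bytes += len;
}

/* Apply the cache operation to every recorded range; no flag check. */
void
sketch_dmamap_sync(const struct sketch_dmamap *map,
    sketch_cache_op_t cache_op)
{
        int i;

        assert(map != NULL && cache_op != NULL);

        for (i = 0; i < map->sync_count; i++)
                cache_op(map->slist[i].vaddr, map->slist[i].len);
}
```

For a map that never got a list, the loop body is never entered, so the
sync costs almost nothing on coherent maps.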

------------------------------------
Bounce pages vs. DMA cache coherency
------------------------------------
If a bus_dma_tag is labeled as coherent, we don't need to keep bounce
pages cache coherent in software. If bounce pages are allocated with
un-cacheable attributes, there is no DMA cache coherency issue at all. If
both cacheable and un-cacheable bounce pages are implemented, then the
bounce pages should be labeled too.
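
The bounce page rule can be written down as a single predicate. This is
only a sketch under assumptions: the bit values are placeholders, and
SKETCH_BPAGE_UNCACHEABLE (the per-page label), struct sketch_bounce_page
and sketch_bounce_needs_sync() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BUS_DMA_COHERENT                0x01    /* placeholder value */
#define SKETCH_BPAGE_UNCACHEABLE        0x01    /* hypothetical label */

/* Hypothetical stand-ins; only the flags words are modelled. */
struct sketch_dma_tag {
        int flags;
};

struct sketch_bounce_page {
        int flags;
};

/* Does this bounce page need software cache maintenance? */
bool
sketch_bounce_needs_sync(const struct sketch_dma_tag *tag,
    const struct sketch_bounce_page *bpage)
{
        assert(tag != NULL && bpage != NULL);

        if (tag->flags & BUS_DMA_COHERENT)
                return (false);         /* coherent tag: nothing to do */
        if (bpage->flags & SKETCH_BPAGE_UNCACHEABLE)
                return (false);         /* un-cacheable page: no issue */
        return (true);                  /* cacheable page, not-coherent tag */
}
```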

Svatopluk Kraus


