Date:      Wed, 19 Dec 2012 12:12:55 -0700
From:      Warner Losh <imp@bsdimp.com>
To:        Hans Petter Selasky <hselasky@c2i.net>
Cc:        Oleksandr Tymoshenko <gonzo@freebsd.org>, Andrew Turner <andrew@fubar.geek.nz>, freebsd-usb@freebsd.org
Subject:   Re: EHCI on armv6 with Write-Back caches
Message-ID:  <4A66C6C9-22EC-45AA-987D-49F958D7A8F9@bsdimp.com>
In-Reply-To: <201212190956.28609.hselasky@c2i.net>
References:  <20121218204931.5322922d@fubar.geek.nz> <201212182044.11326.hselasky@c2i.net> <1482FC19-720A-480A-BAD2-C8CD306E2E5F@bsdimp.com> <201212190956.28609.hselasky@c2i.net>


On Dec 19, 2012, at 1:56 AM, Hans Petter Selasky wrote:

> Hi again,
>
> Different vendors use different naming conventions about sync operations.

The different names are often for subtly different types of cache flushing.

> Maybe we should start defining some names and agree about that first?
>
> usb_pc_cpu_flush:
>
> This function is a system abstraction which is supposed to ensure that all CPU
> cached data for the given buffer is written to RAM before this function
> returns.

First off, why can't you just use the normal busdma interface?

Second, does this also invalidate the cache lines? There are many kinds of flushes as well, depending on the cache coherency model for MP.

> usb_pc_cpu_invalidate:
>
> This function is a system abstraction which is supposed to ensure that all CPU
> cached data is cleared for the given buffer.

So does this throw the data that's in the cache away? Or does it write back before discarding it?

> These functions have been carefully added in all the USB drivers using DMA.
>
> Atomicity:
>
> I understand that the ARM hardware is not always compatible with this approach.

Yes, since it fails to capture the subtlety of the ARM hardware, which the busdma abstraction does capture.

> 1) Flushing data to RAM is not a problem in any case? Do you agree?

As long as there's no DMA outstanding to the memory addressed by the cache line being flushed, yes.

> 2) Invalidating data is a problem, because invalidation can cause nearby data
> to be cleared as well. So basically for those systems which are not handling
> this, flushing data means a lock of all CPUs until the flush/invalidate
> sequence is complete? Any dispute about this?

Yes. You can generally only invalidate to the cache line level. That's why I was asking about the subtlety in what you mean by invalidate. ARM can and does flush data properly, but this might not match your usage.

You forgot

3) Touching data in the same cache line while the DMA is going on.

This is a complete no-no and must never be done. Cache lines must not be polluted while the DMA is happening.
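
To illustrate point 3, here is a toy user-space model of a single write-back cache line. It is only a sketch: the 32-byte line size and every `toy_*` name are assumptions made up for this example, not anything from the USB stack or from busdma. It shows that once the CPU dirties a line that a device is DMAing into, neither invalidating nor writing back the line gives a correct result.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_LINE_SIZE 32	/* assumed line size, for illustration only */

/* One line of RAM plus the CPU's cached copy of it (hypothetical model). */
struct toy_line {
	unsigned char ram[TOY_LINE_SIZE];	/* what the device sees */
	unsigned char cache[TOY_LINE_SIZE];	/* what the CPU sees when valid */
	bool valid;
	bool dirty;
};

static void
toy_init(struct toy_line *l)
{
	memset(l, 0, sizeof(*l));
}

/* Fill the cached copy from RAM on a miss. */
static void
toy_fill(struct toy_line *l)
{
	if (!l->valid) {
		memcpy(l->cache, l->ram, TOY_LINE_SIZE);
		l->valid = true;
	}
}

/* CPU store: lands in the cache only and marks the whole line dirty. */
static void
toy_cpu_write(struct toy_line *l, size_t off, unsigned char v)
{
	assert(off < TOY_LINE_SIZE);
	toy_fill(l);
	l->cache[off] = v;
	l->dirty = true;
}

/* CPU load: served from the cache, filling it first if needed. */
static unsigned char
toy_cpu_read(struct toy_line *l, size_t off)
{
	assert(off < TOY_LINE_SIZE);
	toy_fill(l);
	return (l->cache[off]);
}

/* Device DMA: bypasses the cache and writes RAM directly. */
static void
toy_dma_write(struct toy_line *l, size_t off, const void *src, size_t len)
{
	assert(off + len <= TOY_LINE_SIZE);
	memcpy(l->ram + off, src, len);
}

/* Write-back: copies the WHOLE line to RAM if it is dirty. */
static void
toy_writeback(struct toy_line *l)
{
	if (l->valid && l->dirty) {
		memcpy(l->ram, l->cache, TOY_LINE_SIZE);
		l->dirty = false;
	}
}

/* Invalidate: discards the WHOLE line without writing it back. */
static void
toy_invalidate(struct toy_line *l)
{
	l->valid = false;
	l->dirty = false;
}
```

In this model, if the CPU stores to offset 20 while a 13-byte DMA lands at offset 0, a plain invalidate afterwards exposes the DMA data but loses the CPU's store, while a write-back followed by an invalidate keeps the store but overwrites the DMA data with stale bytes. The only safe option is not to touch the line during the transfer.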

> If the CPU does not support certain features we cannot have an efficient
> system. It is like having a CPU which doesn't support switching off interrupt
> levels, like an 8-bit AVR.

Most RISC CPUs don't support a coherent cache, it is true. However, they do support efficient and succinct cache operations. The busdma abstraction handles this nicely for all other kinds of devices, so why is USB so different?

> Then no matter how you twist it, you cannot
> postpone an interrupt to a software thread. Same goes for DMA support. If
> their DMA engine doesn't support byte granularity and possibility to
> flush/invalidate on a per-byte basis, then implement a global system lock to
> flush/invalidate data like I suggest, if this is not doable in the hardware by
> the CPU instruction set.

The DMA engine supports it. You can do byte-aligned transfers all day long.

However, the cache does not. You must segregate data to get proper operations. There's nothing that you can do about that. You must not mix CPU and DMA accesses to memory within a cache line. Doing so will cause problems, as evidenced by the problems we are seeing.
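
One way to picture this segregation is a structure whose DMA buffer starts on its own cache line and is padded out to a whole number of lines. The following is a minimal C11 sketch, assuming a 64-byte line; `DMA_LINE_SIZE`, `struct toy_softc` and the helper are hypothetical names for illustration, and a real driver would normally get its alignment from the busdma tag rather than from compiler attributes.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

#define DMA_LINE_SIZE	64	/* assumption; real line sizes vary by CPU */

/* Round a byte count up to a whole number of cache lines. */
#define DMA_ROUNDUP(n) \
	(((n) + DMA_LINE_SIZE - 1) / DMA_LINE_SIZE * DMA_LINE_SIZE)

/* Hypothetical per-device state mixing CPU-only fields and a DMA buffer. */
struct toy_softc {
	int	state;		/* CPU-only bookkeeping */
	int	refcount;	/* CPU-only bookkeeping */

	/* The 13-byte payload gets whole cache lines to itself. */
	alignas(DMA_LINE_SIZE) unsigned char rxbuf[DMA_ROUNDUP(13)];

	/* The next CPU-only field is pushed onto a fresh line. */
	alignas(DMA_LINE_SIZE) int after;
};

/* Compile-time checks that no CPU-only field shares a line with rxbuf. */
static_assert(offsetof(struct toy_softc, rxbuf) % DMA_LINE_SIZE == 0,
    "rxbuf must start on a cache line boundary");
static_assert(sizeof(((struct toy_softc *)0)->rxbuf) % DMA_LINE_SIZE == 0,
    "rxbuf must cover whole cache lines");
static_assert(offsetof(struct toy_softc, after) >=
    offsetof(struct toy_softc, rxbuf) + DMA_LINE_SIZE,
    "fields after rxbuf must not share its cache lines");

/* True if [off, off + len) covers only whole cache lines. */
static int
dma_range_is_line_exclusive(size_t off, size_t len)
{
	return (off % DMA_LINE_SIZE == 0 && len % DMA_LINE_SIZE == 0);
}
```

The layout only helps if the structure itself is placed at a line-aligned address, and it costs padding for every small buffer, which is the trade-off behind the allocation-size concern raised in the thread.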

> The approach that I was recommended several years ago, is that I can pass a
> pointer to a buffer, which then can be transferred by the USB engine. This
> pointer can be any pointer except NULL.
>
> 3) As per my knowledge, using busdma to allocate a separate buffer for a
> 13-byte buffer, results in having a buffer of PAGE_SIZE bytes allocated.

Not with Ian's slab allocator...

> 4) BUSDMA experts: Please verify that the flags passed to bus_dmamap_sync()
> cause the exact behaviour as listed on top of this e-mail to occur in the
> following two functions in sys/dev/usb/usb_busdma.c
>
> void
> usb_pc_cpu_invalidate(struct usb_page_cache *pc)
> {
>        if (pc->page_offset_end == pc->page_offset_buf) {
>                /* nothing has been loaded into this page cache! */
>                return;
>        }
>
>        /*
>         * TODO: We currently do XXX_POSTREAD and XXX_PREREAD at the
>         * same time, but in the future we should try to isolate the
>         * different cases to optimise the code. --HPS
>         */
>        bus_dmamap_sync(pc->tag, pc->map, BUS_DMASYNC_POSTREAD);
>        bus_dmamap_sync(pc->tag, pc->map, BUS_DMASYNC_PREREAD);
> }

This is almost certainly wrong. POSTREAD should be called right after the read has completed. PREREAD should be called immediately before starting a DMA operation, after all adjacent memory accesses have stopped.

> void
> usb_pc_cpu_flush(struct usb_page_cache *pc)
> {
>        if (pc->page_offset_end == pc->page_offset_buf) {
>                /* nothing has been loaded into this page cache! */
>                return;
>        }
>        bus_dmamap_sync(pc->tag, pc->map, BUS_DMASYNC_PREWRITE);
> }

This is likely correct, but insufficient. There's no POSTWRITE sync that's done after all the data has been transferred.
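
The ordering described above can be sketched as a small user-space trace. This is not kernel code: the `toy_*` names and `TOY_*` events are stand-ins invented for the example, and each `toy_record()` call merely marks where the corresponding `bus_dmamap_sync()` operation or DMA step would sit in a real driver.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event names mirroring the four busdma sync points. */
enum toy_event {
	TOY_PREREAD,	/* before the device writes into memory */
	TOY_POSTREAD,	/* after that transfer has completed */
	TOY_PREWRITE,	/* after the CPU filled the buffer, before DMA */
	TOY_POSTWRITE,	/* after the device has finished reading it */
	TOY_DMA_START,
	TOY_DMA_DONE
};

#define TOY_TRACE_MAX	16

struct toy_trace {
	enum toy_event	ev[TOY_TRACE_MAX];
	size_t		n;
};

/* Append one event to the trace. */
static void
toy_record(struct toy_trace *t, enum toy_event e)
{
	assert(t->n < TOY_TRACE_MAX);
	t->ev[t->n++] = e;
}

/* Device-to-memory transfer: the two read syncs bracket the DMA. */
static void
toy_receive(struct toy_trace *t)
{
	toy_record(t, TOY_PREREAD);	/* CPU stops touching the lines here */
	toy_record(t, TOY_DMA_START);
	toy_record(t, TOY_DMA_DONE);
	toy_record(t, TOY_POSTREAD);	/* only now may the CPU read the data */
}

/* Memory-to-device transfer: the two write syncs bracket the DMA. */
static void
toy_transmit(struct toy_trace *t)
{
	toy_record(t, TOY_PREWRITE);	/* buffer contents pushed out to RAM */
	toy_record(t, TOY_DMA_START);
	toy_record(t, TOY_DMA_DONE);
	toy_record(t, TOY_POSTWRITE);	/* the sync missing from the flush path */
}
```

The point of the sketch is that each direction has a matched pre/post pair around the transfer, rather than POSTREAD and PREREAD issued back to back.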

Warner
