Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 29 Jun 2009 12:49:43 +0200
From:      Piotr =?utf-8?q?Zi=C4=99cik?= <kosmo@semihalf.com>
To:        Hans Petter Selasky <hselasky@c2i.net>
Cc:        freebsd-arm@freebsd.org, freebsd-usb@freebsd.org, thompsa@freebsd.org
Subject:   Re: CPU Cache and busdma usage in USB
Message-ID:  <200906291249.43983.kosmo@semihalf.com>
In-Reply-To: <200906291155.13941.hselasky@c2i.net>
References:  <200906231035.43096.kosmo@semihalf.com> <200906291111.20725.kosmo@semihalf.com> <200906291155.13941.hselasky@c2i.net>

next in thread | previous in thread | raw e-mail | index | archive | help
Monday 29 June 2009 11:55:11 Hans Petter Selasky napisa=C5=82(a):
> Hi Piotr and Rafal,
>
> I'm not saying everything is OK, I just don't agree that the problem is in
> USB.
>
> If you look at:
>
> http://fxr.watson.org/fxr/source/i386/i386/busdma_machdep.c#L931
>
> Then you see the XXX_PREXXX means copy from client buffer
> (bpage->datavaddr) to real DMA buffer (bpage->vaddr). That means flush
> memory from cache to RAM. You want to change it into a XXX_POSTXXX (???) =
in
> the USB code, and that won't work for x86 and amd64 ...
>
> Then you also see that XXX_POSTXXX is doing the opposite, cleaning the
> cache and copying from RAM into the "cache" buffer.
>
> Isn't this correct flushing the cache to RAM on ARM?
>

It is correct if you use busdma in correct way:
	[... prepare data to transfer ...]
	bus_dmamap_sync(..., PREREAD | PREWRITE);
	[... do transfer ...]
	bus_dmamap_sync(..., POSTREAD | POSTWRITE);
	[... check results ...]

> > I do not see direct coherency between flags passed to bus_dma and cache
> > operations, which could be source of problem.
>
> Can you explain a little bit more what you see is not coherent?

I thought about direct relation between calling bus_dmamap_sync()
with given flag and cache operation. usb_pc_cpu_invalidate() not always
is doing invalidate and usb_pc_cpu_flush() not always is doing flush.

> > In my oppinion usb_pc_cpu_invalidate() used here suggests that it is
> > doing cache invalidation not "Perform any synchronization required after
> > an update of host memory by the device and prior to CPU access to host
> > memory".
>
> The invalidate means: Clean the cache, so that the succeeding read fetches
> its value from RAM.

But usb_pc_cpu_invalidate(), which should do cache invalidate, doing
NOP or other operations depending on architecture bus_dma implementation.=20
Therefore these functions should not be used for enforcing cache operations.

> > As this function is implemented as bus_dmamap_sync() all busdma rules
> > should be applied:
> > <cite>
> > If read and write operations are not preceded and followed by the
> > appropriate synchronization operations, behavior is undefined.
> > </cite>
> >
> > In code shown above (and many more places in USB stack) there is no
> > following synchronization operation, which also could be source of
> > problem.
>
> All cases where transfer descriptors are updated are wrapped with bus-sync
> operations, through the "usb_pc_cpu_xxx()" functions. Else mutexes are
> used. Please give an example of such a place where improper synchronisati=
on
> happens.

Look into ehci_check_transfer() function=20
(http://fxr.watson.org/fxr/source/dev/usb/controller/ehci.c#L1294)=20

usb_pc_cpu_invalidate() [bus_dmamap_sync()] is not used in this function=20
correcly. It is not paired with usb_pc_cpu_flush() [opposite=20
bus_dmamap_sync()] as busdma requires (see part of manpage cited above).
The same problem is in part of code shown in previous mail.

If usb_pc_cpu_invalidate()/usb_pc_cpu_flush() functions had been implemente=
d=20
without using busdma, for example as cpu_*() functions, ehci_check_transfer=
()=20
would have been 100% correct. In current code busdma requirements are simpl=
y=20
not met.

> > My major question here is why bus_dma is used for flushing and
> > invalidation CPU caches instead of cpu_*() functions wich was written f=
or
> > this purpuose.
>
> Because that is what other drivers are doing. I think using cpu_*() is mo=
re
> alike Linux, and also, if the memory is on a bounce page, cpu_*() will
> yield the wrong result.

Ok. So if you use bus_dma(), please use it in correct way.

=2D-=20
Best Regards.
Piotr Ziecik



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200906291249.43983.kosmo>