Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 29 Jun 2009 11:55:11 +0200
From:      Hans Petter Selasky <hselasky@c2i.net>
To:        Piotr =?utf-8?q?Zi=C4=99cik?= <kosmo@semihalf.com>
Cc:        freebsd-arm@freebsd.org, freebsd-usb@freebsd.org, thompsa@freebsd.org
Subject:   Re: CPU Cache and busdma usage in USB
Message-ID:  <200906291155.13941.hselasky@c2i.net>
In-Reply-To: <200906291111.20725.kosmo@semihalf.com>
References:  <200906231035.43096.kosmo@semihalf.com> <200906281154.43392.hselasky@c2i.net> <200906291111.20725.kosmo@semihalf.com>

next in thread | previous in thread | raw e-mail | index | archive | help
Hi Piotr and Rafal,


On Monday 29 June 2009 11:11:20 Piotr Zi=C4=99cik wrote:
> Sunday 28 June 2009 11:54:40 Hans Petter Selasky napisa=C5=82(a):
> > Hi Piotr and Rafal,
> >
> > Your patch is not fully correct. It will break support for x86 and more
> > when bounce pages are uses.
> >
> > Let's get the definitions right:
> >
> > man busdma
> > (...)
> >
> > My view:
> >
> > XXX_PREXXX functions should be used prior to read/write device access.
> >
> > In other words, PRE has to be a flush operation.
> >
> > XXX_POSTXXX functions should be used after read/write device access.
> >
> > In other words, POST has to be an invalidate operation.
> >
> > Reading:
> >
> > src/sys/arm/arm/busdma_machdep.c
> >
> > I find bus_dmamap_sync_buf() to be coherent with this view.
>
> If everything is OK, then why USB does not work on ARM and MIPS ?

I'm not saying everything is OK, I just don't agree that the problem is in=
=20
USB.

If you look at:

http://fxr.watson.org/fxr/source/i386/i386/busdma_machdep.c#L931

Then you see the XXX_PREXXX means copy from client buffer (bpage->datavaddr=
)=20
to real DMA buffer (bpage->vaddr). That means flush memory from cache to RA=
M.=20
You want to change it into a XXX_POSTXXX (???) in the USB code, and that wo=
n't=20
work for x86 and amd64 ...

Then you also see that XXX_POSTXXX is doing the opposite, cleaning the cach=
e=20
and copying from RAM into the "cache" buffer.

> Let's look into busdma for these platforms:


>
> usb_pc_cpu_invalidate(): [BUS_DMASYNC_POSTREAD | BUS_DMASYNC_POSTWRITE]
> 	i386:		NOP
> 	arm:		Invalidate + hacks for VIPT cache.
> 	mips:		NOP

Isn't this correct cleaning the cache on ARM?

>
> usb_pc_cpu_flush(): [BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE]
> 	i386:		NOP
> 	arm:		Writeback invalidate
> 	mips:		Writeback invalidate.

Isn't this correct flushing the cache to RAM on ARM?

>
> I do not see direct coherency between flags passed to bus_dma and cache
> operations, which could be source of problem.

Can you explain a little bit more what you see is not coherent?

>
> Let's also look at usb_pc_cpu_invalidate() usage in
> sys/dev/usb/controller/ehci.c:
> <cite>
> while (1) {
> 		usb_pc_cpu_invalidate(td->page_cache);
>                 status =3D hc32toh(sc, td->qtd_status);
>
> 		(...)
>
> 		if (status & EHCI_QTD_HALTED) {
> 			break;
> 		}
>
> 		(...)
>
> 		td =3D td->obj_next;
> </cite>
>
> In my oppinion usb_pc_cpu_invalidate() used here suggests that it is doing
> cache invalidation not "Perform any synchronization required after
> an update of host memory by the device and prior to CPU access to host
> memory".

The invalidate means: Clean the cache, so that the succeeding read fetches =
its=20
value from RAM.

>
> As this function is implemented as bus_dmamap_sync() all busdma rules
> should be applied:
> <cite>
> If read and write operations are not preceded and followed by the
> appropriate synchronization operations, behavior is undefined.
> </cite>
>
> In code shown above (and many more places in USB stack) there is no
> following synchronization operation, which also could be source of proble=
m.

All cases where transfer descriptors are updated are wrapped with bus-sync=
=20
operations, through the "usb_pc_cpu_xxx()" functions. Else mutexes are used=
=2E=20
Please give an example of such a place where improper synchronisation happe=
ns.

>
> My major question here is why bus_dma is used for flushing and invalidati=
on
> CPU caches instead of cpu_*() functions wich was written for this purpuos=
e.

Because that is what other drivers are doing. I think using cpu_*() is more=
=20
alike Linux, and also, if the memory is on a bounce page, cpu_*() will yiel=
d=20
the wrong result.

> > Can you check if the COHERENT bit is set for your allocation?
> >
> >         if (map->flags & DMAMAP_COHERENT)
> >                 return;
>
> No. This bit is not set.
>
> > Summed up:
> >
> > The existing code is doing correct. What is known is a problem with the
> > memory mapping, so that the same memory page can get mapped with
> > different attributes, which makes the problem appear.
>

=2D-HPS




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200906291155.13941.hselasky>