Date: Mon, 29 Jun 2009 11:55:11 +0200 From: Hans Petter Selasky <hselasky@c2i.net> To: Piotr =?utf-8?q?Zi=C4=99cik?= <kosmo@semihalf.com> Cc: freebsd-arm@freebsd.org, freebsd-usb@freebsd.org, thompsa@freebsd.org Subject: Re: CPU Cache and busdma usage in USB Message-ID: <200906291155.13941.hselasky@c2i.net> In-Reply-To: <200906291111.20725.kosmo@semihalf.com> References: <200906231035.43096.kosmo@semihalf.com> <200906281154.43392.hselasky@c2i.net> <200906291111.20725.kosmo@semihalf.com>
next in thread | previous in thread | raw e-mail | index | archive | help
Hi Piotr and Rafal, On Monday 29 June 2009 11:11:20 Piotr Zi=C4=99cik wrote: > Sunday 28 June 2009 11:54:40 Hans Petter Selasky napisa=C5=82(a): > > Hi Piotr and Rafal, > > > > Your patch is not fully correct. It will break support for x86 and more > > when bounce pages are uses. > > > > Let's get the definitions right: > > > > man busdma > > (...) > > > > My view: > > > > XXX_PREXXX functions should be used prior to read/write device access. > > > > In other words, PRE has to be a flush operation. > > > > XXX_POSTXXX functions should be used after read/write device access. > > > > In other words, POST has to be an invalidate operation. > > > > Reading: > > > > src/sys/arm/arm/busdma_machdep.c > > > > I find bus_dmamap_sync_buf() to be coherent with this view. > > If everything is OK, then why USB does not work on ARM and MIPS ? I'm not saying everything is OK, I just don't agree that the problem is in= =20 USB. If you look at: http://fxr.watson.org/fxr/source/i386/i386/busdma_machdep.c#L931 Then you see the XXX_PREXXX means copy from client buffer (bpage->datavaddr= )=20 to real DMA buffer (bpage->vaddr). That means flush memory from cache to RA= M.=20 You want to change it into a XXX_POSTXXX (???) in the USB code, and that wo= n't=20 work for x86 and amd64 ... Then you also see that XXX_POSTXXX is doing the opposite, cleaning the cach= e=20 and copying from RAM into the "cache" buffer. > Let's look into busdma for these platforms: > > usb_pc_cpu_invalidate(): [BUS_DMASYNC_POSTREAD | BUS_DMASYNC_POSTWRITE] > i386: NOP > arm: Invalidate + hacks for VIPT cache. > mips: NOP Isn't this correct cleaning the cache on ARM? > > usb_pc_cpu_flush(): [BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE] > i386: NOP > arm: Writeback invalidate > mips: Writeback invalidate. Isn't this correct flushing the cache to RAM on ARM? > > I do not see direct coherency between flags passed to bus_dma and cache > operations, which could be source of problem. Can you explain a little bit more what you see is not coherent? > > Let's also look at usb_pc_cpu_invalidate() usage in > sys/dev/usb/controller/ehci.c: > <cite> > while (1) { > usb_pc_cpu_invalidate(td->page_cache); > status =3D hc32toh(sc, td->qtd_status); > > (...) > > if (status & EHCI_QTD_HALTED) { > break; > } > > (...) > > td =3D td->obj_next; > </cite> > > In my oppinion usb_pc_cpu_invalidate() used here suggests that it is doing > cache invalidation not "Perform any synchronization required after > an update of host memory by the device and prior to CPU access to host > memory". The invalidate means: Clean the cache, so that the succeeding read fetches = its=20 value from RAM. > > As this function is implemented as bus_dmamap_sync() all busdma rules > should be applied: > <cite> > If read and write operations are not preceded and followed by the > appropriate synchronization operations, behavior is undefined. > </cite> > > In code shown above (and many more places in USB stack) there is no > following synchronization operation, which also could be source of proble= m. All cases where transfer descriptors are updated are wrapped with bus-sync= =20 operations, through the "usb_pc_cpu_xxx()" functions. Else mutexes are used= =2E=20 Please give an example of such a place where improper synchronisation happe= ns. > > My major question here is why bus_dma is used for flushing and invalidati= on > CPU caches instead of cpu_*() functions wich was written for this purpuos= e. Because that is what other drivers are doing. I think using cpu_*() is more= =20 alike Linux, and also, if the memory is on a bounce page, cpu_*() will yiel= d=20 the wrong result. > > Can you check if the COHERENT bit is set for your allocation? > > > > if (map->flags & DMAMAP_COHERENT) > > return; > > No. This bit is not set. > > > Summed up: > > > > The existing code is doing correct. What is known is a problem with the > > memory mapping, so that the same memory page can get mapped with > > different attributes, which makes the problem appear. > =2D-HPS
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200906291155.13941.hselasky>