Date: Sun, 5 Jul 2009 12:36:21 +0200 From: Hans Petter Selasky <hselasky@c2i.net> To: Piotr =?iso-8859-2?q?Zi=EAcik?= <kosmo@semihalf.com> Cc: Rafal Jaworowski <raj@semihalf.com>, freebsd-arm@freebsd.org, thompsa@freebsd.org, freebsd-usb@freebsd.org Subject: Re: CPU Cache and busdma usage in USB Message-ID: <200907051236.22783.hselasky@c2i.net> In-Reply-To: <200906301128.26046.hselasky@c2i.net> References: <200906231035.43096.kosmo@semihalf.com> <200906301037.49367.kosmo@semihalf.com> <200906301128.26046.hselasky@c2i.net>
next in thread | previous in thread | raw e-mail | index | archive | help
On Tuesday 30 June 2009 11:28:23 Hans Petter Selasky wrote: > On Tuesday 30 June 2009 10:37:48 Piotr Zi=EAcik wrote: > > Monday 29 June 2009 15:16:56 Hans Petter Selasky napisa=B3(a): > > > You want to change the flags passed to bus_dmamap_sync() so that the > > > flush/invalidate mapping gets right. I understand why your patch makes > > > it work. That's not the problem. > > > > > > In "src/sys/arm/arm/busdma_machdep.c" there is a function called > > > "_bus_dmamap_sync_bp()". If you look at that function you see that it > > > only triggers on the "BUS_DMASYNC_PREWRITE" and "BUS_DMASYNC_POSTREAD" > > > flags. After your patching only the PREXXXX flags are used, so if bou= ce > > > pages are used on ARM and x86 and amd64 +++, then only > > > BUS_DMASYNC_PREWRITE will do anything. This indicates that your patch > > > is not fully correct. > > Hi, > > > That is true. I have missed "bounce page" case. I can change flags pass= ed > > to bus_dmamap_sync() to fix this on ARM, but this will break MIPS. > > Right, so there is inconsistency among the platforms in how the BUSDMA is > implemented, which should be fixed. > > > This clearly shows the problem - using side effect of busdma to manage > > CPU cache. Current USB implementation relies of this side effect assumi= ng > > that bus_dmamap_sync() with given flags will do cpu cache > > flush/invalidation. This is not true even on i386 ! > > i386 handles this differently. > > > This thread is not about my patch. It is only a fast and dirty hack > > making USB working on platforms without hardware cache coherency > > and showing the problem. I see two proper solutions: > > Ok. > > > 1. Implement usb_pc_cpu_invalidate() / usb_pc_cpu_flush() with > > cpu_*() functions realizing requested operation. > > > > 2. Using busdma in proper way: > > [... prepare data to transfer ...] > > bus_dmamap_sync(..., PREREAD | PREWRITE); > > [... do transfer ...] > > bus_dmamap_sync(..., POSTREAD | POSTWRITE); > > [... check results ...] > > as manpage says: > > <cite> > > If read and write operations are not preceded and followed by the > > appropriate synchronization operations, behavior is undefined. > > </cite> > > Requirement cited above is currently not met in USB stack. Funtion > > shown in http://fxr.watson.org/fxr/source/dev/usb/controller/ehci.c#L12= 94 > > is just an example of improper usage of bus_dmamap_sync(), which is > > hidden under usb_pc_cpu_invalidate(). > > I'm not violating any rules in busdma. The busdma manual is silent about > doing multiple sync operations of the same kind in a row. If the sheeva > hardware you are using works differently, you have to fix it. > > > This thread started from my question about general usage of > > usb_pc_cpu_*() functions. So I am asking once again - why these functio= ns > > relies on side effect of busdma instead of simply doing cache > > flush/invalidation through cpu_*() functions ? > > Because busdma is the interface for doing these kinds of operations and n= ot > cpu_xxx(). The DMA memory is allocated using busdma and loaded using > busdma, and cpu_xxx() is not a part of the busdma interface. > > > > Grepping through the source code for ARM, I found a line like this: > > > (...) > > > You see that it only performs purge and no prior flush. This is what > > > needs fixing! If semihalf could go through the "arm/arm/cpufunc.c" fi= le > > > and fix those flush and invalidate functions then many people would > > > become happy! > > > > My developement platform is sheeva. CPU functions for this platform are > > implemented correctly. > > > > > Again, it is not a problem in USB firstly, it is a problem in > > > "arm/xxx". > > > > You are suggesting that the problem is in ARM busdma (and in MIPS busdm= a, > > as on this platform USB also does not work). Could you give me example = of > > platform _without_ hardware coherency where new USB stack simply works ? > > If I find time later today I will fix the cache sync operations for > AT91RM9200 which has a built in OHCI, which has been working fine with my > new USB stack. > > Can you please add some prints to the "bus_dmamap_sync_buf()" code on ARM, > and verify which cases the CPU is entering, without your patch and with > your patch. From my analysis your patch will just change the execution > path: > > With your patch it calls: > > cpu_dcache_wb_range((vm_offset_t)buf, len); > cpu_l2cache_wb_range((vm_offset_t)buf, len); > > Without your patch it calls: > > cpu_dcache_wbinv_range((vm_offset_t)buf, len); > cpu_l2cache_wbinv_range((vm_offset_t)buf, len); > > In my view these execution paths are equivalent with regard to clean or > cache flush. They should both perform a cache flush, which is what the USB > code wants to do, but obviously the latter case is not implemented > correctly, most likely because it doesn't flush the buffer like expected! > > Try adding: > > cpu_dcache_wb_range((vm_offset_t)buf, len); > cpu_l2cache_wb_range((vm_offset_t)buf, len); > > Before: > > cpu_dcache_wbinv_range((vm_offset_t)buf, len); > cpu_l2cache_wbinv_range((vm_offset_t)buf, len); > > In: sys/arm/arm/busdma_machdep.c > Hi, I did not find time yet to test on my AT91RM9200 board. I hope to do a test= =20 this week. Did you at semihalf find anything? =2D-HPS
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200907051236.22783.hselasky>