Date: Mon, 29 Jun 2009 15:16:56 +0200 From: Hans Petter Selasky <hselasky@c2i.net> To: Piotr =?iso-8859-2?q?Zi=EAcik?= <kosmo@semihalf.com>, freebsd-arm@freebsd.org Cc: Rafal Jaworowski <raj@semihalf.com>, thompsa@freebsd.org, freebsd-usb@freebsd.org Subject: Re: CPU Cache and busdma usage in USB Message-ID: <200906291516.58725.hselasky@c2i.net> In-Reply-To: <200906231035.43096.kosmo@semihalf.com> References: <200906231035.43096.kosmo@semihalf.com>
next in thread | previous in thread | raw e-mail | index | archive | help
On Tuesday 23 June 2009 10:35:42 Piotr Zi=EAcik wrote: > --- a/sys/dev/usb/usb_busdma.c > +++ b/sys/dev/usb/usb_busdma.c > @@ -658,8 +658,7 @@ usb_pc_cpu_invalidate(struct usb_page_cache *pc) > /* nothing has been loaded into this page cache! */ > return; > } > - bus_dmamap_sync(pc->tag, pc->map, > - BUS_DMASYNC_POSTWRITE | BUS_DMASYNC_POSTREAD); > + bus_dmamap_sync(pc->tag, pc->map, BUS_DMASYNC_PREREAD); > } > > /*----------------------------------------------------------------------= =2D- >* @@ -672,8 +671,7 @@ usb_pc_cpu_flush(struct usb_page_cache *pc) > /* nothing has been loaded into this page cache! */ > return; > } > - bus_dmamap_sync(pc->tag, pc->map, > - BUS_DMASYNC_PREWRITE | BUS_DMASYNC_PREREAD); > + bus_dmamap_sync(pc->tag, pc->map, BUS_DMASYNC_PREWRITE); > } Hi, Let's restart the discussion with the initial patch. You want to change the flags passed to bus_dmamap_sync() so that the=20 flush/invalidate mapping gets right. I understand why your patch makes it=20 work. That's not the problem. In "src/sys/arm/arm/busdma_machdep.c" there is a function called=20 "_bus_dmamap_sync_bp()". If you look at that function you see that it only= =20 triggers on the "BUS_DMASYNC_PREWRITE" and "BUS_DMASYNC_POSTREAD" flags. Af= ter=20 your patching only the PREXXXX flags are used, so if bouce pages are used o= n=20 ARM and x86 and amd64 +++, then only BUS_DMASYNC_PREWRITE will do anything.= =20 This indicates that your patch is not fully correct. Grepping through the source code for ARM, I found a line like this: /*XXX*/ arm9_dcache_wbinv_range, /* dcache_inv_range */ which is not correct. If we are only invalidating, then it is not correct t= o=20 do a flush first. Summed up: static void bus_dmamap_sync_buf(void *buf, int len, bus_dmasync_op_t op) { char _tmp_cl[arm_dcache_align], _tmp_clend[arm_dcache_align]; if ((op & BUS_DMASYNC_PREWRITE) && !(op & BUS_DMASYNC_PREREAD)) { cpu_dcache_wb_range((vm_offset_t)buf, len); cpu_l2cache_wb_range((vm_offset_t)buf, len); } if (op & BUS_DMASYNC_PREREAD) { if (!(op & BUS_DMASYNC_PREWRITE) && ((((vm_offset_t)(buf) | len) & arm_dcache_align_mask) = =3D=3D=20 0)) { cpu_dcache_inv_range((vm_offset_t)buf, len); cpu_l2cache_inv_range((vm_offset_t)buf, len); } else { Because the USB code specifies both PREREAD and PREWRITE we end up in the=20 following case, which is not implemented correctly. The function name=20 indicates write back first, then invalidate, but when looking at the=20 implementation: <example> arm8_cache_purgeID, /* idcache_wbinv_all */ (void *)arm8_cache_purgeID, /* idcache_wbinv_range */ </example> You see that it only performs purge and no prior flush. This is what needs= =20 fixing! If semihalf could go through the "arm/arm/cpufunc.c" file and fix=20 those flush and invalidate functions then many people would become happy! Again, it is not a problem in USB firstly, it is a problem in "arm/xxx". cpu_dcache_wbinv_range((vm_offset_t)buf, len); cpu_l2cache_wbinv_range((vm_offset_t)buf, len); } } if (op & BUS_DMASYNC_POSTREAD) { if ((vm_offset_t)buf & arm_dcache_align_mask) { memcpy(_tmp_cl, (void *)((vm_offset_t)buf & ~ arm_dcache_align_mask), (vm_offset_t)buf & arm_dcache_align_mask); } if (((vm_offset_t)buf + len) & arm_dcache_align_mask) { memcpy(_tmp_clend, (void *)((vm_offset_t)buf + len), arm_dcache_align - (((vm_offset_t)(buf) + len) & arm_dcache_align_mask)); } cpu_dcache_inv_range((vm_offset_t)buf, len); cpu_l2cache_inv_range((vm_offset_t)buf, len); if ((vm_offset_t)buf & arm_dcache_align_mask) memcpy((void *)((vm_offset_t)buf & ~arm_dcache_align_mask), _tmp_cl,=20 (vm_offset_t)buf & arm_dcache_align_mask); if (((vm_offset_t)buf + len) & arm_dcache_align_mask) memcpy((void *)((vm_offset_t)buf + len), _tmp_clend, arm_dcache_align - (((vm_offset_t)(buf) + len) & arm_dcache_align_mask)); } } =2D-HPS
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200906291516.58725.hselasky>