Date: Wed, 08 Jan 2014 10:50:07 -0500
From: Nathan Whitehorn <nwhitehorn@freebsd.org>
To: Ian Lepore <ian@FreeBSD.org>
Cc: svn-src-head@FreeBSD.org, svn-src-all@FreeBSD.org, src-committers@FreeBSD.org
Subject: Re: svn commit: r260440 - head/sys/arm/conf
Message-ID: <52CD73AF.1030205@freebsd.org>
In-Reply-To: <1389194394.1158.362.camel@revolution.hippie.lan>
References: <201401080340.s083eIDG054652@svn.freebsd.org> <52CCD1DA.7010008@freebsd.org> <1389194394.1158.362.camel@revolution.hippie.lan>

On 01/08/14 10:19, Ian Lepore wrote:
> On Tue, 2014-01-07 at 23:19 -0500, Nathan Whitehorn wrote:
>> On 01/07/14 22:40, Ian Lepore wrote:
>>> Author: ian
>>> Date: Wed Jan 8 03:40:18 2014
>>> New Revision: 260440
>>> URL: http://svnweb.freebsd.org/changeset/base/260440
>>>
>>> Log:
>>>   Add option USB_HOST_ALIGN to configs that contain 'device usb'. Setting
>>>   this to the cache line size is required to avoid data corruption on armv4
>>>   and armv5, and improves performance on armv6, in both cases by avoiding
>>>   partial cacheline flushes for USB IO.
>>>
>>>   All these configs already exist in 10-stable. A few that don't (and
>>>   thus can't be MFC'd yet) will be committed separately.
>>>
>> There has to be -- and I do not mean this as a criticism of your patch
>> -- a better solution to this problem than USB_HOST_ALIGN. Isn't busdma
>> supposed to handle this kind of thing? Why is USB different?
>> -Nathan
>>
> USB is different because it doesn't follow the busdma rules. It
> allocates one large buffer, then sub-divides it internally into bits
> that are used for DMA IO and adjacent bits that are accessed by the cpu
> concurrently with the DMA. If it doesn't do that subdividing with an
> awareness of the cache line boundaries, it ends up with concurrent CPU
> and DMA access to data in the same cache line, and there's no way a
> software-assisted cache coherency scheme can reliably do busdma sync ops
> that don't corrupt either the CPU data or the DMA data.
>
> On armv6 we now automatically bounce IO that's not sized and aligned on
> cache line boundaries. The overhead for doing so is non-trivial, doubly
> so in the case of USB, because it's the only consumer of busdma in the
> system that requires that the offset-within-page for a bounced IO be the
> same as the offset in the original page (so a pool of small bounce
> buffers for small unaligned IOs is not an option, it must allocate full
> bounce pages for every IO).
>
> It used to be (on armv4) that when you used the busdma alloc functions
> to allocate small DMA buffers (a few bytes) the implementation allocated
> entire pages, which is pretty inefficient and can add up to a lot of
> allocation overhead. That was cited as a reason not to change USB's
> "allocate big then subdivide" scheme. I wrote new busdma allocators
> that use UMA pools to efficiently handle small aligned buffers of both
> normal and uncachable (BUSDMA_COHERENT) memory, so that's not a
> roadblock anymore. (Arm uses the new allocator, mips never got
> converted.)
>
> So, since we keep getting reports on arm@ of data corruption that shows
> up as 32-byte chunks of bad data, and it costs real time and resources
> to try to debug each case, I figured we should just go with the fix that
> nobody likes but it actually works.
>
> -- Ian
>
>

Thanks for the explanation, the debugging, and the fix. This seems like
a straightforward bug in the USB stack. Can it be fixed, or are there
architectural reasons why it is the way it is?
-Nathan
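
For readers following the thread, here is a minimal sketch of the "allocate big then subdivide" problem Ian describes. It is not the FreeBSD USB code: the names carve_alloc, share_cache_line and struct carve are hypothetical, the 32-byte CACHE_LINE_SIZE is an assumption taken from the "32-byte chunks of bad data" mentioned above, and the big buffer itself is assumed to start on a cache line boundary. It only illustrates why sub-buffers have to be both aligned and padded to the cache line size so that a DMA region and a CPU-accessed region never land in the same line.

```c
#include <stddef.h>

/* Assumed cache line size; the thread mentions 32-byte corruption chunks. */
#define CACHE_LINE_SIZE 32u

/* Round n up to the next multiple of align (align must be a power of two). */
static size_t
round_up(size_t n, size_t align)
{
	return (n + align - 1) & ~(align - 1);
}

/* Hypothetical carve-out state for one large, line-aligned buffer. */
struct carve {
	size_t offset;	/* next free offset within the big buffer */
	size_t align;	/* 1 = packed; CACHE_LINE_SIZE = line-aligned */
};

/*
 * Hand out the next sub-buffer of len bytes and return its offset.
 * Both the start and the reserved size are rounded to c->align, so with
 * align == CACHE_LINE_SIZE each sub-buffer owns whole cache lines.
 */
static size_t
carve_alloc(struct carve *c, size_t len)
{
	size_t start = round_up(c->offset, c->align);

	c->offset = start + round_up(len, c->align);
	return start;
}

/*
 * Return nonzero if [a, a+alen) and [b, b+blen) touch a common cache
 * line. Both lengths must be greater than zero.
 */
static int
share_cache_line(size_t a, size_t alen, size_t b, size_t blen)
{
	size_t a_first = a / CACHE_LINE_SIZE;
	size_t a_last = (a + alen - 1) / CACHE_LINE_SIZE;
	size_t b_first = b / CACHE_LINE_SIZE;
	size_t b_last = (b + blen - 1) / CACHE_LINE_SIZE;

	return a_first <= b_last && b_first <= a_last;
}
```

With align set to 1, a 20-byte DMA region followed by an 8-byte CPU-owned region both fall in cache line 0, which is the case where a software cache sync cannot avoid clobbering one side or the other. With align set to CACHE_LINE_SIZE, the second region starts at offset 32 and the two no longer share a line, at the cost of some padding.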