Date: Sun, 26 Aug 2012 17:13:31 -0600
From: Warner Losh <imp@bsdimp.com>
To: Ian Lepore <freebsd@damnhippie.dyndns.org>
Cc: freebsd-arm@freebsd.org, freebsd-arch@freebsd.org, Mark Tinguely <marktinguely@gmail.com>, freebsd-mips@freebsd.org, Hans Petter Selasky <hans.petter.selasky@bitfrost.no>
Subject: Re: Partial cacheline flush problems on ARM and MIPS
Message-ID: <10307B47-13F3-45C0-87F7-66FD3ACA3F86@bsdimp.com>
In-Reply-To: <1346005507.1140.69.camel@revolution.hippie.lan>
References: <1345757300.27688.535.camel@revolution.hippie.lan>
 <3A08EB08-2BBF-4B0F-97F2-A3264754C4B7@bsdimp.com>
 <1345763393.27688.578.camel@revolution.hippie.lan>
 <FD8DC82C-AD3B-4EBC-A625-62A37B9ECBF1@bsdimp.com>
 <1345765503.27688.602.camel@revolution.hippie.lan>
 <CAJ-VmonOwgR7TNuYGtTOhAbgz-opti_MRJgc8G%2BB9xB3NvPFJQ@mail.gmail.com>
 <1345766109.27688.606.camel@revolution.hippie.lan>
 <CAJ-VmomFhqV5rTDf-kKQfbSuW7SSiSnqPEjGPtxWjaHFA046kQ@mail.gmail.com>
 <F8C9E811-8597-4ED0-9F9D-786EB2301D6F@bsdimp.com>
 <1346002922.1140.56.camel@revolution.hippie.lan>
 <CAP%2BM-_HZ4yARwZA2koPJDeJWHT-1LORupjymuVnMtLBzeXe=DA@mail.gmail.com>
 <1346005507.1140.69.camel@revolution.hippie.lan>

On Aug 26, 2012, at 12:25 PM, Ian Lepore wrote:

> On Sun, 2012-08-26 at 13:05 -0500, Mark Tinguely wrote:
>> I did a quick look at the drivers last summer.
>>
>> Most drivers do the right thing and use memory allocated from
>> bus_dmamem_alloc(). It is easy for us to give them a cache aligned
>> buffer.
>>
>> Some drivers use mbufs - 256 bytes, which is cache safe.
>>
>> Some drivers directly or indirectly malloc() a buffer and then use it
>> to dma - rather than try to fix them all, I was okay with making the
>> smallest malloc() amount equal to the cache line size. It amounts to
>> getting rid of the 16 byte allocation on some ARM architectures. The
>> power of 2 allocator will then give us cache line safe allocation.
>>
>> A few drivers take a small memory amount from the kernel stack and dma
>> to it <- broken driver.
>>
>> The few drivers that use data from a structure and that memory is not
>> cache aligned <- broken driver.
>>
>
> I disagree about those last two points -- drivers that choose to use
> stack memory or malloc'd memory as IO buffers are not broken.

Stack DMA is bad policy at best, and broken at worst. The reason is the
alignment of the underlying unit. Since there's no way to say that
something is aligned to a given spot on the stack, you are asking for
random stack corruption.

Also, a malloced area is similarly problematic: the allocator knows
nothing about cache lines, so you can wind up with an allocation of
memory that's corrupted due to cache effects.

> Drivers
> can do IO directly to/from userland buffers, do we say that an
> application that calls read(2) and passes the address of a stack
> variable is broken?

Yes, if it is smaller than a cache line size, and not aligned to the
cache line. That's the point of the uio load variant.

> In this regard, it's the busdma implementation that's broken, because it
> should bounce those IOs through a DMA-safe buffer.
> There's absolutely
> no rule that I've ever heard of in FreeBSD that says IO can only take
> place using memory allocated from busdma.

That's partially true. Since BUSDMA grew up in the storage area, the
de facto rule here has been that you must allocate the memory from
busdma, or it must be page aligned. The mbuf and uio variants of load
were invented to cope with the common cases of mbufs and user I/O and
to flag things properly.

How does busdma know that it is using memory that's not from its
allocator?

> The rule is only that the
> proper sequence of busdma operation must be called, and beyond that it's
> up to the busdma implementation to make it work.

No. Bouncing is needed due to poor alignment of the underlying device.
Not due to cache effects.

There's a limited number of things that we support with busdma.
Arbitrary data from malloc that might be shared with the CPU isn't on
that list.

> Our biggest problem, I think, is that we don't have a sufficient
> definition of "the proper sequence of busdma operations."

I disagree. The sequence has been known for a long time.

> I don't think it will be very hard to make the arm and mips busdma
> implementations work correctly. It won't even be too hard to make them
> fairly efficient at bouncing small IOs (my thinking is that we can make
> small bounces no more expensive than the current partial cacheline flush
> implementation which copies the data multiple times). Bouncing large IO
> will never be efficient, but the inefficiency will be a powerful
> motivator to update drivers that do large IO to work better, such as
> using buffers allocated from busdma.

I don't think the cache line problem can be solved with bounce buffers.
Trying to accommodate broken drivers is what led us to this spot. We
need to fix the broken drivers.
If that's impossible, then the best we can do is have the driver set an
'always bounce' flag in the tag it creates and use that to always bounce
for operations through that tag.

Warner
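
To make the alignment argument in this thread concrete, here is a minimal
sketch in plain userland C (not kernel code) of the check involved: a
buffer is safe from partial cacheline effects only if it starts and ends
on cache line boundaries, so that no unrelated data shares its first or
last line. The 32-byte line size and every function name below are
assumptions made for illustration; none of them are busdma or malloc(9)
interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical line size for illustration; real CPUs report their own. */
#define CACHE_LINE_SIZE 32u

/* Round an address down to the start of the cache line containing it. */
static uintptr_t line_floor(uintptr_t addr, size_t line)
{
    assert(line != 0 && (line & (line - 1)) == 0); /* power of two */
    return addr & ~((uintptr_t)line - 1);
}

/* Round an address up to the next cache line boundary (or itself). */
static uintptr_t line_ceil(uintptr_t addr, size_t line)
{
    assert(line != 0 && (line & (line - 1)) == 0); /* power of two */
    return (addr + line - 1) & ~((uintptr_t)line - 1);
}

/*
 * True when [addr, addr + len) covers whole cache lines only, so a
 * flush or invalidate of the buffer cannot touch neighbouring data.
 */
static bool buffer_owns_its_lines(uintptr_t addr, size_t len, size_t line)
{
    if (len == 0)
        return false;
    return line_floor(addr, line) == addr &&
           line_ceil(addr + len, line) == addr + len;
}

/*
 * Bytes of unrelated data that share the buffer's first and last cache
 * lines. Anything above zero is exposed to a partial cacheline flush.
 */
static size_t shared_bytes(uintptr_t addr, size_t len, size_t line)
{
    if (len == 0)
        return 0;
    return (size_t)(addr - line_floor(addr, line)) +
           (size_t)(line_ceil(addr + len, line) - (addr + len));
}

/*
 * Round a request up to a whole number of cache lines, the idea behind
 * making the smallest malloc() bucket equal to the cache line size.
 */
static size_t round_to_lines(size_t size, size_t line)
{
    assert(line != 0 && (line & (line - 1)) == 0); /* power of two */
    return (size + line - 1) & ~(line - 1);
}
```

With these assumptions, a 16-byte buffer at a line-aligned address still
fails the check, because the other 16 bytes of its line belong to someone
else, whereas rounding the request up to one full line and aligning it
passes. A small stack variable will usually fail for the same reason.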