Date: Sun, 26 Aug 2012 17:03:46 -0600
From: Warner Losh <imp@bsdimp.com>
To: Ian Lepore <freebsd@damnhippie.dyndns.org>
Cc: Hans Petter Selasky <hans.petter.selasky@bitfrost.no>, freebsd-arm@freebsd.org, freebsd-mips@freebsd.org, freebsd-arch@freebsd.org
Subject: Re: Partial cacheline flush problems on ARM and MIPS
Message-ID: <6D83AF9D-577B-4C83-84B7-C4E3B32695FC@bsdimp.com>
In-Reply-To: <1346002922.1140.56.camel@revolution.hippie.lan>
References: <1345757300.27688.535.camel@revolution.hippie.lan>
 <3A08EB08-2BBF-4B0F-97F2-A3264754C4B7@bsdimp.com>
 <1345763393.27688.578.camel@revolution.hippie.lan>
 <FD8DC82C-AD3B-4EBC-A625-62A37B9ECBF1@bsdimp.com>
 <1345765503.27688.602.camel@revolution.hippie.lan>
 <CAJ-VmonOwgR7TNuYGtTOhAbgz-opti_MRJgc8G%2BB9xB3NvPFJQ@mail.gmail.com>
 <1345766109.27688.606.camel@revolution.hippie.lan>
 <CAJ-VmomFhqV5rTDf-kKQfbSuW7SSiSnqPEjGPtxWjaHFA046kQ@mail.gmail.com>
 <F8C9E811-8597-4ED0-9F9D-786EB2301D6F@bsdimp.com>
 <1346002922.1140.56.camel@revolution.hippie.lan>

On Aug 26, 2012, at 11:42 AM, Ian Lepore wrote:

> On Thu, 2012-08-23 at 22:00 -0600, Warner Losh wrote:
>> The bottom line is that you can't mix things like that when cache
>> lines are involved.  The current code that tries is doomed to failure.
>> Doomed.  You just can't control all flushes, as Ian's missive
>> demonstrates, and trying to accommodate code that does this I don't
>> think can possibly work.  All the interrupt masking, copying in and
>> out, etc I fear is doomed to utter and abject failure.
>>
> Until last weekend I was in the camp that thought the partial cacheline
> flush problem was solvable with sufficiently clever code.  Now I agree
> that we're doomed to failure and it's time to try another direction.
>
> We're going to have some implementation work to do in arm and mips
> busdma, but I think the larger part of the task is going to be defining
> more rigorously how a driver must interact with the busdma system to
> function correctly on all types of platforms, and then update existing
> drivers to conform.
>
> The busdma manpage currently has some vague words about the usage and
> sequencing of sync ops, such as "If read and write operations are not
> preceded and followed by the appropriate synchronization operations,
> behavior is undefined."  I think we should more explicitly spell out
> what the appropriate sequences are.  In particular:
>
>       * The PRE and POST operations must occur in pairs; a PREREAD must
>         be followed eventually by a POSTREAD and a PREWRITE must be
>         followed by a POSTWRITE.

PREREAD means "I am about to tell the device to put data here; have
whatever might be pending in the CPU complex get out of the way."
Usually this means "invalidate the cache for that range", but not
always.  POSTREAD means "The device's DMA is done, I'd like to start
accessing it now."  If the memory will be thrown away without being
looked at, then does the driver necessarily need to issue the POSTREAD?
I think so, but I don't know if that's a new requirement.

>       * The CPU is not allowed to access the mapped memory after a PRE
>         sync and before the corresponding POST sync.

Correct.

>       * The DMA hardware is not allowed to access the mapped memory
>         after a POST sync and before the next PRE sync.

Correct.

>       * Read and write sync operators may be combined in a single call,
>         PRE and POST operators may not be.  E.G., PREREAD|PREWRITE is
>         allowed, PREREAD|POSTREAD is not.  We should note that while
>         read and write operations may be combined, on some platforms
>         PREREAD|PREWRITE is needlessly expensive when only a read is
>         being performed.

Correct.

> We also need some rules about working with buffers obtained from
> bus_dmamem_alloc() and external buffers passed to bus_dmamap_load().  I
> think the rule should be that a buffer obtained from bus_dmamem_alloc(),
> or more formally any region of memory mapped by a bus_dmamap_load(), is
> a single logical object which can only be accessed by one entity at a
> time.  That means that there cannot be two concurrent DMA operations
> happening in different regions of the same buffer, nor can DMA and CPU
> access be happening concurrently even if in different parts of the
> buffer.

There's something subtle that I'm missing.  Why would two DMA operations
be disallowed?  The rest makes good sense.

> I've always thought that allocating a dma buffer feels like a big
> hassle.  You sometimes have to create a tag for the sole purpose of
> setting the maxsize to get the buffer size you need when you call
> bus_dmamem_alloc().  If bus_dmamem_alloc() took a size parm you could
> just use your parent tag, or a generic tag appropriate to all the IO
> you're doing for a given device.  If you need a variety of buffers for
> small control and command and status transfers of different sizes, you
> end up having to manage up to a dozen tags and maps and buffers.  It's
> all very clunky and inconvenient.  It's just the sort of thing that
> makes you want to allocate a big buffer and subdivide it.  Surely we
> could do something to make it easier?

You'd wind up creating a quick tag on the fly for the bus_dmamem_alloc()
if you wanted to do this.  Cleanup then becomes unclear.

Warner
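
The sequencing rules quoted above can be sketched as a small user-space
state machine.  This is only a model for reasoning about the rules, not
busdma code: the SYNC_* names, struct dma_buf, and the dma_buf_*()
functions are hypothetical and do not exist in the kernel.  It assumes
that a PRE op hands the buffer to the device, that the matching POST op
hands it back to the CPU, and that several PRE ops may accumulate before
their POSTs (a choice made for this model, not something stated in the
thread).

```c
#include <stdbool.h>

/* Illustrative flag values for this model only. */
enum {
    SYNC_PREREAD   = 1 << 0,
    SYNC_POSTREAD  = 1 << 1,
    SYNC_PREWRITE  = 1 << 2,
    SYNC_POSTWRITE = 1 << 3
};

#define SYNC_PRE_MASK  (SYNC_PREREAD | SYNC_PREWRITE)
#define SYNC_POST_MASK (SYNC_POSTREAD | SYNC_POSTWRITE)

/* One mapped buffer, treated as a single logical object. */
struct dma_buf {
    int pending;    /* PRE ops issued but not yet closed by a POST */
};

void
dma_buf_init(struct dma_buf *b)
{
    b->pending = 0;
}

/* Returns 0 if the op obeys the rules, -1 if it violates them. */
int
dma_buf_sync(struct dma_buf *b, int op)
{
    bool pre = (op & SYNC_PRE_MASK) != 0;
    bool post = (op & SYNC_POST_MASK) != 0;
    int needed = 0;

    /* Reject unknown bits, an empty op, and PRE mixed with POST. */
    if ((op & ~(SYNC_PRE_MASK | SYNC_POST_MASK)) != 0 || pre == post)
        return (-1);

    if (pre) {
        /* PREREAD|PREWRITE in one call is allowed. */
        b->pending |= op;
        return (0);
    }

    /* Every POST must close a PRE of the same kind. */
    if (op & SYNC_POSTREAD)
        needed |= SYNC_PREREAD;
    if (op & SYNC_POSTWRITE)
        needed |= SYNC_PREWRITE;
    if ((b->pending & needed) != needed)
        return (-1);
    b->pending &= ~needed;
    return (0);
}

/* The CPU may touch the buffer only when no PRE is outstanding. */
bool
dma_buf_cpu_may_access(const struct dma_buf *b)
{
    return (b->pending == 0);
}

/* The device may touch the buffer only between a PRE and its POST. */
bool
dma_buf_device_may_access(const struct dma_buf *b)
{
    return (b->pending != 0);
}
```

In this model a driver would call dma_buf_sync(&b, SYNC_PREREAD) before
starting the transfer and dma_buf_sync(&b, SYNC_POSTREAD) after it
completes, even when it intends to discard the data; skipping the POST
leaves the buffer owned by the device for good.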