Date:      Thu, 23 Aug 2012 21:56:21 -0600
From:      Warner Losh <imp@bsdimp.com>
To:        Ian Lepore <freebsd@damnhippie.dyndns.org>
Cc:        freebsd-arm@freebsd.org, freebsd-mips@freebsd.org, freebsd-arch@freebsd.org
Subject:   Re: Partial cacheline flush problems on ARM and MIPS
Message-ID:  <D1D31F7F-5A70-4139-85DC-D5A931BE0233@bsdimp.com>
In-Reply-To: <1345765503.27688.602.camel@revolution.hippie.lan>
References:  <1345757300.27688.535.camel@revolution.hippie.lan> <3A08EB08-2BBF-4B0F-97F2-A3264754C4B7@bsdimp.com> <1345763393.27688.578.camel@revolution.hippie.lan> <FD8DC82C-AD3B-4EBC-A625-62A37B9ECBF1@bsdimp.com> <1345765503.27688.602.camel@revolution.hippie.lan>


On Aug 23, 2012, at 5:45 PM, Ian Lepore wrote:

> On Thu, 2012-08-23 at 17:26 -0600, Warner Losh wrote:
>>> On Aug 23, 2012, at 3:28 PM, Ian Lepore wrote:
>>> Now we have a new type of constraint, I think of it as "granularity".
>>> In effect, we have a DMA system that can only do DMA in cacheline sized
>>> chunks.  Even when the IO size -- and thus the number of "bits on the
>>> wire" -- is less than the cacheline size, at the end of the DMA
>>> operation (which includes the software-assisted coherency operations)
>>> the number of bytes in memory that may be modified is the size of a
>>> cacheline.  This is because "the DMA system" is not just the engine that
>>> moves bytes around, it's the combination of hardware and software that
>>> work together to maintain cache coherency.
>> But this isn't new.  It is an alignment requirement, which carries
>> with it an implicit size requirement.  If you enforce the alignment,
>> and force all 'sub buffers' to have this alignment, you don't need the
>> new thing.
>
> So do you think it's safe to assume that any given dma tag that has an
> alignment constraint also implicitly has a buffer size constraint that
> the size must be a multiple of the alignment?

Yes.  If something must be aligned to N bits, chances are it doesn't decode the lower N address bits, which implies a size constraint.
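
To make that concrete, here is a minimal sketch in C of a device that ignores the low n address bits.  The function names are hypothetical and are not part of busdma or any driver; the only point is that once the low bits are not decoded, the smallest unit the device can address is 2^n bytes, so usable buffer lengths come in multiples of that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model: the address a device actually sees when it does
 * not decode the low n address bits (they are simply dropped).
 */
static uintptr_t
device_decoded_addr(uintptr_t addr, unsigned n)
{
	assert(n < sizeof(uintptr_t) * 8);
	return (addr & ~(((uintptr_t)1 << n) - 1));
}

/* The smallest unit such a device can address is 2^n bytes. */
static size_t
device_granule(unsigned n)
{
	assert(n < sizeof(size_t) * 8);
	return ((size_t)1 << n);
}

/*
 * A buffer matches the device only if both its start address and its
 * length are multiples of the granule.
 */
static bool
buffer_fits_device(uintptr_t addr, size_t len, unsigned n)
{
	size_t mask = device_granule(n) - 1;

	return ((addr & mask) == 0 && (len & mask) == 0);
}
```

With n = 5 (a 32-byte granule), a 64-byte buffer at an aligned address passes the check, while a 16-byte buffer at the same address does not.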

> What if we have a platform with a 32-byte cacheline / DMA granularity,
> and then we have a builtin device on that SoC which can only do DMA on a
> 64K alignment (which its tag would reflect), but the hardware can move
> as little as 1 byte at a time?  Children of that bridge device come
> along and allocate little 16-byte buffers that eat 16 pages each.  It
> doesn't seem all that far-fetched to me.

This would be very odd hardware.  DMA aligned to 64k that can only move one byte seems far-fetched.  How useful would such a design be?  How would you do scatter-gather on such a design?

But this isn't what I'm saying.  If the cache line size is 32 bytes, then for DMA we only ever give out chunks of 32 bytes or larger.  In that case, the split cache line situation you gave as an example can't happen.
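
A rough sketch of that rule in C, assuming a 32-byte line as in the example.  DMA_GRANULE, dma_round_up(), dma_carve() and dma_line_of() are hypothetical names for illustration, not existing busdma interfaces.  Sub-buffers are carved from a pool that starts on a line boundary, and every length is rounded up to a whole number of lines, so no two sub-buffers can land in the same line.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 32-byte cache line, taken from the example above. */
#define DMA_GRANULE	32u

/* Round a requested length up to a whole number of cache lines. */
static size_t
dma_round_up(size_t len)
{
	return ((len + DMA_GRANULE - 1) & ~((size_t)DMA_GRANULE - 1));
}

/*
 * Carve a sub-buffer out of a pool, tracked here as a byte offset
 * from a line-aligned base.  Returns the offset of the new buffer
 * and advances *next by the rounded length.
 */
static size_t
dma_carve(size_t *next, size_t len)
{
	size_t off;

	assert(len > 0);
	assert((*next % DMA_GRANULE) == 0);
	off = *next;
	*next += dma_round_up(len);
	return (off);
}

/* Index of the cache line that contains a given pool offset. */
static size_t
dma_line_of(size_t off)
{
	return (off / DMA_GRANULE);
}
```

Two 16-byte requests end up at offsets 0 and 32, so the last byte of the first buffer and the first byte of the second are in different lines.  The cost is the padding: each 16-byte request consumes 32 bytes of the pool.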


