Date: Mon, 7 Sep 2015 12:55:27 -0700
From: Leonardo Fogel <leonardofogel@yahoo.com.br>
To: freebsd-drivers@freebsd.org
Subject: Re: Memory barrier
Message-ID: <1441655727.36257.YahooMailBasic@web120802.mail.ne1.yahoo.com>
In-Reply-To: <20150906180311.GS2072@kib.kiev.ua>
> > Case 1:
> >   bus_write_1(region_0, ...);
> >   /* barrier here */
> >   DELAY(some_time);
> >
> > Case 2:
> >   bus_write_1(region_0, ...);
> >   /* barrier here */
> >   bus_write_1(region_2, ...);
> >
> > In the first one, I want the write to reach the device before the
> > thread busy-waits.
> >
> > In the second one, I want the write to a device (e.g. power management) to
> > complete before the write to another starts/completes.
>
> I believe that the bus_write semantic includes the required serialization.
> E.g., on x86 all CPU write buffers are flushed before the write instruction
> is declared completed, because this is the semantic of the uncacheable
> memory. For powerpc, the system automatically inserts powerpc_iomb() after
> the write, which is full sync. I am not aware of other architectures.

I've found the implementation of the bus_space_barrier for the ARM
architecture (the one in which I'm interested):

generic_bs_barrier(bus_space_tag_t t, bus_space_handle_t bsh,
    bus_size_t offset, bus_size_t len, int flags)
{
	/*
	 * dsb() will drain the L1 write buffer and establish a memory access
	 * barrier point on platforms where that has meaning.  On a write we
	 * also need to drain the L2 write buffer, because most on-chip memory
	 * mapped devices are downstream of the L2 cache.  Note that this needs
	 * to be done even for memory mapped as Device type, because while
	 * Device memory is not cached, writes to it are still buffered.
	 */
	dsb();
	if (flags & BUS_SPACE_BARRIER_WRITE) {
		cpu_l2cache_drain_writebuf();
	}
}

The ARM architecture specifies two _data_ barrier instructions: DMB and DSB.
The first synchronizes memory accesses, and the second synchronizes both
memory accesses and instruction execution. So, DSB is the answer to Case 1,
and DMB or DSB is the answer to Case 2.

The implementation above brings something of which I was not aware: it also
drains the L2 write buffer.
Older implementations of the "PL310 Store Buffer did not have any
automatic draining mechanism." (ARM CoreLink Level 2 Cache Controller
(L2C-310 or PL310), r3 releases, Software Developers Errata Notice.) In
newer implementations, the writes to device memory are "Put in store buffer,
not merged, immediately drained to L3." (CoreLink Level 2 Cache Controller
L2C-310 Technical Reference Manual, Revision: r3p3.)

Leonardo
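
For reference, here is a minimal sketch of Case 1 and Case 2 with an explicit write barrier in place of the "/* barrier here */" placeholder. It is only an illustration under stated assumptions, not a tested driver. In a FreeBSD driver the calls would presumably be bus_write_1(9), bus_barrier(9) with BUS_SPACE_BARRIER_WRITE, and DELAY(9). So that the sketch compiles in user space, they are replaced by hypothetical mock_* stand-ins that only record the order in which they are called. The struct resource definition, flag values, register offsets, written values, delay, and function names are all invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mock definitions (assumptions); the kernel provides the real ones. */
#define BUS_SPACE_BARRIER_READ   0x01
#define BUS_SPACE_BARRIER_WRITE  0x02
#define TRACE_MAX                16

struct resource { int id; };            /* opaque in the kernel; mocked here */

enum op { OP_WRITE, OP_BARRIER, OP_DELAY };

enum op op_trace[TRACE_MAX];            /* order in which the mocks ran */
size_t  op_trace_len;

void trace_op(enum op o)
{
	assert(op_trace_len < TRACE_MAX);
	op_trace[op_trace_len++] = o;
}

/* Stand-in for bus_write_1(9): records the call, touches no hardware. */
void mock_bus_write_1(struct resource *r, size_t offset, uint8_t value)
{
	(void)r; (void)offset; (void)value;
	trace_op(OP_WRITE);
}

/* Stand-in for bus_barrier(9): records the call, issues no DSB. */
void mock_bus_barrier(struct resource *r, size_t offset, size_t len, int flags)
{
	(void)r; (void)offset; (void)len; (void)flags;
	trace_op(OP_BARRIER);
}

/* Stand-in for DELAY(9): records the call, does not actually wait. */
void mock_delay(int usec)
{
	(void)usec;
	trace_op(OP_DELAY);
}

/* Case 1: the write should reach the device before the busy-wait starts. */
void case1_write_then_delay(struct resource *region_0)
{
	mock_bus_write_1(region_0, 0, 0x01);
	mock_bus_barrier(region_0, 0, 1, BUS_SPACE_BARRIER_WRITE);
	mock_delay(100);
}

/* Case 2: the write to region_0 should complete before the one to region_2. */
void case2_write_then_write(struct resource *region_0, struct resource *region_2)
{
	mock_bus_write_1(region_0, 0, 0x01);
	mock_bus_barrier(region_0, 0, 1, BUS_SPACE_BARRIER_WRITE);
	mock_bus_write_1(region_2, 0, 0x80);
}
```

Whether the explicit barrier is strictly required depends on the platform, as discussed above. On ARM, the generic_bs_barrier() quoted earlier is what issues the dsb() and the L2 write buffer drain when BUS_SPACE_BARRIER_WRITE is set.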