Date: Tue, 11 Nov 2008 12:14:21 -0500
From: John Baldwin <jhb@freebsd.org>
To: "Attilio Rao" <attilio@freebsd.org>
Cc: src-committers@freebsd.org, Kip Macy <kmacy@freebsd.org>, svn-src-user@freebsd.org
Subject: Re: svn commit: r184759 - user/kmacy/HEAD_fast_multi_xmit/sys/net
Message-ID: <200811111214.21288.jhb@freebsd.org>
In-Reply-To: <3bbf2fe10811101440j26351593taccd2654f0ef4374@mail.gmail.com>
References: <200811080202.mA822D0W098283@svn.freebsd.org> <200811101647.12258.jhb@freebsd.org> <3bbf2fe10811101440j26351593taccd2654f0ef4374@mail.gmail.com>

On Monday 10 November 2008 05:40:41 pm Attilio Rao wrote:
> 2008/11/10, John Baldwin <jhb@freebsd.org>:
> > On Saturday 08 November 2008 11:53:53 am Attilio Rao wrote:
> > > 2008/11/8, David Schultz <das@freebsd.org>:
> > > > On Sat, Nov 08, 2008, Attilio Rao wrote:
> > > > > Definitively, I'm not sure we need this.
> > > > > We already have memory barriers you could exploit which just require a
> > > > > "dummy" object.
> > > > >
> > > > > For example you could do:
> > > > > flowtable_pcpu_unlock(struct flowtable *table, uint32_t hash)
> > > > > {
> > > > >
> > > > >         (void)atomic_load_acq_ptr(&dummy);
> > > > >         ...
> > > >
> > > >
> > > > Memory barriers are cheaper than atomic ops.
> > >
> > > But this is an atomic op too.
> > >
> > > > Furthermore, there's different types of memory barriers
> > > > (store/store, load/store, etc.), not just a generic mb(). Some
> > > > architectures like sparc64 define all four, but only actually
> > > > implement the varieties that are useful in improving performance.
> > > > Take a look at what Solaris has here.
> > > >
> > > > I'm skeptical of trying to play clever tricks with these things
> > > > outside of the code that implements synchronization
> > > > primitives. Memory ordering is very hard to reason about, and we
> > > > already have a lot of code, e.g., in libthr, that isn't correct
> > > > under weak memory ordering. Moreover, the compiler can reorder
> > > > loads and stores, and that just adds a whole new level of pain.
> > >
> > > _acq prefix is intended to not let reordering happening really.
> > > man 9 atomic can explain how the acq and rel memory barriers work.
> > >
> >
> > _acq is not a full barrier, it's more of an 'lfence'. The mb() here is doing
> > more of a _rel barrier ('sfence', etc.).
>
> Sure but the comment is still valid.
> I don't see the point of such things when you can implement barriers
> through our atomic_* stuff.

atomic_* stuff works when you are already doing a store.
Doing a "dummy" store is quite a hack just to get a standalone memory barrier.
There is a reason ia64 includes "acq" and "rel" variants of various
instructions as well as a standalone "fence".

One problem with Kip's change though is that it doesn't work on older x86
CPUs that don't have "sfence" (pre-PIII IIRC). I'm not sure if some of the
lower-power x86 CPUs such as VIA, etc. support "sfence" either, though I
think those are typically used in single-CPU setups.

-- 
John Baldwin
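
To make the distinction in this message concrete, here is a minimal sketch of the two styles: release ordering attached to a store the code has to do anyway, versus a standalone fence. It uses portable C11 <stdatomic.h> as a stand-in, which is an assumption made for illustration; it is not FreeBSD's atomic(9) API and not the mb() from the commit under discussion. The names payload, ready, publish_with_release_store(), publish_with_fence(), consume() and demo() are all hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

static int payload;        /* plain data handed from producer to consumer */
static atomic_int ready;   /* flag the consumer polls; zero-initialized */

/*
 * Style 1: the ordering rides on a store the code needs anyway.
 * The release store keeps the write to payload from being
 * reordered after the write to ready.
 */
static void
publish_with_release_store(int value)
{
	payload = value;
	atomic_store_explicit(&ready, 1, memory_order_release);
}

/*
 * Style 2: a standalone release fence followed by a relaxed store.
 * No "dummy" object is needed just to obtain the ordering.
 */
static void
publish_with_fence(int value)
{
	payload = value;
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&ready, 1, memory_order_relaxed);
}

/* Consumer side: the acquire load pairs with either publish style. */
static int
consume(void)
{
	while (atomic_load_explicit(&ready, memory_order_acquire) == 0)
		;	/* spin until the flag is published */
	return (payload);
}

/*
 * Single-threaded smoke test (hypothetical helper). It only checks
 * that both variants hand the value over; it does not exercise any
 * cross-CPU reordering.
 */
int
demo(void)
{
	publish_with_release_store(42);
	assert(consume() == 42);

	atomic_store(&ready, 0);
	publish_with_fence(7);
	assert(consume() == 7);
	return (0);
}
```

In a real program the publish_*() call would run on one thread or CPU and consume() on another; demo() runs them in sequence only so the sketch can be checked without a threading library. When no store is otherwise needed at the point where ordering is required, style 2 expresses the intent directly instead of inventing a dummy object to operate on.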