Date:      Mon, 5 Nov 2012 04:01:53 -0800 (PST)
From:      "Rodney W. Grimes" <freebsd@pdx.rh.CN85.ChatUSA.com>
To:        Warner Losh <imp@bsdimp.com>
Cc:        Juli Mallett <juli@clockworksquid.com>, "freebsd-mips@FreeBSD.org" <freebsd-mips@freebsd.org>
Subject:   Re: CACHE_LINE_SIZE macro.
Message-ID:  <201211051201.qA5C1rIo094612@pdx.rh.CN85.ChatUSA.com>
In-Reply-To: <B4225C25-BD43-423C-A1A2-C9FD4AC92ECB@bsdimp.com>

> 
> On Nov 5, 2012, at 10:01 AM, Eitan Adler wrote:
> 
> > On 5 November 2012 11:49, Warner Losh <imp@bsdimp.com> wrote:
> >>> There has been some discussion recently about padding lock mutexs to
> >>> the cache line size in order to avoid false sharing of CPUs. Some have
> >>> claimed to see significant performance increases as a result.
> >> 
> >> Is that an out-of-kernel interface?
> >> 
> >> If we did that, we'd have to make it run-time settable, because there's no one right answer for arm and MIPS cpus: they are all different.
> > 
> > The discussion ended up with using a special parameter
> > CACHE_LINE_SIZE_LOCKS which is different than CACHE_LINE_SIZE. This is
> > necessary for other reasons as well (CACHE_LINE_SIZE_LOCKS may take
> > into account prefetching of cache lines, but CACHE_LINE_SIZE
> > wouldn't).
> > 
> > I think the "correct" thing to do here is choose a reasonable, but
> > not-always-correct CACHE_LINE_SIZE_LOCKS and make CACHE_LINE_SIZE a
> > per-board constant (or run time setting, or whatever works).  You
> > can't make it run-time settable as the padding is part of the ABI:
> > 
> > For more details see
> > http://comments.gmane.org/gmane.os.freebsd.devel.cvs/483696
> > which contains the original discussion.
> > 
> > Note - I was not involved.
> 
> this is a kernel-only interface, so compile time constants are fine there.  What user-land visible interfaces are affected by this setting?  The answer should be 'none'
> 
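
(Illustrative sketch only, not actual kernel code: CACHE_LINE_SIZE_LOCKS is just the name proposed in the thread, its value here is assumed, and struct toy_mtx is a made-up stand-in for the real lock structure.  The sketch shows the padding idea under discussion: give each lock a whole "lock cache line" so two hot locks never share a line and bounce it between CPUs, which is the false sharing the thread refers to.)

    #define CACHE_LINE_SIZE_LOCKS   64      /* assumed value; the thread's point
                                               is that the right number varies
                                               per CPU/board */

    struct toy_mtx {                        /* stand-in for the real lock */
            volatile int    mtx_lock;
    };

    /*
     * Pad each lock out to a full "lock cache line" so two hot locks
     * never share a line and invalidate it back and forth between CPUs.
     */
    struct toy_mtx_padded {
            struct toy_mtx  m;
            char            pad[CACHE_LINE_SIZE_LOCKS - sizeof(struct toy_mtx)];
    };

    _Static_assert(sizeof(struct toy_mtx_padded) == CACHE_LINE_SIZE_LOCKS,
        "padded lock should occupy exactly one lock cache line");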


Even in a kernel-only interface the answer should be ``none''.  First, my gut tells me that this is a synthetic benchmark anomaly and that the data is hiding the real cause and effect of what is going on.  A better fix, if the data can be shown to be real and meaningful, is to make the mutex type opaque (void *), pass pointers around to it, and let the run time code decide how to properly cache-align and allocate the mutex (probably fit 2 mutexes in one 8 byte line, and I bet it runs faster than their current ``pad it to a cache line size'' hack).   I got a feeling you're going to cache thrash on SMP no matter what you do....
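
(Illustrative only: a userland sketch of the opaque-handle alternative described above.  The name opaque_mtx_create and the use of posix_memalign are assumptions made for the example; the point is that alignment gets chosen when the lock is allocated at run time rather than baked into a compile-time constant.)

    #include <stdlib.h>

    struct opaque_mtx;                      /* callers never see the layout */

    struct opaque_mtx *
    opaque_mtx_create(size_t runtime_line_size)
    {
            void *p;

            /*
             * Alignment is picked at run time (e.g. probed from the CPU),
             * so the ABI only ever sees a pointer.  Assumes the real lock
             * fits in one line and that runtime_line_size is a power of
             * two that is a multiple of sizeof(void *).
             */
            if (posix_memalign(&p, runtime_line_size, runtime_line_size) != 0)
                    return (NULL);
            return ((struct opaque_mtx *)p);
    }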

I did not see -any- data presented that showed this was proven to be of benefit.

Why not just go out and cache-align every data structure in the kernel.... :-)   A benchmark will show an improvement, I am sure of that.
This may actually be begging for the old technique of carefully handcrafted structs, laid out so that things LIKE mutexes naturally end up
on a proper boundary.   I bet you the boundary is actually 4 bytes, independent of cache line size.
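
(Illustrative only: a sketch of the hand-crafted layout idea, with made-up field names.  Members are ordered so the lock word lands on a natural 4-byte boundary with no explicit padding at all.)

    #include <stddef.h>
    #include <stdint.h>

    struct carefully_laid_out {
            uint64_t        counter;        /* 8 bytes, offset 0 */
            uint32_t        flags;          /* 4 bytes, offset 8 */
            uint32_t        lock_word;      /* 4 bytes, offset 12: already
                                               4-byte aligned, no compiler
                                               padding needed */
    };

    _Static_assert(offsetof(struct carefully_laid_out, lock_word) % 4 == 0,
        "lock word sits on a natural 4-byte boundary");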


-- 
Rod Grimes                                                 freebsd@freebsd.org


