Date:      Fri, 20 Jan 2012 11:14:45 +0000
From:      David Chisnall <theraven@theravensnest.org>
To:        davidxu@FreeBSD.org
Cc:        svn-src-head@FreeBSD.org, svn-src-all@FreeBSD.org, src-committers@FreeBSD.org, John Baldwin <jhb@FreeBSD.org>, Bruce Evans <brde@optusnet.com.au>
Subject:   Re: svn commit: r230201 - head/lib/libc/gen
Message-ID:  <DCE1E23E-1F02-4944-ADCC-FB63C1921F91@theravensnest.org>
In-Reply-To: <4F18B951.6080404@gmail.com>
References:  <201201160615.q0G6FE9r019542@svn.freebsd.org> <4F178CDC.3030807@gmail.com> <4F17B0DE.3060008@gmail.com> <201201191023.28426.jhb@freebsd.org> <20120120030456.O1411@besplex.bde.org> <4F18B951.6080404@gmail.com>

On 20 Jan 2012, at 00:46, David Xu wrote:

> It depends on the hardware. A conflict that is small on a dual-core
> machine can become a large one on a machine with many CPUs, because
> more CPUs may be running the same code at once, which becomes a
> bottleneck. On a large machine with 1024 cores, much code needs to
> be redesigned.

You'll also find that the relative cost of atomic instructions varies a
lot between CPU models.  Between Core 2 and Sandy Bridge Core i7, the
relative cost of an atomic add (full barrier) dropped by about two
thirds.  The cache coherency logic has been significantly improved on
the newer chips.

For portable code, it's worth remembering that ARMv8 (which doesn't
entirely exist yet) contains a set of barriers that closely match the
semantics of the C[++]11 memory ordering.  They do this not for
performance (directly), but for power efficiency - so using the
least-restrictive required locking will eventually result in code for
mobile devices that uses less battery power, if it's in a hot path.
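
As a rough illustration of "least-restrictive required" ordering, here
is a minimal C11 sketch.  All of the names (hits, payload, ready,
count_hit, publish, try_consume) are made up for this example and are
not taken from libc.  It assumes a compiler that provides
<stdatomic.h>.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical example state; none of these names come from libc. */
static atomic_ulong hits;   /* statistics counter: needs atomicity only */
static int payload;         /* plain data handed to another thread */
static atomic_bool ready;   /* flag that guards payload */

/*
 * Relaxed: the increment is atomic, but it imposes no ordering on the
 * surrounding loads and stores.  Enough for a counter that is only
 * read for statistics.
 */
static void
count_hit(void)
{
	atomic_fetch_add_explicit(&hits, 1, memory_order_relaxed);
}

/*
 * Release: every write made before this store (here, payload) is
 * visible to a thread whose acquire load observes ready == true.
 */
static void
publish(int value)
{
	payload = value;
	atomic_store_explicit(&ready, true, memory_order_release);
}

/*
 * Acquire: pairs with the release store in publish().  Returns false
 * if nothing has been published yet.
 */
static bool
try_consume(int *out)
{
	if (!atomic_load_explicit(&ready, memory_order_acquire))
		return (false);
	*out = payload;
	return (true);
}
```

Using the plain atomic_fetch_add() or atomic_store() forms instead
would request memory_order_seq_cst everywhere, which is always correct
but may cost more than the code needs.  How much difference the weaker
orderings make depends on the target and the compiler, so it is worth
measuring on a hot path rather than assuming.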

David


