Date: Fri, 20 Jan 2012 11:14:45 +0000
From: David Chisnall <theraven@theravensnest.org>
To: davidxu@FreeBSD.org
Cc: svn-src-head@FreeBSD.org, svn-src-all@FreeBSD.org, src-committers@FreeBSD.org, John Baldwin <jhb@FreeBSD.org>, Bruce Evans <brde@optusnet.com.au>
Subject: Re: svn commit: r230201 - head/lib/libc/gen
Message-ID: <DCE1E23E-1F02-4944-ADCC-FB63C1921F91@theravensnest.org>
In-Reply-To: <4F18B951.6080404@gmail.com>
References: <201201160615.q0G6FE9r019542@svn.freebsd.org> <4F178CDC.3030807@gmail.com> <4F17B0DE.3060008@gmail.com> <201201191023.28426.jhb@freebsd.org> <20120120030456.O1411@besplex.bde.org> <4F18B951.6080404@gmail.com>

On 20 Jan 2012, at 00:46, David Xu wrote:

> It depends on hardware, if it is a large machine with lots of cpu,
> a small conflict on dual-core machine can become a large conflict
> on large machine because it is possible more cpus are now
> running same code which becomes a bottleneck. On a large machine
> which has 1024 cores, many code need to be redesigned.

You'll also find that the relative cost of atomic instructions varies a
lot between CPU models. Between Core 2 and Sandy Bridge Core i7, the
relative cost of an atomic add (full barrier) dropped by about two
thirds. The cache coherency logic has been significantly improved on
the newer chips.

For portable code, it's worth remembering that ARMv8 (which doesn't
entirely exist yet) contains a set of barriers that closely match the
semantics of the C[++]11 memory ordering. They do this not for
performance (directly), but for power efficiency - so using the
least-restrictive required locking will eventually result in code for
mobile devices that uses less battery power, if it's in a hot path.

David
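
To make the point about choosing the least-restrictive ordering concrete, here is a minimal sketch using C11 <stdatomic.h>. It is not taken from the commit under discussion; the names (hits, count_hit, read_hits, payload, ready, publish, try_consume) are hypothetical and exist only for illustration. The counter needs atomicity but no ordering, so it uses memory_order_relaxed; the flag hands data from one thread to another, so it uses a release store paired with an acquire load rather than the default sequentially consistent operations.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical statistics counter. Only the increment itself must be
 * atomic; nothing else is ordered against it, so relaxed is sufficient.
 */
static atomic_ulong hits;

static void
count_hit(void)
{
	atomic_fetch_add_explicit(&hits, 1, memory_order_relaxed);
}

static unsigned long
read_hits(void)
{
	return (atomic_load_explicit(&hits, memory_order_relaxed));
}

/*
 * Hypothetical one-shot hand-off. The plain write to payload must be
 * visible to any thread that observes ready == true, which is exactly
 * what a release store paired with an acquire load guarantees.
 */
static int payload;
static atomic_bool ready;

static void
publish(int value)
{
	payload = value;
	atomic_store_explicit(&ready, true, memory_order_release);
}

static bool
try_consume(int *out)
{
	if (!atomic_load_explicit(&ready, memory_order_acquire))
		return (false);
	*out = payload;
	return (true);
}
```

Writing atomic_fetch_add(&hits, 1) instead would request memory_order_seq_cst, which is always correct but asks the hardware for more ordering than the counter needs. How much that costs depends on the CPU, as noted above.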