Date: Sat, 28 Mar 2015 15:21:07 +0000
From: David Chisnall <theraven@FreeBSD.org>
To: Julian Elischer <julian@freebsd.org>
Cc: freebsd-current@freebsd.org
Subject: Re: SSE in libthr
Message-ID: <FDC008DE-6B3A-4D02-A250-67DEFB1E0B1D@FreeBSD.org>
In-Reply-To: <5516B280.6060002@freebsd.org>
References: <5515AED9.8040408@FreeBSD.org> <3A96AAEC-9C1C-444E-9A73-3CD2AED33116@me.com> <20150327214452.GR2379@kib.kiev.ua> <5516B280.6060002@freebsd.org>

On 28 Mar 2015, at 13:54, Julian Elischer <julian@freebsd.org> wrote:
>
> the point is that clang will do this anywhere it can, because it isn't
> taking into account the side effects, just the speed of the commands
> themselves.

This is also something that is not going to decrease. Clang now enables the SLP vectoriser by default and this code is constantly being improved. Current generation vector units are explicitly designed as targets for compiler autovectorisation, not for hand-tuned DSP code (which, increasingly, runs on the GPU anyway). This means that we're increasingly going to see SSE/AVX/NEON usage in CPU-bound code, even without an explicit programmer decision to do so.

Optimising for the case when the vector unit is not used is about as sensible as optimising for the single-core case: it will affect some people, but generally not those who care about performance, and a decreasing number of people over time.

David