Date: Fri, 27 Mar 2015 22:40:57 +0100
From: Jilles Tjoelker <jilles@stack.nl>
To: Eric van Gyzen <vangyzen@FreeBSD.org>
Cc: current@FreeBSD.org
Subject: Re: SSE in libthr
Message-ID: <20150327214057.GA3766@stack.nl>
In-Reply-To: <5515AED9.8040408@FreeBSD.org>
References: <5515AED9.8040408@FreeBSD.org>

On Fri, Mar 27, 2015 at 03:26:17PM -0400, Eric van Gyzen wrote:
> In a nutshell:
> Clang emits SSE instructions on amd64 in the common path of
> pthread_mutex_unlock. This reduces performance by a non-trivial
> amount. I'd like to disable SSE in libthr.

How about saving and restoring the FPU/SSE state eagerly instead of the
current CR0.TS-based lazy method? There is overhead associated with #NM
exception handling (fpudna) which is not worth it if FPU/SSE are used
often. This would apply to userland threads only; kernel threads
normally do not use FPU/SSE and handle the FPU/SSE state manually if
they do.

There is performance improvement potential in using SSE for optimizing
string functions, for example. Even a simple SSE2 strlen easily
outperforms the already optimized lib/libc/string/strlen.c in a
microbenchmark, and many other string functions are slow byte-at-a-time
implementations.

-- 
Jilles Tjoelker
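
For illustration of the strlen point above, here is a minimal sketch of
what a simple SSE2 strlen could look like. It is not the code that was
benchmarked and not anything from libc; the names sse2_strlen and
check_against_libc are made up for this example. It assumes an x86
target with SSE2 enabled (baseline on amd64), a compiler providing
__builtin_ctz (GCC or Clang), and, as is usual for this kind of routine,
that aligned 16-byte loads reaching a little outside the string are
acceptable because they never cross a page boundary.

```c
#include <assert.h>
#include <emmintrin.h>	/* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical name; a sketch, not the libc implementation. */
size_t
sse2_strlen(const char *s)
{
	const __m128i zero = _mm_setzero_si128();
	uintptr_t addr = (uintptr_t)s;
	unsigned int misalign = (unsigned int)(addr & 15);
	/* Round down to a 16-byte boundary so every load is aligned. */
	const __m128i *p = (const __m128i *)(addr - misalign);
	unsigned int mask;

	/* Bit i of mask is set if byte i of the 16-byte block is NUL. */
	mask = (unsigned int)_mm_movemask_epi8(
	    _mm_cmpeq_epi8(_mm_load_si128(p), zero));
	/* Ignore bytes of the first block that lie before s. */
	mask &= ~0u << misalign;

	while (mask == 0) {
		p++;
		mask = (unsigned int)_mm_movemask_epi8(
		    _mm_cmpeq_epi8(_mm_load_si128(p), zero));
	}

	/* Lowest set bit gives the offset of the first NUL in the block. */
	return ((size_t)((uintptr_t)p + (unsigned int)__builtin_ctz(mask) -
	    addr));
}

/* Hypothetical helper: compare the sketch against the libc strlen. */
void
check_against_libc(const char *s)
{
	assert(sse2_strlen(s) == strlen(s));
}
```

The loop checks 16 bytes per iteration. Any benchmarking would need to
be redone with real workloads; this sketch makes no performance claim of
its own.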