From owner-svn-src-all@FreeBSD.ORG Thu Sep 30 14:17:14 2010
Delivered-To: svn-src-all@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 72008106564A;
	Thu, 30 Sep 2010 14:17:14 +0000 (UTC)
	(envelope-from dim@FreeBSD.org)
Received: from tensor.andric.com (cl-327.ede-01.nl.sixxs.net
	[IPv6:2001:7b8:2ff:146::2])
	by mx1.freebsd.org (Postfix) with ESMTP id 018D18FC13;
	Thu, 30 Sep 2010 14:17:14 +0000 (UTC)
Received: from [IPv6:2001:7b8:3a7:0:f59a:54a5:56c:5c84] (unknown
	[IPv6:2001:7b8:3a7:0:f59a:54a5:56c:5c84])
	(using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits))
	(No client certificate requested)
	by tensor.andric.com (Postfix) with ESMTPSA id 22C095C43;
	Thu, 30 Sep 2010 16:17:12 +0200 (CEST)
Message-ID: <4CA49BE9.8040602@FreeBSD.org>
Date: Thu, 30 Sep 2010 16:17:13 +0200
From: Dimitry Andric
Organization: The FreeBSD Project
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US;
	rv:1.9.2.11pre) Gecko/20100929 Lanikai/3.1.5pre
MIME-Version: 1.0
To: Bruce Evans
References: <201009292120.o8TLKTSf022159@svn.freebsd.org>
	<201009291812.26796.jkim@FreeBSD.org>
	<20100930125731.B2324@delplex.bde.org>
In-Reply-To: <20100930125731.B2324@delplex.bde.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org,
	src-committers@freebsd.org, Jung-uk Kim
Subject: Re: svn commit: r213281 - head/lib/libc/amd64/gen
X-BeenThere: svn-src-all@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)"
X-List-Received-Date: Thu, 30 Sep 2010 14:17:14 -0000

On 2010-09-30 05:46, Bruce Evans wrote:
...
> This file probably shouldn't exist, especially on amd64.
> There are 4 or 5 versions of ldexp(), and this file implements what
> seems to be the worst one, even without the bug.
>
> First, it shouldn't exist since it is a libm function.  It exists for the
> historical reason that its object file has always been in libc.  This
> causes organizational problems.

It also makes it impossible to throw it out of libc, as there are many
applications that expect it in there.  Luckily, it seems to be only
ldexp for which this is the case, not ldexpl or ldexpf. :)

> The second version is in fdlibm.  This wasn't imported into FreeBSD.  It
> calls scalbn() after checking some cases.  I think it shouldn't check
> anything.  In FreeBSD it could be a weak alias to scalbn().
>
> The third version is in fdlibm.  This one is named scalbn().  FreeBSD has
> it.  FreeBSD aliases ldexpl() to scalbn() iff long doubles are the same as
> doubles.  FreeBSD also has scalbnf().  This came from NetBSD/Cygnus's
> extension of fdlibm.  FreeBSD aliases ldexpf() to scalbnf() (or is it
> the other way?).

We alias scalbnf() to ldexpf(), apparently.

> The fourth version is in the FreeBSD arch-dependent directories of
> lib/msun for at least amd64 and i386.  These are also named scalbn().
> These aren't in fdlibm, but came from NetBSD.  These are written in
> non-inline asm and are similar to the ones in libc.  They are a couple
> of instructions shorter, due to never using a frame pointer (unless
> profiling) and avoiding an fxch or two.  They aren't aliased to anything,
> and don't have float versions.
>
> The fifth version, which might not exist, is gcc's builtin.  I think it
> doesn't really exist, but gcc says it has a builtin ldexp() and I had to
> fight with this to test this.  gcc normally made the dubious optimization
> of moving ldexp() out of a test loop.  But ldexp() has side effects.

The version in libc/gen/ldexp.c is just a copy of msun/src/s_scalbn.c,
with some things like copysign() directly pasted in.  It even has:

    /* @(#)fdlibm.h 5.1 93/09/24 */

at the top.
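To make the aliasing point concrete: on a binary-radix platform,
ldexp(x, n) and scalbn(x, n) both compute x * 2^n, so forwarding with no
extra checks is enough.  The sketch below only illustrates that
equivalence; the name ldexp_via_scalbn is hypothetical and this is not
the code in the tree.  A libm could export a weak alias instead of a
wrapper function, as suggested above.

```c
#include <float.h>
#include <math.h>

#if FLT_RADIX != 2
#error "this sketch assumes a binary floating-point radix"
#endif

/*
 * Hypothetical wrapper, for illustration only: forward straight to
 * scalbn() without checking any special cases first.  With FLT_RADIX
 * equal to 2, scalbn() scales by a power of two, just as ldexp() does.
 */
double
ldexp_via_scalbn(double x, int n)
{
	return (scalbn(x, n));
}
```

For example, ldexp_via_scalbn(3.0, -1) yields 1.5, the same as
ldexp(3.0, -1).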
> Testing indicates that the fdlibm C version is 2.5 times faster than the
> asm versions on amd64 on a core2 (ref9), while on i386 the C version is
> only 1.5 times faster.  The C code is a bit larger so benefits more from
> being called from a loop.  The asm code uses a slow i387 instruction, and
> on i387 it has to do expensive moves from xmm registers to i387 ones and
> back.
>
> Times for 100 million calls:
>
> amd64 libc ldexp:       3.18 seconds
> amd64 libm asm scalbn:  2.96
> amd64 libm C scalbn:    1.30
> i386 libc ldexp:        3.13
> i386 libm asm scalbn:   2.86
> i386 libm C scalbn:     2.11

Seeing these results, I propose to just delete
lib/libc/amd64/gen/ldexp.c and lib/libc/i386/gen/ldexp.c, which will
cause the amd64 and i386 builds to automatically pick up
lib/libc/gen/ldexp.c instead, which effectively is the fdlibm
implementation.  (And no more clang workarounds needed. :)
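For anyone who wants to reproduce timings of this kind, a minimal
harness might look like the sketch below.  It is not the test program
used for the numbers quoted above (that program is not shown in this
thread); the name time_calls and its parameters are made up for
illustration.  The inputs are volatile so that the compiler cannot
hoist the call out of the loop, which is the problem mentioned earlier
with gcc's builtin ldexp().

```c
#include <math.h>
#include <time.h>

/*
 * Hypothetical helper: call fn(x, n) `calls` times and return the
 * elapsed CPU time in seconds.  The sum of the results is stored in
 * *sink so that the calls cannot be discarded as dead code.
 */
double
time_calls(double (*fn)(double, int), long calls, double *sink)
{
	volatile double x = 1.5;	/* reloaded on every iteration */
	volatile int n = 3;
	double acc = 0.0;
	clock_t start, end;
	long i;

	start = clock();
	for (i = 0; i < calls; i++)
		acc += fn(x, n);
	end = clock();

	*sink = acc;
	return ((double)(end - start) / CLOCKS_PER_SEC);
}
```

Calling time_calls(ldexp, 100000000L, &sink) and then
time_calls(scalbn, 100000000L, &sink) gives a rough comparison of the
two entry points in whatever libc and libm the program is linked
against.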