Date: Thu, 26 Jul 2012 12:19:27 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
To: Stephen Montgomery-Smith <stephen@missouri.edu>
Cc: freebsd-current@FreeBSD.org, David Schultz <das@FreeBSD.org>,
	Bruce Evans <bde@FreeBSD.org>,
	Steve Kargl <sgk@troutmask.apl.washington.edu>
Subject: Re: Use of C99 extra long double math functions after r236148

On Wed, 25 Jul 2012, Stephen Montgomery-Smith wrote:

> On 07/25/12 12:31, Steve Kargl wrote:
>> On Wed, Jul 25, 2012 at 12:27:43PM -0500, Stephen Montgomery-Smith wrote:
>>> Just as a point of comparison, here is the answer computed using
>>> Mathematica:
>>> 
>>> N[Exp[2], 50]
>>> 7.3890560989306502272304274605750078131803155705518
>>> 
>>> As you can see, the expl solution has only a few digits more accuracy
>>> than exp.
>> 
>> Unless you are using sparc64 hardware.
>> 
>> flame:kargl[204] ./testl -V 2
>> ULP = 0.2670 for x = 2.000000000000000000000000000000000e+00
>> mpfr exp: 7.389056098930650227230427460575008e+00
>> libm exp: 7.389056098930650227230427460575008e+00
>
> Yes.  It would be nice if long on the Intel was as long as the sparc64.

You want it to be as slow as sparc64?  (About 300 times slower, after
scaling the CPU clock rates.  Doubles on sparc64 are less than 2 times
slower.)

I forgot to mention in a previous reply that expl has only a few
more decimal digits of accuracy than exp because the extra precision
on x86 wasn't designed to give much more accuracy.  It was designed
to give more chance of full double precision accuracy in naive code.
It was designed in ~1980 when bits were expensive and the extra 11
provided by the 8087 were considered the best tradeoff between cost
and accuracy.  They only provide 2-3 extra decimal digits of accuracy.
They are best thought of as guard bits.  Floating point uses 1 or 2
guard bits internally.  11 extends that significantly and externalizes
it, but is far from doubling the number of bits.  Their use to provide
extra precision was mostly defeated in C by bad C bindings and
implementations.  This was consolidated by my not using the extra bits
for the default rounding precision in FreeBSD.  This has been further
consolidated by SSE not supporting extended precision.  Now the naive
code that uses doubles never gets the extra precision on amd64.  Mixing
of long doubles with doubles is much slower with SSE+i387 than with
i387, since the long doubles are handled in different registers and
must be translated with SSE+i387, while with i387, using long doubles
is almost free (it actually has a negative cost in non-naive code since
it avoids having to emulate extra precision in software).  Thus SSE
also inhibits using the extra precision intentionally.
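
To put a number on the guard bits, here is a minimal sketch in C.  The
function names are made up for illustration, and it assumes a binary
format (FLT_RADIX == 2).  It converts the difference in significand
widths to decimal digits: 64 - 53 = 11 bits is at most about 3.3 digits,
and rounding errors in a real expl eat into that.

```c
#include <assert.h>
#include <float.h>
#include <stdio.h>

/* log10(2): decimal digits carried by one binary digit. */
#define DIGITS_PER_BIT 0.30102999566398120

/* Extra significand bits of a wider format over a narrower one. */
int
extra_bits(int wide_mant_bits, int narrow_mant_bits)
{
	assert(wide_mant_bits >= narrow_mant_bits);
	return (wide_mant_bits - narrow_mant_bits);
}

/* The same difference expressed in decimal digits (an upper bound). */
double
extra_decimal_digits(int wide_mant_bits, int narrow_mant_bits)
{
	return (extra_bits(wide_mant_bits, narrow_mant_bits) * DIGITS_PER_BIT);
}

/*
 * Report what long double adds over double on the compiling platform.
 * LDBL_MANT_DIG is 64 for the x87 extended format; other platforms
 * use other widths, so the printed values are platform dependent.
 */
void
report_long_double_gain(void)
{
	printf("double: %d bits, long double: %d bits\n",
	    DBL_MANT_DIG, LDBL_MANT_DIG);
	printf("extra: %d bits, at most %.2f decimal digits\n",
	    extra_bits(LDBL_MANT_DIG, DBL_MANT_DIG),
	    extra_decimal_digits(LDBL_MANT_DIG, DBL_MANT_DIG));
}
```

Calling report_long_double_gain() from main() prints the figures for
whatever long double the compiler provides.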

Bruce