From: Bruce Evans
To: Stephen Montgomery-Smith
Cc: freebsd-current@FreeBSD.org, David Schultz, Bruce Evans, Steve Kargl
Date: Thu, 26 Jul 2012 12:19:27 +1000 (EST)
Subject: Re: Use of C99 extra long double math functions after r236148
Message-ID: <20120726121848.I1093@besplex.bde.org>
In-Reply-To: <50102EF4.2080601@missouri.edu>
List-Id: Discussions about the use of FreeBSD-current

On Wed, 25 Jul 2012, Stephen Montgomery-Smith wrote:

> On 07/25/12 12:31, Steve Kargl wrote:
>> On Wed, Jul 25, 2012 at 12:27:43PM -0500, Stephen Montgomery-Smith wrote:
>>> Just as a point of comparison, here is the answer computed using
>>> Mathematica:
>>>
>>> N[Exp[2], 50]
>>> 7.3890560989306502272304274605750078131803155705518
>>>
>>> As you can see, the expl solution has only a few digits more accuracy
>>> than exp.
>>
>> Unless you are using sparc64 hardware.
>>
>> flame:kargl[204] ./testl -V 2
>> ULP = 0.2670 for x = 2.000000000000000000000000000000000e+00
>> mpfr exp: 7.389056098930650227230427460575008e+00
>> libm exp: 7.389056098930650227230427460575008e+00
>
> Yes.  It would be nice if long on the Intel was as long as the sparc64.

You want it to be as slow as sparc64?  (About 300 times slower, after
scaling the CPU clock rates.  Doubles on sparc64 are less than 2 times
slower.)

What I forgot to mention in a previous reply is that expl has only a
few more decimal digits of accuracy than exp because the extra
precision on x86 wasn't designed to give much more accuracy.  It was
designed to give more chance of full double precision accuracy in
naive code.  It was designed in ~1980 when bits were expensive, and the
extra 11 bits provided by the 8087 were considered the best tradeoff
between cost and accuracy.  They only provide 2-3 extra decimal digits
of accuracy.  They are best thought of as guard bits.  Floating point
uses 1 or 2 guard bits internally.  11 extends that significantly and
externalizes it, but is far from doubling the number of bits.

Their use to provide extra precision was mostly defeated in C by bad C
bindings and implementations.  This was consolidated by my not using
the extra bits for the default rounding precision in FreeBSD.  This has
been further consolidated by SSE not supporting extended precision.
Now the naive code that uses doubles never gets the extra precision on
amd64.
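
As a rough check of the digit counts above, here is a minimal sketch
(not from the original thread) that converts mantissa widths to
guaranteed decimal digits using the same formula C uses for DBL_DIG and
LDBL_DIG.  The helper name decimal_digits and the FORMATS table are
illustrative only; the widths assumed are 53 bits for double, 64 for the
x87 extended format, and 113 for the quad format used for long double
on sparc64.

```python
import math


def decimal_digits(mantissa_bits: int) -> int:
    """Decimal digits guaranteed to survive a round trip through a binary
    format with `mantissa_bits` bits of precision (the C *_DIG formula)."""
    return math.floor((mantissa_bits - 1) * math.log10(2))


# Precision in bits, counting the leading bit (assumed widths, see lead-in).
FORMATS = {
    "double (IEEE binary64)": 53,
    "x87 extended (long double on x86)": 64,
    "quad (long double on sparc64)": 113,
}


def report() -> dict:
    """Map each format name to its guaranteed decimal digits."""
    return {name: decimal_digits(bits) for name, bits in FORMATS.items()}


if __name__ == "__main__":
    for name, digits in report().items():
        print(f"{name}: {digits} decimal digits")
    # The 11 extra bits of the x87 format, expressed in decimal digits.
    extra = (64 - 53) * math.log10(2)
    print(f"11 extra bits are worth about {extra:.2f} decimal digits")
```

Under these assumptions the x87 format gains about 3 digits over double
at best, while the quad format roughly doubles the digit count, which
fits the 34-digit sparc64 output quoted above.
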
Mixing long doubles with doubles is much slower with SSE+i387 than
with i387 alone, since with SSE+i387 the long doubles are handled in
different registers and must be translated, while with i387 alone,
using long doubles is almost free (it actually has a negative cost in
non-naive code, since it avoids having to provide the extra precision
in software).  Thus SSE also inhibits using the extra precision
intentionally.

Bruce