From owner-freebsd-standards@FreeBSD.ORG Thu Dec 6 05:15:53 2007 Return-Path: Delivered-To: freebsd-standards@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 62C5A16A420 for ; Thu, 6 Dec 2007 05:15:53 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail07.syd.optusnet.com.au (mail07.syd.optusnet.com.au [211.29.132.188]) by mx1.freebsd.org (Postfix) with ESMTP id 338F113C46E for ; Thu, 6 Dec 2007 05:15:50 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from c211-30-219-213.carlnfd3.nsw.optusnet.com.au (c211-30-219-213.carlnfd3.nsw.optusnet.com.au [211.30.219.213]) by mail07.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id lB65FZWE024277 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Thu, 6 Dec 2007 16:15:42 +1100 Date: Thu, 6 Dec 2007 16:15:35 +1100 (EST) From: Bruce Evans X-X-Sender: bde@delplex.bde.org To: Steve Kargl In-Reply-To: <20071205203235.GA40539@troutmask.apl.washington.edu> Message-ID: <20071206151017.U10456@delplex.bde.org> References: <20070928152227.GA39233@troutmask.apl.washington.edu> <20071001173736.U1985@besplex.bde.org> <20071002001154.GA3782@troutmask.apl.washington.edu> <20071002172317.GA95181@VARK.MIT.EDU> <20071002173237.GA12586@troutmask.apl.washington.edu> <20071003103519.X14175@delplex.bde.org> <20071010204249.GA7446@troutmask.apl.washington.edu> <20071203074407.GA10989@VARK.MIT.EDU> <20071203145105.GA16203@troutmask.apl.washington.edu> <20071205185132.GA91591@VARK.MIT.EDU> <20071205203235.GA40539@troutmask.apl.washington.edu> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-standards@freebsd.org Subject: Re: long double broken on i386? X-BeenThere: freebsd-standards@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Standards compliance List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 06 Dec 2007 05:15:53 -0000 On Wed, 5 Dec 2007, Steve Kargl wrote: > On Wed, Dec 05, 2007 at 01:51:32PM -0500, David Schultz wrote: >> Having a version that works for machines with the 128-bit floating >> point format is pretty important. (These would probably be two >> separate files; I've been thinking we should add a msun/ieee96 >> directory and a msun/ieee128 directory or something.) > > This is probably a good idea if freebsd wants to maintain the > same algorithm across, say, sinf, sin, and sinl. I can produce > code for the ieee128 case, but I have no way to test it. I guarantee that it will have bugs if it is not tested :-). > An alternative may be to use table-driven implementations where > the intervals are defined by exactly representable values in ieee96 > (i.e., 64-bit significand), which are then also exactly representable > in ieee128. I haven't investigated how many intervals one would > need nor the best interpolation scheme. My best implementation of logl does this to a large extent (see your mailbox). It uses an variable interval length of ~1/128 and rounds the endpoints to float precision (endpoints are round(1 + i/128)). Full precision is still needed for the values at the endpoints, but the full-precision values are represented as sums of lower-precision values to optimize for amd64 and i386. This won't work so well for sparc64. The interval length of 1/128 gives a rounding error for the polynomial of about (1/128)^2 53-bit ULPs = 2^-14 53-bit ULPs = 2^-3 64-bit ULPs, (due to the coeff of the x^2 term possibly having a 1 53-bit ULP rounding error), and the Remez approximation method reduces this to about 2^-11 64-bit ULPs, with 113-bit precision and 53-bit coeffs the interval length needs to be more like sqrt(2^(53-113)) = 2^-30 which is too small to implement. Hopefully Remez helps even more here. The trig family (especially cos) doesn't need such a small interval as log(), since its power series converge faster and have every second term 0 and thus exactly representable. You don't want to do this initially for the trig family since it adds complexity. The Intel ia64 library uses intervals for almost everything, including the trig family. Splicing at the endpoints and relating the endpoints to multiples of pi/2 makes it more complicated for the trig family than for the (real) log family. Bruce