From owner-freebsd-standards@FreeBSD.ORG Thu Dec 6 03:35:40 2007
Return-Path:
Delivered-To: freebsd-standards@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
        by hub.freebsd.org (Postfix) with ESMTP id 85B1416A418;
        Thu, 6 Dec 2007 03:35:40 +0000 (UTC)
        (envelope-from brde@optusnet.com.au)
Received: from mail09.syd.optusnet.com.au (mail09.syd.optusnet.com.au [211.29.132.190])
        by mx1.freebsd.org (Postfix) with ESMTP id 1D7D613C457;
        Thu, 6 Dec 2007 03:35:39 +0000 (UTC)
        (envelope-from brde@optusnet.com.au)
Received: from c211-30-219-213.carlnfd3.nsw.optusnet.com.au (c211-30-219-213.carlnfd3.nsw.optusnet.com.au [211.30.219.213])
        by mail09.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id lB63ZZvc021773
        (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
        Thu, 6 Dec 2007 14:35:37 +1100
Date: Thu, 6 Dec 2007 14:35:35 +1100 (EST)
From: Bruce Evans
X-X-Sender: bde@delplex.bde.org
To: David Schultz
In-Reply-To: <20071205185132.GA91591@VARK.MIT.EDU>
Message-ID: <20071206141035.V10327@delplex.bde.org>
References: <20070928152227.GA39233@troutmask.apl.washington.edu>
        <20071001173736.U1985@besplex.bde.org>
        <20071002001154.GA3782@troutmask.apl.washington.edu>
        <20071002172317.GA95181@VARK.MIT.EDU>
        <20071002173237.GA12586@troutmask.apl.washington.edu>
        <20071003103519.X14175@delplex.bde.org>
        <20071010204249.GA7446@troutmask.apl.washington.edu>
        <20071203074407.GA10989@VARK.MIT.EDU>
        <20071203145105.GA16203@troutmask.apl.washington.edu>
        <20071205185132.GA91591@VARK.MIT.EDU>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-standards@freebsd.org, Steve Kargl
Subject: Re: long double broken on i386?
X-BeenThere: freebsd-standards@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Standards compliance
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 06 Dec 2007 03:35:40 -0000

On Wed, 5 Dec 2007, David Schultz wrote:

> But honestly, I've tried to wrestle with the argument reduction
> stuff before, and my advice is to not kill yourself over it. You
> need tons of extra precision and an entirely different computation
> for huge numbers, and there are other things you could spend your
> time on that would have a bigger impact.

The work for this is mostly already done.  The new work is just to
ensure correctness of data fed into and read out of a hopefully
debugged algorithm.  That must be done anyway.

> If someone tries to
> compute cosl(10^20 * pi/3) using libm, for example, they're going
> to get the wrong answer anyway. When 10^20 * pi/3 is expressed in

Why would anyone do that? :-)

> extended precision, the rounding error is more than pi, so it
> doesn't matter how accurately you compute the cosine because the
> input is totally out of phase. While it might be nice to say that
> we have accurate argument reduction and blah blah blah, but it's
> of little practical value.

The Intel ia64 library and, IIRC, LIA have functions which take an arg
in degrees (cosd(), etc?) so that arg reduction is simpler and even
huge integer args (in degrees) can be handled perfectly.  It's strange
that degrees can work better than radians.  With only functions that
take args in radians, the loss of precision goes the other way, with
naive user code doing cosd(x) := cos(x * pi / 180) and losing a lot
of accuracy even for x = 360.

> That's not to say that we need no argument reduction at all. For
> instance, cosl(5*pi/3) should still give an accurate answer.
> But when the input is so huge that the exponent is bigger than the
> number of significant bits in the mantissa, you lose so much
> precision in the input that it's not as important anymore. That's
> probably why Intel decided to make its trig instructions only work
> up to 2^63 before requiring explicit argument reduction.

They don't actually work up to 2^63.  They work up to about 2^2 or 2^4
for extended precision and up to about 2^13 or 2^15 for double
precision, since their internal approximation to pi only has 66 or 68
bits (ucbtest says 66(->68) bits; fdlibm has 3168 bits).  When Intel
decided this for i87's, they only had a budget of 25000 (?) gates and
a few hundred million dollars (?).  Their budget is larger now :-), but
for ia64 they do it all in software (like fdlibm does I think, but much
more MD and optimized, especially for arg reduction).  Thus the old bugs
of a bad internal approximation for pi and a broken indicator for when
better arg reduction is needed are fixed on ia64.

Bruce