From owner-freebsd-standards@FreeBSD.ORG Wed Oct 3 01:13:44 2007 Return-Path: Delivered-To: freebsd-standards@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E65A716A419 for ; Wed, 3 Oct 2007 01:13:43 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail16.syd.optusnet.com.au (mail16.syd.optusnet.com.au [211.29.132.197]) by mx1.freebsd.org (Postfix) with ESMTP id 8E8B413C448 for ; Wed, 3 Oct 2007 01:13:43 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from c220-239-235-248.carlnfd3.nsw.optusnet.com.au (c220-239-235-248.carlnfd3.nsw.optusnet.com.au [220.239.235.248]) by mail16.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id l931Dc02020774 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Wed, 3 Oct 2007 11:13:40 +1000 Date: Wed, 3 Oct 2007 11:13:38 +1000 (EST) From: Bruce Evans X-X-Sender: bde@delplex.bde.org To: Steve Kargl In-Reply-To: <20071002173237.GA12586@troutmask.apl.washington.edu> Message-ID: <20071003103519.X14175@delplex.bde.org> References: <20070928152227.GA39233@troutmask.apl.washington.edu> <20071001173736.U1985@besplex.bde.org> <20071002001154.GA3782@troutmask.apl.washington.edu> <20071002172317.GA95181@VARK.MIT.EDU> <20071002173237.GA12586@troutmask.apl.washington.edu> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-standards@FreeBSD.org Subject: Re: long double broken on i386? X-BeenThere: freebsd-standards@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Standards compliance List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Oct 2007 01:13:44 -0000 On Tue, 2 Oct 2007, Steve Kargl wrote: > On Tue, Oct 02, 2007 at 01:23:18PM -0400, David Schultz wrote: >> ... >> Anyway, my point is that if you have something that works >> reasonably well and doesn't have egregious errors, my suggestion >> is to just commit it and not kill yourself over whether the >> argument reduction is correct in the last ulp. > There are a few problems: 1) I don't have commit access. 2) I > need to do style(9) clean-up pass. 3) I've only tested these > on i386 and amd64, and I know these fail for sparc64. 4) Most > importantly, I don't know how the project wants to handle the > 53 vs 64 bit default fpu setting on i386. (3) is most important. The C versions will never even be used for i386 or amd64, since these arches have sin and cos in hardware and (I believe) it is impossible to beat the hardware on these arches. The asm versions have always been trivial to implement (change 2 bytes in s_sin.S (compare e_sqrt.S with e_sqrtf.S; the long double changes go the other way). amd64 actually uses the C versions for double precision becase the C versions are competitive with the hardware for speed and beat it for accuracy on large args and on small args aear a multiple of pi/2. (4): These functions are useless unless the default is changed. Users have always been able to get equivalent _in_accuracy on i386 by just using the double precision versions. Hacks like #define sinl sin almost work for just getting things to compile. In fact, on i386, due to the bugfeatures outlined in my previous mail, users can even get full long double accuracy using hacks like that, even if the rounding precision is 53 bits: %%% #include long double x: main() { x = (long double)sin(1); } %%% On i386, this stores the full long double result for sinl(1) in x, due to the bugfeatures -- sin(1) returns extra precision in a register, and gcc doesn't understand the extra precision so it just stores all bits of the register in x. The behaviour would be similar if sin(1) were used in an expression. On amd64, the ABI prevents this from working (doubles must be returned in a 64-bit XMM register). > PS: There is also the minor possibility of giving bde either > a heartache or death via laughter when he finally sees what I > have done. :) I would be unhappy with any versions that aren't either: - as identical as possible in algorithm and style to the double precision versions, in the same way that the float versions are as identical as possible, or - much better in algorithm and style and efficiency and accuracy than the double precision versions. I don't want to maintain different versions. I tried to turn your original efforts for logl() into the latter for logf(), log() and logl(), but so far they are only a little better. In particular they are not good enough to commit due to point (3) -- the long double version is only implemented for machines with 80-bit long doubles, and all versions only beat the hardware for efficiency on some machines (amd64, but not i386 with P[2-4]). I think the long double version beats the hardware for accuracy but haven't finished testing this. The hardware is much easier to beat for efficiency for the log family than for the sin family. The hardware is much easier to beat for accuracy for logl() than for anything else. The hardware is accurate enough for logf() and logl(), so is unbeatable for accuracy of these functions unless a slow version is used to get more accuracy. Bruce