Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 30 Jul 2012 13:35:05 +1000 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Stephen Montgomery-Smith <stephen@missouri.edu>
Cc:        freebsd-numerics@FreeBSD.org
Subject:   Re: Bad error computing 1/z
Message-ID:  <20120730131929.Y876@besplex.bde.org>
In-Reply-To: <5015F9E9.9070408@missouri.edu>
References:  <5015DA8A.10200@missouri.edu> <20120730025118.GB7958@server.rulingia.com> <5015F9E9.9070408@missouri.edu>

next in thread | previous in thread | raw e-mail | index | archive | help
On Sun, 29 Jul 2012, Stephen Montgomery-Smith wrote:

> On 07/29/2012 09:51 PM, Peter Jeremy wrote:
>> [Excuse the long lines]
>> 
>> On 2012-Jul-29 19:51:22 -0500, Stephen Montgomery-Smith 
>> <stephen@missouri.edu> wrote:
>>> As I was debugging catanh, I noticed the following oddness.
>>> 
>>> If z = cpack(x,y), where
>>> x = 1
>>> y = 0x1.25691d4068c910p+512, approximately 1.53672e+154
>>> then the real part of 1/z is wrong by about 4ULP.  The real part of 1/z
>>> is approx 4.2346e-309.  I know the reciprocal of this would cause an
>>> overflow.
>> ...
>> Note that since the real part is a denormal, it only has 50 bits of
>> precision.  Are you sure your ULP calculations are taking into account
>> the number of fraction bits available when numbers are denormals?
>
> You answered my question.  The real part is denormal, and so my ULP 
> calculation were misleading.

A large error is expected anyway.  The calculation involves 1/(x*x+y*y).
gcc won't be as careful as hypot(), so it will give various overflow and
underflow and accuracy errors for x*x+y*y.  After avoiding the overflow
and underflow errors, I think x*x+y*y has an accuracy error of at most
1 ulp.  The division adds another 0.5 ulps.

1/z calculated by gcc is much more accurate than the singular function
z*z :-).  The real part of the latter is x*x-y*y.  This may have an
error of almost 2**MANT_DIG ulps due to cancelation.

The C has some examples with pseudocode for some of the worst problems.
This code is too slow to actually use.  I think only hardware overflow
and underflow traps can be fast enough to use (when they don't occur
often).  Anything that uses fenv to test for exceptions after every
operation is too slow, and testing after every few operation makes it
hard to recover.  A "few" probably needs to be a few hundreds or thousands
since fenv accesses take about as long as a few tens or hundreds of
additions or multiplications on at least x86.

Bruce



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20120730131929.Y876>