Skip site navigation (1)Skip section navigation (2)
Date:      Sun, 12 Aug 2012 23:09:59 -0000
From:      Bruce Evans <brde@optusnet.com.au>
To:        Stephen Montgomery-Smith <stephen@missouri.edu>
Cc:        Diane Bruce <db@db.net>, Bruce Evans <brde@optusnet.com.au>, John Baldwin <jhb@freebsd.org>, David Chisnall <theraven@freebsd.org>, Bruce Evans <bde@freebsd.org>, Steve Kargl <sgk@troutmask.apl.washington.edu>, David Schultz <das@freebsd.org>, Peter Jeremy <peter@rulingia.com>, Warner Losh <imp@bsdimp.com>
Subject:   Re: Use of C99 extra long double math functions after r236148
Message-ID:  <20120720162953.N2162@besplex.bde.org>
Resent-Message-ID: <20120812230952.GQ20453@server.rulingia.com>
In-Reply-To: <50085441.4090305@missouri.edu>
References:  <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> <20120718205625.GA409@troutmask.apl.washington.edu> <500725F2.7060603@missouri.edu> <20120719025345.GA1376@troutmask.apl.washington.edu> <50077987.1080307@missouri.edu> <20120719032706.GA1558@troutmask.apl.washington.edu> <5007826D.7060806@missouri.edu> <5007AD41.9070000@missouri.edu> <20120719205347.T2601@besplex.bde.org> <50084322.7020401@missouri.edu> <20120720035001.W4053@besplex.bde.org> <50085441.4090305@missouri.edu>

next in thread | previous in thread | raw e-mail | index | archive | help
On Thu, 19 Jul 2012, Stephen Montgomery-Smith wrote:

First a bit from a previous reply:

bde:
>> WIll look more closely later.  I see that you already use log1p() and much
>> more.  Apple clog() uses not so much more, mainly by depending on extra
>> precision in hardware.  The above also avoids overflow and use of hypot()
>> for all finite x and and y, but is probably too simple.

Now looked more closely.  I put the real part of it it in my test
framework as loghypot() and tested this against alternative versions
like ordinary hypot().

stephen:
> The Apple solution has a problem.  The two invocations of log might 
> produce results that are nearly identical, but with opposite signs. 
> Think about x = y = 1/sqrt(2).

Simple log(hypot()) has huge errors, but the Apple version seemed to be
generally accurate and was much more accurate near |z| = 1.  This pointed
to some bugs in your version.  These show up before any cancelation bugs
in the Apple version.

> I think their solution merely avoids the overflow/underflow problem, and was 
> not meant to address the problem I worked on.  However, their solution will 
> fail if y=1e100 and y=1e-100.
>
> This caused me to realize that my solution failed to account for underflow, 
> so here is my next iteration.

Tested this version.  First I reformatted it all (the comment indentation
was furthest from KNF).  indent(1) did a good job and I only reformatted
a couple of long lines manually:

% #include <complex.h>
% #include <float.h>
% #include <math.h>
% #include <fenv.h>
% 
% #include "math_private.h"
% 
% /*
%  * gcc doesn't implement complex multiplication or division correctly, so we
%  * need to handle infinities specially. We turn on this pragma to notify
%  * conforming c99 compilers that the fast-but-incorrect code that gcc
%  * generates is acceptable, since the special cases have already been
%  * handled.
%  */
% #pragma	STDC CX_LIMITED_RANGE	ON
% 
% double	complex
% clog(double complex z)
% {
% 	double x, y, h, t1, t2, t3;
% 	double x0, y0, x1, y1;
% 
% 	x = creal(z);
% 	y = cimag(z);
% 
% #define	NANMIX_APPLE_CLOG_COMPAT	1
% 	/* Handle special cases when x or y is NAN. */
% 	if (isnan(x)) {
% 		if (isinf(y))
% 			return (cpack(INFINITY, NAN));
% 		else {
% #if NANMIX_HYPOTF_COMPAT
% 			y = fabs(y + 0);
% 			t1 = (y - y) / (y - y);	/* Raise invalid flag if y is
% 						 * not NaN */
% 	       		t1 = fabs(x + 0) + t1;	/* Mix NaN(s). */
% #elif NANMIX_APPLE_CLOG_COMPAT
% 			/* No actual mixing. */
% 			return (cpack(x, copysign(x, y)));
% #else
% 			t1 = (y - y) / (y - y);	/* Raise invalid flag if y is
% 						 * not NaN */
% #endif
% 			return (cpack(t1, t1));
% 		}
% 	} else if (isnan(y)) {
% 		if (isinf(x))
% 			return (cpack(INFINITY, NAN));
% 		else {
% #ifdef NANMIX_HYPOTF_COMPAT
% 			x = fabs(x + 0);
% 			t1 = (x - x) / (x - x);	/* Raise invalid flag if x is
% 						 * not NaN */
% 	       		t1 = t1 + fabs(y + 0);	/* Mix NaN(s). */
% #elif NANMIX_APPLE_CLOG_COMPAT
% 			/* No actual mixing. */
% 			return (cpack(y, y));
% #else
% 			t1 = (x - x) / (x - x);	/* Raise invalid flag if x is
% 						 * not NaN */
% #endif
% 			return (cpack(t1, t1));
% 		}

Here I added sqillions of ifdefs so that the results come out the same
for NaNs as with log(hypot()) and with apple clog().  This is not very
important, but my tests report all differences and it is hard to see
important differences when there are many for NaNs.

In general, the result of an operation with 1 NaN are should be the
same NaN quieted, and the result of an operation with 2 NaN args should
combine both args in a similar way to a hardware operation.  For
hypot(x, y), the result should be the same as from the naive formula
sqrt(x*x+y*y) executed by the hardware.  The real part of clog() should
be the result of log() on this (according to the simpler rules for 1
NaN).

Unfortunately, fdlibm hypot() historically first replaces x and y by
|x| and |y|.  fabs() clears the sign bit even for NaNs, and the result
is quite different from that given by the naive formula.  The above
adds some fabs()'s and other tweaks for consistency with hypot().  I
don't try hard to make the imaginary part consistent with atan2().  It
would be simpler to just call hypot() for NaN cases.

Apple clog() mostly doesn't mix pairs of NaNs.  It mostly doesn't even
try to apply the default conversions to a single NaN.  The above adds
another set of tweaks for consistency with it.

% 	} else if (isfinite(x) && isfinite(y) &&
% 	    (fabs(x) > 1e308 || fabs(y) > 1e308))
% 		/*
% 		 * To avoid unnecessary overflow, if x or y are very large,
% 		 * divide x and y by M_E, and then add 1 to the logarithm.
% 		 * This depends on M_E being larger than sqrt(2).
% 		 */
% 		return (cpack(log(hypot(x / M_E, y / M_E) + 1), atan2(y, x)));
% 	else if (fabs(x) < 1e-50 || fabs(y) < 1e-50 ||
% 	    fabs(x) > 1e50 || fabs(y) > 1e50)
% 		/*
% 		 * Because atan2 and hypot conform to C99, this also covers
% 		 * all the edge cases when x or y are 0 or infinite.
% 		 */
% 		return (cpack(log(hypot(x, y)), atan2(y, x)));
% 	else {
% 		/* We don't need to worry about overflow in x*x+y*y. */
% 		h = x * x + y * y;
% 		if (h < 0.1 || h > 10)
% 			return (cpack(log(h) / 2, atan2(y, x)));
% 		/* Take extra care if h is moderately close to 1 */
% 		else {
% 			/*
% 			 * x0 and y0 are good approximations to x and y, but
% 			 * have their bits trimmed so that double precision
% 			 * floating point is capable of calculating x0*x0 +
% 			 * y0*y0 - 1 exactly.
% 			 */
% 			x0 = ldexp(floor(ldexp(x, 24)), -24);
% 			x1 = x - x0;
% 			y0 = ldexp(floor(ldexp(y, 24)), -24);
% 			y1 = y - y0;

This fails for example when x is near 1 and y is tiny.  Say x == 1 and
y == +-1e-19.  The result for the real part should be about y**2/2
(almost exactly for y this tiny I think), but the above makes the
following messes, since the ldexp()s don't actually trim to 24 bits:
- for tiny y > 0, y times 2**24 is < 1 and floor() gives 0; x0 is 0
   and x1 is x.  The error has been moved to a worse place.
- for tiny y < 0, y times 2**24 is > -1 and floor() gives -1; x0 is
   -2**-24, which is much larger in magnitude than y, and x1 is about
   as large to compensate.  The error has been increased as well as
   moved.
Since y is tiny, the final result was just 0 in all cases that I looked
at.  Apple clog() handles this case by using the y**2/2 approximation.

The result of 0 instead of y**2/2 is about 9 GULPS of error, even when
eveything is reduced to float precision (multiply this by 2**29 for the
error in in double precision.  I don't remember the SI prefixes that
high).

I tried simple fixes, but they made the total number of errors larger.

ldexp() is slow, and the following are standard ways of trimming low
bits much mure efficiently and correctly:
- fdlibm mostly uses GET/EXTRACT to extract the bits, then masks low
   bits, then uses SET/INSERT to put the changed bits in an FP variable.
   (The naming differences are historical bugs.)
- for double precision, most hi+lo decompositions split the double almost
   in half, with 26 hi bits and 27 lo bits.  But often you only need 24
   hi bits.  Then it is usually faster to assign the double to a float.
   to get 24 hi bits.  This requires that the double be small enough to
   be almost representable as a float:
 	x0 = (float)x;
 	x1 = x - x0;
   Here x0 can be float or the same type as x, or even different.  On
   i386, there are compiler bugs which may require using STRICT_ASSIGN()
   here (at least if x is an expression).  The compiler bugs are best
   avoided by working throughout with variables of type double_t (for
   double functions).
- a more general method is to add and subtract a magic number like 0x1p40
   or 0x1.8p40:
 	x0 = x + 0x1p40 - 0x1p40;
 	x1 = x - x0;
   This is usually faster, but more fragile.  The magic 40 trims to 24
   (+-1?) bits, provided the evaluation precision is 64 bits and there
   are no compiler bugs.  On i386, the evaluation precision is variable
   but usually 53, and libm uses code like the above with magic for 53
   in a couple of places.

In my attempted simple fix, I used the float cast method.  This failed
because y = +-1e-19 is too tiny for anything good to happen.  The
actual y was already exactly representable in 24 bits, so y0 was y
and y1 was 0...

% 			/* Notice that mathematically, h = t1*(1+t3). */
% 			t1 = x0 * x0 + y0 * y0;

... t1 = 1 + 1e-38 rounds to 1, and there is no extra precision.

I tried logl(hypotl()).  The extra precision provided by long doubles
was very far from making much difference.

hypot()s methods don't seem to apply, since for tiny y hypot(1, y) ~=
1 + y**2/2 ~= 1, so the case of tiny y is especially easy for hypot().

% 			t2 = 2 * x0 * x1 + x1 * x1 + 2 * y0 * y1 + y1 * y1;
% 			t3 = t2 / t1;
% 			return (cpack((log1p(t1 - 1) + log1p(t3)) / 2,
% 			    atan2(y, x)));

Isn't log1p(t1 -1) no different from log(t1)?

% 		}
% 	}
% }

I was going to leave all the fixes to you, but now want to try the
approximation for tiny y :-).

Bruce



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20120720162953.N2162>