Date: Sun, 12 Aug 2012 23:09:59 -0000 From: Bruce Evans <brde@optusnet.com.au> To: Stephen Montgomery-Smith <stephen@missouri.edu> Cc: Diane Bruce <db@db.net>, Bruce Evans <brde@optusnet.com.au>, John Baldwin <jhb@freebsd.org>, David Chisnall <theraven@freebsd.org>, Bruce Evans <bde@freebsd.org>, Steve Kargl <sgk@troutmask.apl.washington.edu>, David Schultz <das@freebsd.org>, Peter Jeremy <peter@rulingia.com>, Warner Losh <imp@bsdimp.com> Subject: Re: Use of C99 extra long double math functions after r236148 Message-ID: <20120720162953.N2162@besplex.bde.org> Resent-Message-ID: <20120812230952.GQ20453@server.rulingia.com> In-Reply-To: <50085441.4090305@missouri.edu> References: <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> <20120718205625.GA409@troutmask.apl.washington.edu> <500725F2.7060603@missouri.edu> <20120719025345.GA1376@troutmask.apl.washington.edu> <50077987.1080307@missouri.edu> <20120719032706.GA1558@troutmask.apl.washington.edu> <5007826D.7060806@missouri.edu> <5007AD41.9070000@missouri.edu> <20120719205347.T2601@besplex.bde.org> <50084322.7020401@missouri.edu> <20120720035001.W4053@besplex.bde.org> <50085441.4090305@missouri.edu>
next in thread | previous in thread | raw e-mail | index | archive | help
On Thu, 19 Jul 2012, Stephen Montgomery-Smith wrote: First a bit from a previous reply: bde: >> WIll look more closely later. I see that you already use log1p() and much >> more. Apple clog() uses not so much more, mainly by depending on extra >> precision in hardware. The above also avoids overflow and use of hypot() >> for all finite x and and y, but is probably too simple. Now looked more closely. I put the real part of it it in my test framework as loghypot() and tested this against alternative versions like ordinary hypot(). stephen: > The Apple solution has a problem. The two invocations of log might > produce results that are nearly identical, but with opposite signs. > Think about x = y = 1/sqrt(2). Simple log(hypot()) has huge errors, but the Apple version seemed to be generally accurate and was much more accurate near |z| = 1. This pointed to some bugs in your version. These show up before any cancelation bugs in the Apple version. > I think their solution merely avoids the overflow/underflow problem, and was > not meant to address the problem I worked on. However, their solution will > fail if y=1e100 and y=1e-100. > > This caused me to realize that my solution failed to account for underflow, > so here is my next iteration. Tested this version. First I reformatted it all (the comment indentation was furthest from KNF). indent(1) did a good job and I only reformatted a couple of long lines manually: % #include <complex.h> % #include <float.h> % #include <math.h> % #include <fenv.h> % % #include "math_private.h" % % /* % * gcc doesn't implement complex multiplication or division correctly, so we % * need to handle infinities specially. We turn on this pragma to notify % * conforming c99 compilers that the fast-but-incorrect code that gcc % * generates is acceptable, since the special cases have already been % * handled. % */ % #pragma STDC CX_LIMITED_RANGE ON % % double complex % clog(double complex z) % { % double x, y, h, t1, t2, t3; % double x0, y0, x1, y1; % % x = creal(z); % y = cimag(z); % % #define NANMIX_APPLE_CLOG_COMPAT 1 % /* Handle special cases when x or y is NAN. */ % if (isnan(x)) { % if (isinf(y)) % return (cpack(INFINITY, NAN)); % else { % #if NANMIX_HYPOTF_COMPAT % y = fabs(y + 0); % t1 = (y - y) / (y - y); /* Raise invalid flag if y is % * not NaN */ % t1 = fabs(x + 0) + t1; /* Mix NaN(s). */ % #elif NANMIX_APPLE_CLOG_COMPAT % /* No actual mixing. */ % return (cpack(x, copysign(x, y))); % #else % t1 = (y - y) / (y - y); /* Raise invalid flag if y is % * not NaN */ % #endif % return (cpack(t1, t1)); % } % } else if (isnan(y)) { % if (isinf(x)) % return (cpack(INFINITY, NAN)); % else { % #ifdef NANMIX_HYPOTF_COMPAT % x = fabs(x + 0); % t1 = (x - x) / (x - x); /* Raise invalid flag if x is % * not NaN */ % t1 = t1 + fabs(y + 0); /* Mix NaN(s). */ % #elif NANMIX_APPLE_CLOG_COMPAT % /* No actual mixing. */ % return (cpack(y, y)); % #else % t1 = (x - x) / (x - x); /* Raise invalid flag if x is % * not NaN */ % #endif % return (cpack(t1, t1)); % } Here I added sqillions of ifdefs so that the results come out the same for NaNs as with log(hypot()) and with apple clog(). This is not very important, but my tests report all differences and it is hard to see important differences when there are many for NaNs. In general, the result of an operation with 1 NaN are should be the same NaN quieted, and the result of an operation with 2 NaN args should combine both args in a similar way to a hardware operation. For hypot(x, y), the result should be the same as from the naive formula sqrt(x*x+y*y) executed by the hardware. The real part of clog() should be the result of log() on this (according to the simpler rules for 1 NaN). Unfortunately, fdlibm hypot() historically first replaces x and y by |x| and |y|. fabs() clears the sign bit even for NaNs, and the result is quite different from that given by the naive formula. The above adds some fabs()'s and other tweaks for consistency with hypot(). I don't try hard to make the imaginary part consistent with atan2(). It would be simpler to just call hypot() for NaN cases. Apple clog() mostly doesn't mix pairs of NaNs. It mostly doesn't even try to apply the default conversions to a single NaN. The above adds another set of tweaks for consistency with it. % } else if (isfinite(x) && isfinite(y) && % (fabs(x) > 1e308 || fabs(y) > 1e308)) % /* % * To avoid unnecessary overflow, if x or y are very large, % * divide x and y by M_E, and then add 1 to the logarithm. % * This depends on M_E being larger than sqrt(2). % */ % return (cpack(log(hypot(x / M_E, y / M_E) + 1), atan2(y, x))); % else if (fabs(x) < 1e-50 || fabs(y) < 1e-50 || % fabs(x) > 1e50 || fabs(y) > 1e50) % /* % * Because atan2 and hypot conform to C99, this also covers % * all the edge cases when x or y are 0 or infinite. % */ % return (cpack(log(hypot(x, y)), atan2(y, x))); % else { % /* We don't need to worry about overflow in x*x+y*y. */ % h = x * x + y * y; % if (h < 0.1 || h > 10) % return (cpack(log(h) / 2, atan2(y, x))); % /* Take extra care if h is moderately close to 1 */ % else { % /* % * x0 and y0 are good approximations to x and y, but % * have their bits trimmed so that double precision % * floating point is capable of calculating x0*x0 + % * y0*y0 - 1 exactly. % */ % x0 = ldexp(floor(ldexp(x, 24)), -24); % x1 = x - x0; % y0 = ldexp(floor(ldexp(y, 24)), -24); % y1 = y - y0; This fails for example when x is near 1 and y is tiny. Say x == 1 and y == +-1e-19. The result for the real part should be about y**2/2 (almost exactly for y this tiny I think), but the above makes the following messes, since the ldexp()s don't actually trim to 24 bits: - for tiny y > 0, y times 2**24 is < 1 and floor() gives 0; x0 is 0 and x1 is x. The error has been moved to a worse place. - for tiny y < 0, y times 2**24 is > -1 and floor() gives -1; x0 is -2**-24, which is much larger in magnitude than y, and x1 is about as large to compensate. The error has been increased as well as moved. Since y is tiny, the final result was just 0 in all cases that I looked at. Apple clog() handles this case by using the y**2/2 approximation. The result of 0 instead of y**2/2 is about 9 GULPS of error, even when eveything is reduced to float precision (multiply this by 2**29 for the error in in double precision. I don't remember the SI prefixes that high). I tried simple fixes, but they made the total number of errors larger. ldexp() is slow, and the following are standard ways of trimming low bits much mure efficiently and correctly: - fdlibm mostly uses GET/EXTRACT to extract the bits, then masks low bits, then uses SET/INSERT to put the changed bits in an FP variable. (The naming differences are historical bugs.) - for double precision, most hi+lo decompositions split the double almost in half, with 26 hi bits and 27 lo bits. But often you only need 24 hi bits. Then it is usually faster to assign the double to a float. to get 24 hi bits. This requires that the double be small enough to be almost representable as a float: x0 = (float)x; x1 = x - x0; Here x0 can be float or the same type as x, or even different. On i386, there are compiler bugs which may require using STRICT_ASSIGN() here (at least if x is an expression). The compiler bugs are best avoided by working throughout with variables of type double_t (for double functions). - a more general method is to add and subtract a magic number like 0x1p40 or 0x1.8p40: x0 = x + 0x1p40 - 0x1p40; x1 = x - x0; This is usually faster, but more fragile. The magic 40 trims to 24 (+-1?) bits, provided the evaluation precision is 64 bits and there are no compiler bugs. On i386, the evaluation precision is variable but usually 53, and libm uses code like the above with magic for 53 in a couple of places. In my attempted simple fix, I used the float cast method. This failed because y = +-1e-19 is too tiny for anything good to happen. The actual y was already exactly representable in 24 bits, so y0 was y and y1 was 0... % /* Notice that mathematically, h = t1*(1+t3). */ % t1 = x0 * x0 + y0 * y0; ... t1 = 1 + 1e-38 rounds to 1, and there is no extra precision. I tried logl(hypotl()). The extra precision provided by long doubles was very far from making much difference. hypot()s methods don't seem to apply, since for tiny y hypot(1, y) ~= 1 + y**2/2 ~= 1, so the case of tiny y is especially easy for hypot(). % t2 = 2 * x0 * x1 + x1 * x1 + 2 * y0 * y1 + y1 * y1; % t3 = t2 / t1; % return (cpack((log1p(t1 - 1) + log1p(t3)) / 2, % atan2(y, x))); Isn't log1p(t1 -1) no different from log(t1)? % } % } % } I was going to leave all the fixes to you, but now want to try the approximation for tiny y :-). Bruce
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20120720162953.N2162>