Date: Sun, 12 Aug 2012 23:10:13 -0000 From: Bruce Evans <brde@optusnet.com.au> To: Stephen Montgomery-Smith <stephen@missouri.edu> Cc: Diane Bruce <db@db.net>, Bruce Evans <brde@optusnet.com.au>, John Baldwin <jhb@freebsd.org>, David Chisnall <theraven@freebsd.org>, Bruce Evans <bde@freebsd.org>, Steve Kargl <sgk@troutmask.apl.washington.edu>, David Schultz <das@freebsd.org>, Peter Jeremy <peter@rulingia.com>, Warner Losh <imp@bsdimp.com> Subject: Re: Use of C99 extra long double math functions after r236148 Message-ID: <20120721011112.D5008@besplex.bde.org> Resent-Message-ID: <20120812231005.GT20453@server.rulingia.com> In-Reply-To: <50095CDE.4050507@missouri.edu> References: <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> <20120718205625.GA409@troutmask.apl.washington.edu> <500725F2.7060603@missouri.edu> <20120719025345.GA1376@troutmask.apl.washington.edu> <50077987.1080307@missouri.edu> <20120719032706.GA1558@troutmask.apl.washington.edu> <5007826D.7060806@missouri.edu> <5007AD41.9070000@missouri.edu> <20120719205347.T2601@besplex.bde.org> <50084322.7020401@missouri.edu> <20120720035001.W4053@besplex.bde.org> <50085441.4090305@missouri.edu> <20120720162953.N2162@besplex.bde.org> <20120720184114.B2790@besplex.bde.org> <50095CDE.4050507@missouri.edu>
next in thread | previous in thread | raw e-mail | index | archive | help
On Fri, 20 Jul 2012, Stephen Montgomery-Smith wrote: > I worked quite a bit on my clog function last night. Could you look at this > one? Will look later. Another set of fixes for the old one now. The maximum error now seems to be about 1 ulp. I prefer it inline, but can duplicate it. % } else if (isfinite(x) && isfinite(y) && % (fabs(x) > 1e308 || fabs(y) > 1e308)) % /* % * To avoid unnecessary overflow, if x or y are very large, % * divide x and y by M_E, and then add 1 to the logarithm. % * This depends on M_E being larger than sqrt(2). % */ % return (cpack(log(hypot(x / M_E, y / M_E)) + 1, atan2(y, x))); Add 1 in the right place here. % else if (fabs(x) < 1e-50 && fabs(y) != 1 || % fabs(y) < 1e-50 && fabs(x) != 1 || % fabs(x) > 1e50 || fabs(y) > 1e50) Old modifications for |x| == 1 and |y| == 1. This needs to be expanded a bit, since cases near 1 are now handled too. % /* % * Because atan2 and hypot conform to C99, this also covers % * all the edge cases when x or y are 0 or infinite. % */ % return (cpack(log(hypot(x, y)), atan2(y, x))); % else { % /* We don't need to worry about overflow in x*x+y*y. */ % h = x * x + y * y; It is also good to avoid underflow. Underflow used to be avoided by using hypot() if x or y is tiny, but the above modification lets these cases through to here when |x| or |y| is 1. The square then underflows for values below about 1e-172. Except on i386, extra exponent range may prevent underflow, depending on optimization. To be fixed. % if (h < 0.1 || h > 10) % return (cpack(log(h) * 0.5, atan2(y, x))); Start replacing multiplications by divisions. % /* Take extra care if h is moderately close to 1 */ % else { % double ox = x; % double oy = y; I want to take absolute values and swap x and y so that 0 <= y <= x, so preserve the original values here. % x = fabs(x); % y = fabs(y); % if (x < y) { % x0 = x; % x = y; % y = x0; % } Normalize. % /* % * x0 and y0 are good approximations to x and y, but % * have their bits trimmed so that double precision % * floating point is capable of calculating x0*x0 + % * y0*y0 - 1 exactly. % */ It wasn't exact, even after fixing the hi+lo decompositions of x and y. Only the products were. The following fixes both. % x0 = (float)x; % x1 = x - x0; % y0 = (float)y; % y1 = y - y0; A good way to do the hi+lo decompositions. % if (y0 < 1e-50) { % t2 = y / x; % return (cpack(logf(x) + t2 * (t2 * 0.5), % atan2(oy, ox))); % } When y is tiny, this is necessary to handle the case x == 1. It also handles many nearby cases. Note that when y is tiny, x is fairly large, since h is fairly large. Thus t2 is also tiny. The formula is sloppy when x is not exactly 1, but works OK. t2*t2/2 must be parenthesized as above to avoid spurious underflow when t2 ~= sqrt(smallest denormal). Spurious underflow gives very large relative errors (I forget if they were ~8 ulps, ~436 ulps, or mega-ulps...). One of the main reasons for this special case is to avoid this spurious underflow. I developed this mainly for the float precision case on i386 and saw confusing behaviour when i386's extra precision sometimes avoided the underflow. % /* Notice that mathematically, h = t1*(1+t3). */ % t1 = x0 * x0; /* Exact. */ % t2 = y0 * y0; /* Exact. */ % STRICT_ASSIGN(double, t3, t1 + t2); % t2 = (t1 - t3) + t2; % t1 = t3; /* Now t1+t2 is hi+lo for x0*x0+y0*y0.*/ t1+t2 is not necessarily exact, and we must do multi-precision addition of it to get enough accuracy. Even this is not accurate enough when the final t3 is tiny. % t2 += 2 * x0 * x1 + x1 * x1 + 2 * y0 * y1 + y1 * y1; Add the rest of the low terms to t2. As before, except now t2 starts as usually nonzero to hold the low part of x0*x0+y0*y0. % t3 = t2 / t1; t2 is tiny relative to t1 (about 2**53 times smaller), so the accuracy of this is unimportant, except when t1 is near 1, in which case log() of it reduces it to near 1 and the error becomes dominated by that of t3. The special case for tiny y above is supposed to avoid reaching here in such cases, but the classification above is sloppy so it isn't clear that it does. Ww want to avoid reaching here unless t1 is a few tens or hundreds or thousands of ulps away from 1. % return (cpack((log(t1) + log1p(t3)) * 0.5, % atan2(oy, ox))); Suppose that t1 is exactly 1 and we reach here. t3 is only guaranteed to be less than about 2**-53. Even when it is 2**-205 times larger than that, we have a large error for it. One bad case is when it underflows. log1p(t3) * 0.5 gives a grouping of terms which allows t3 to underflow to 0. We can't put the factor of 0.5 inside log1p() without using a special case as above. Plain log(t1) is the same as log1p(t1 - 1) here. % } % } % } Bruce
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20120721011112.D5008>