Date: Sat, 16 Dec 2017 17:15:01 +1100 (EST) From: Bruce Evans <brde@optusnet.com.au> To: Bruce Evans <brde@optusnet.com.au> Cc: Steve Kargl <sgk@troutmask.apl.washington.edu>, freebsd-numerics@freebsd.org Subject: Re: progress on powl. Message-ID: <20171216163632.W1413@besplex.bde.org> In-Reply-To: <20171216135205.D971@besplex.bde.org> References: <20171216013325.GA27344@troutmask.apl.washington.edu> <20171216135205.D971@besplex.bde.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On Sat, 16 Dec 2017, Bruce Evans wrote: > ... > It is suprising that P* is accurate enough for pow(). exp() > exponentiates errors. It is apparently enough to calculate y*log2(x) > = n + y' accurately. Since |y'| <= 0.5, the errors don't get > ... > > My clog*(z) has similar problems for the real part. At the end, it must > calculate log*(x) where x has extra precision. Unfortunately, only > the internal kernels of logl() and of my uncommitted version of log() > and logf() support extra precision. Since I don't want to bloat clog() > by open-coding these internals like pow() does for exp()'s internals, > I just call the log*() functions. This blows out the error to up to ~1.8 > ulps. > [... using extra precision of exp*() kernels] Also, the internal extra precision of log*() might be enough for y*log2(x) = n+y'. It is enough for plain log2*(x). logl() has 8-10 extra bits internally, so log2l() has 6-8. Except for my versions, log() and logf() barely have enough bits to lose any for log2() and log2f(). Probably not enough for pow*(). y' also needs to retain extra precision, but that is routine using the method in log2l() (it just discards extra precision at the end). The 4.4BSD pow() is almost trivial for non-exceptional args using its kernels: X /* Return exp(y*log(x)), using simulated extended */ X /* precision for the log and the multiply. */ X X s = log__D(x); X t.a = y; X TRUNC(t.a); X t.b = y - t.a; X t.b = s.b*y + t.b*s.a; X t.a *= s.a; X s.a = t.a + t.b; X s.b = (t.a - s.a) + t.b; X return (exp__D(s.a, s.b)); while the log2l() non-kernel with debugging and i386 macros removed is: X long double X log2l(long double x) X { X struct ld r; X long double hi, lo; X X k_logl(x, &r); X if (!r.lo_set) X return (r.hi); X _2sumF(r.hi, r.lo); X hi = (float)r.hi; X lo = r.lo + (r.hi - hi); X return (invln2_hi * hi + X (invln2_lo + invln2_hi) * lo + invln2_lo * hi); X } You should see _2sumF() open-coded in this pow(). _2sumF() is delicate and the open-coding is probably broken. Compiler bugs break most implementations of 2sum*() if there is extra precision. _2sumF() has complications to work around this. The complications are mostly in the instructions in the comments for using it. For double precision, the the args must be double_t, not double. The 4.4BSD library doesn't know anything about this, so its open coding is just broken on i386. It is unclear if or how fdlibm pow() avoids these bugs. We fixed a couple for exp(). math_private.h has debugging variants of _2sumF() and variants with less restrictions on use. pow() also has to do a multiplication for this. We don't have any macros like _2sumF() to help for this. In general, for real values errors for extra-precision multiplications are unimportant since there are no cancelations. Howver a + b*c might need extra precision since there might be cancelations for the addition. Not a problem for the final step of pow*(), but a problem for the y' = y*log2(x) -n step. Another problem with exp__D() and log__L() is that their API is asymmetric. log__L() returns extra precision but doesn't take it. exp__D() takes extra precision but doesn't return it. Both as needed for pow() and perhaps *gamma*(), but too specialized. General versions were too much slower in C90, but now inline functions can take and return extra precision and hope that the compiler optimizes it away when the caller supplies it as arg.lo = 0 and the callee discards it by not using result.lo. Bruce
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20171216163632.W1413>