From owner-freebsd-numerics@freebsd.org Sat Dec 16 06:15:12 2017 Return-Path: Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 55E58E9EC34 for ; Sat, 16 Dec 2017 06:15:12 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail109.syd.optusnet.com.au (mail109.syd.optusnet.com.au [211.29.132.80]) by mx1.freebsd.org (Postfix) with ESMTP id CB9AB797A8 for ; Sat, 16 Dec 2017 06:15:11 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from [192.168.0.102] (c110-21-101-228.carlnfd1.nsw.optusnet.com.au [110.21.101.228]) by mail109.syd.optusnet.com.au (Postfix) with ESMTPS id 27535D6329A; Sat, 16 Dec 2017 17:15:02 +1100 (AEDT) Date: Sat, 16 Dec 2017 17:15:01 +1100 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Bruce Evans cc: Steve Kargl , freebsd-numerics@freebsd.org Subject: Re: progress on powl. In-Reply-To: <20171216135205.D971@besplex.bde.org> Message-ID: <20171216163632.W1413@besplex.bde.org> References: <20171216013325.GA27344@troutmask.apl.washington.edu> <20171216135205.D971@besplex.bde.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.2 cv=bc8baKHB c=1 sm=1 tr=0 a=PalzARQSbocsUSjMRkwAPg==:117 a=PalzARQSbocsUSjMRkwAPg==:17 a=kj9zAlcOel0A:10 a=pWyhuXww-wSbjqJyOqQA:9 a=CjuIK1q_8ugA:10 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 16 Dec 2017 06:15:12 -0000 On Sat, 16 Dec 2017, Bruce Evans wrote: > ... > It is suprising that P* is accurate enough for pow(). exp() > exponentiates errors. It is apparently enough to calculate y*log2(x) > = n + y' accurately. Since |y'| <= 0.5, the errors don't get > ... > > My clog*(z) has similar problems for the real part. At the end, it must > calculate log*(x) where x has extra precision. Unfortunately, only > the internal kernels of logl() and of my uncommitted version of log() > and logf() support extra precision. Since I don't want to bloat clog() > by open-coding these internals like pow() does for exp()'s internals, > I just call the log*() functions. This blows out the error to up to ~1.8 > ulps. > [... using extra precision of exp*() kernels] Also, the internal extra precision of log*() might be enough for y*log2(x) = n+y'. It is enough for plain log2*(x). logl() has 8-10 extra bits internally, so log2l() has 6-8. Except for my versions, log() and logf() barely have enough bits to lose any for log2() and log2f(). Probably not enough for pow*(). y' also needs to retain extra precision, but that is routine using the method in log2l() (it just discards extra precision at the end). The 4.4BSD pow() is almost trivial for non-exceptional args using its kernels: X /* Return exp(y*log(x)), using simulated extended */ X /* precision for the log and the multiply. */ X X s = log__D(x); X t.a = y; X TRUNC(t.a); X t.b = y - t.a; X t.b = s.b*y + t.b*s.a; X t.a *= s.a; X s.a = t.a + t.b; X s.b = (t.a - s.a) + t.b; X return (exp__D(s.a, s.b)); while the log2l() non-kernel with debugging and i386 macros removed is: X long double X log2l(long double x) X { X struct ld r; X long double hi, lo; X X k_logl(x, &r); X if (!r.lo_set) X return (r.hi); X _2sumF(r.hi, r.lo); X hi = (float)r.hi; X lo = r.lo + (r.hi - hi); X return (invln2_hi * hi + X (invln2_lo + invln2_hi) * lo + invln2_lo * hi); X } You should see _2sumF() open-coded in this pow(). _2sumF() is delicate and the open-coding is probably broken. Compiler bugs break most implementations of 2sum*() if there is extra precision. _2sumF() has complications to work around this. The complications are mostly in the instructions in the comments for using it. For double precision, the the args must be double_t, not double. The 4.4BSD library doesn't know anything about this, so its open coding is just broken on i386. It is unclear if or how fdlibm pow() avoids these bugs. We fixed a couple for exp(). math_private.h has debugging variants of _2sumF() and variants with less restrictions on use. pow() also has to do a multiplication for this. We don't have any macros like _2sumF() to help for this. In general, for real values errors for extra-precision multiplications are unimportant since there are no cancelations. Howver a + b*c might need extra precision since there might be cancelations for the addition. Not a problem for the final step of pow*(), but a problem for the y' = y*log2(x) -n step. Another problem with exp__D() and log__L() is that their API is asymmetric. log__L() returns extra precision but doesn't take it. exp__D() takes extra precision but doesn't return it. Both as needed for pow() and perhaps *gamma*(), but too specialized. General versions were too much slower in C90, but now inline functions can take and return extra precision and hope that the compiler optimizes it away when the caller supplies it as arg.lo = 0 and the callee discards it by not using result.lo. Bruce