Date: Wed, 15 Aug 2012 08:15:52 +1000 (EST) From: Bruce Evans <brde@optusnet.com.au> To: Steve Kargl <sgk@troutmask.apl.washington.edu> Cc: Stephen Montgomery-Smith <stephen@missouri.edu>, freebsd-numerics@freebsd.org Subject: Re: Status of expl logl Message-ID: <20120815070807.B3431@besplex.bde.org> In-Reply-To: <20120814195659.GA70571@troutmask.apl.washington.edu> References: <502A8CCC.5080606@missouri.edu> <20120814175257.GA69865@troutmask.apl.washington.edu> <20120814183518.GA70092@troutmask.apl.washington.edu> <502AA971.4010403@missouri.edu> <20120814195659.GA70571@troutmask.apl.washington.edu>
next in thread | previous in thread | raw e-mail | index | archive | help
On Tue, 14 Aug 2012, Steve Kargl wrote: > On Tue, Aug 14, 2012 at 02:39:29PM -0500, Stephen Montgomery-Smith wrote: >> On 08/14/2012 01:35 PM, Steve Kargl wrote: >>> On Tue, Aug 14, 2012 at 10:52:57AM -0700, Steve Kargl wrote: >>>> On Tue, Aug 14, 2012 at 12:37:16PM -0500, Stephen Montgomery-Smith wrote: >>>>> Are people working on expl, logl and log1pl? -rw-r--r-- 1 bde wheel 37470 Aug 10 00:08 ld128/s_logl.c -rw-r--r-- 1 bde wheel 32946 Aug 10 00:07 ld80/s_logl.c -rw-r--r-- 1 bde wheel 32056 Aug 10 00:06 s_log.c -rw-r--r-- 1 bde wheel 30053 Aug 10 00:05 s_logf.c These are mostly large comments and larger tables. Each file implements log, log10, log2, log1p for the given precision. Internal accuracy is about 7 extra bits of precision. Speed on core2 with optimal CFLAGS is 30-70 cycles depending on the precision. >> ... >> So I am looking through src/e_pow.c. >> >> It seems to me that the constants L1, L2, L3, etc, are 3/5, 3/7, 3/9, >> etc, but not exactly these constants. So they must have used some >> process where they jiggled the constants around, perhaps using trial and >> error, to get a few extra ulp. Is that right? Hopefully not by trial and error. There is the Remes algorithm. I use a 1000 line (plus infrastructure) pari program to implement a form of this algorithm. In general, for nice functions like cos through not so nice functions like tan near 0, the best possible (minimax) polynomial approximation seems to need about 1 term less than a Taylor approximation, and are only about 1 bit better than a Chebyshev approximation. >> Also, I am trying to see what P1, P2, P3, etc are. They seem to be >> related to the factorial (maybe a power series related to exp(x)), but I >> must admit that I am not getting it. These must be for exp(). Indeed, they are identical with the Pn's in e_exp.c. They are essentially Bernoulli numbers. One generating function for Bernoulli numbers with a certain normalization is z/(exp(z) - 1) := sum(n = 0, Inf, B[n]/n!*z^n) and e_exp.c transforms exp(x) to essentially x/(exp(x) - 1). Applying Remes to them makes Pn not quite a nice fraction. >> Is there a more detailed reference to how these numbers were obtained? >> A paper somewhere? e_exp.c mentions Remes. >> Finally, what is ovfl (in the definition of ovt) meant to be? Something to do with exp's overflow threshold I think. The comment about ovt doesn't seem to match the value (I don't see why the value is so small), but the comment is very similar to the one that I wrote for exp's overflow threshold in ld80/s_expl.c. (fdlibm e_exp.c only gives the magic number for the threshold.) > I haven't looked too closely at the details of pow[fl](). I am > not aware of any published paper that gives the details. AFAIK, > the comment in e_pow.c is only detailed description (other than > the code). I tried to find a paper about pow() implementations > on Sunday with a very cursory google search. Came up empty. > > The L and P constants are used in lines 235 and 299, > line 235: r = s2*s2*(L1+s2*(L2+s2*(L3+s2*(L4+s2*(L5+s2*L6))))); > line 299: t1 = z - t*(P1+t*(P2+t*(P3+t*(P4+t*P5)))); > > These are polynomials that are evaluated via Horner's method. > I suspect the jiggling that you mention is actually a result > of a Remes minimax procedure. > > ovfl looks like it's used to define an overflow threshold (ie, ovt). I hardly looked at e_pow.c before. It is apparently half about repeating e_log.c and e_exp.c, to get at their extra internal precision. In old BSD libm (which FreeBSD used briefly before fdlibm was imported), pow() and some other functions got their extra precision from __log__D() and __exp__D(). These functions are still in msun/bsdsrc and are used in tgamma(). My logl() and Steve's expl() have similar extra precision internally, and run about 4 times faster than the old BSD functions. Bruce
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20120815070807.B3431>