From owner-freebsd-numerics@FreeBSD.ORG  Tue Aug 14 22:16:03 2012
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id BDAE4106564A
	for <freebsd-numerics@freebsd.org>;
	Tue, 14 Aug 2012 22:16:03 +0000 (UTC)
	(envelope-from brde@optusnet.com.au)
Received: from mail15.syd.optusnet.com.au (mail15.syd.optusnet.com.au
	[211.29.132.196])
	by mx1.freebsd.org (Postfix) with ESMTP id 5453C8FC17
	for <freebsd-numerics@freebsd.org>;
	Tue, 14 Aug 2012 22:16:02 +0000 (UTC)
Received: from c122-106-171-246.carlnfd1.nsw.optusnet.com.au
	(c122-106-171-246.carlnfd1.nsw.optusnet.com.au [122.106.171.246])
	by mail15.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id
	q7EMFqoL004566
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Wed, 15 Aug 2012 08:15:54 +1000
Date: Wed, 15 Aug 2012 08:15:52 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Steve Kargl <sgk@troutmask.apl.washington.edu>
In-Reply-To: <20120814195659.GA70571@troutmask.apl.washington.edu>
Message-ID: <20120815070807.B3431@besplex.bde.org>
References: <502A8CCC.5080606@missouri.edu>
	<20120814175257.GA69865@troutmask.apl.washington.edu>
	<20120814183518.GA70092@troutmask.apl.washington.edu>
	<502AA971.4010403@missouri.edu>
	<20120814195659.GA70571@troutmask.apl.washington.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: Stephen Montgomery-Smith <stephen@missouri.edu>,
	freebsd-numerics@freebsd.org
Subject: Re: Status of expl logl
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
	<freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 14 Aug 2012 22:16:03 -0000

On Tue, 14 Aug 2012, Steve Kargl wrote:

> On Tue, Aug 14, 2012 at 02:39:29PM -0500, Stephen Montgomery-Smith wrote:
>> On 08/14/2012 01:35 PM, Steve Kargl wrote:
>>> On Tue, Aug 14, 2012 at 10:52:57AM -0700, Steve Kargl wrote:
>>>> On Tue, Aug 14, 2012 at 12:37:16PM -0500, Stephen Montgomery-Smith wrote:
>>>>> Are people working on expl, logl and log1pl?

-rw-r--r--  1 bde  wheel  37470 Aug 10 00:08 ld128/s_logl.c
-rw-r--r--  1 bde  wheel  32946 Aug 10 00:07 ld80/s_logl.c
-rw-r--r--  1 bde  wheel  32056 Aug 10 00:06 s_log.c
-rw-r--r--  1 bde  wheel  30053 Aug 10 00:05 s_logf.c

These are mostly large comments and larger tables.  Each file implements
log, log10, log2, log1p for the given precision.  Internal accuracy is
about 7 extra bits of precision.  Speed on core2 with optimal CFLAGS is
30-70 cycles depending on the precision.

>> ...
>> So I am looking through src/e_pow.c.
>>
>> It seems to me that the constants L1, L2, L3, etc, are 3/5, 3/7, 3/9,
>> etc, but not exactly these constants.  So they must have used some
>> process where they jiggled the constants around, perhaps using trial and
>> error, to get a few extra ulp.  Is that right?

Hopefully not by trial and error.  There is the Remes algorithm.  I
use a 1000 line (plus infrastructure) pari program to implement a form
of this algorithm.  In general, for nice functions like cos through
not so nice functions like tan near 0, the best possible (minimax)
polynomial approximation seems to need about 1 term less than a Taylor
approximation, and are only about 1 bit better than a Chebyshev
approximation.

>> Also, I am trying to see what P1, P2, P3, etc are.  They seem to be
>> related to the factorial (maybe a power series related to exp(x)), but I
>> must admit that I am not getting it.

These must be for exp().  Indeed, they are identical with the Pn's in
e_exp.c.  They are essentially Bernoulli numbers.  One generating
function for Bernoulli numbers with a certain normalization is

     z/(exp(z) - 1) := sum(n = 0, Inf, B[n]/n!*z^n)

and e_exp.c transforms exp(x) to essentially x/(exp(x) - 1).  Applying
Remes to them makes Pn not quite a nice fraction.

>> Is there a more detailed reference to how these numbers were obtained?
>> A paper somewhere?

e_exp.c mentions Remes.

>> Finally, what is ovfl (in the definition of ovt) meant to be?

Something to do with exp's overflow threshold I think.  The comment
about ovt doesn't seem to match the value (I don't see why the value
is so small), but the comment is very similar to the one that I wrote
for exp's overflow threshold in ld80/s_expl.c.  (fdlibm e_exp.c only
gives the magic number for the threshold.)

> I haven't looked too closely at the details of pow[fl]().  I am
> not aware of any published paper that gives the details. AFAIK,
> the comment in e_pow.c is only detailed description (other than
> the code).  I tried to find a paper about pow() implementations
> on Sunday with a very cursory google search.  Came up empty.
>
> The L and P constants are used in lines 235 and 299,
> line 235:  r = s2*s2*(L1+s2*(L2+s2*(L3+s2*(L4+s2*(L5+s2*L6)))));
> line 299: t1  = z - t*(P1+t*(P2+t*(P3+t*(P4+t*P5))));
>
> These are polynomials that are evaluated via Horner's method.
> I suspect the jiggling that you mention is actually a result
> of a Remes minimax procedure.
>
> ovfl looks like it's used to define an overflow threshold (ie, ovt).

I hardly looked at e_pow.c before.  It is apparently half about repeating
e_log.c and e_exp.c, to get at their extra internal precision.  In old
BSD libm (which FreeBSD used briefly before fdlibm was imported), pow()
and some other functions got their extra precision from __log__D() and
__exp__D().  These functions are still in msun/bsdsrc and are used in
tgamma().  My logl() and Steve's expl() have similar extra precision
internally, and run about 4 times faster than the old BSD functions.

Bruce