From owner-freebsd-numerics@FreeBSD.ORG  Sun May 10 00:38:35 2015
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 9347EFFD
 for <freebsd-numerics@freebsd.org>; Sun, 10 May 2015 00:38:35 +0000 (UTC)
Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu
 [128.95.76.21])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client CN "troutmask", Issuer "troutmask" (not verified))
 by mx1.freebsd.org (Postfix) with ESMTPS id 7554B16D0
 for <freebsd-numerics@freebsd.org>; Sun, 10 May 2015 00:38:32 +0000 (UTC)
Received: from troutmask.apl.washington.edu (localhost [127.0.0.1])
 by troutmask.apl.washington.edu (8.14.9/8.14.9) with ESMTP id t4A0TAdA082271
 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO)
 for <freebsd-numerics@freebsd.org>; Sat, 9 May 2015 17:29:10 -0700 (PDT)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: (from sgk@localhost)
 by troutmask.apl.washington.edu (8.14.9/8.14.9/Submit) id t4A0TAAm082270
 for freebsd-numerics@freebsd.org; Sat, 9 May 2015 17:29:10 -0700 (PDT)
 (envelope-from sgk)
Date: Sat, 9 May 2015 17:29:10 -0700
From: Steve Kargl <sgk@troutmask.apl.washington.edu>
To: freebsd-numerics@freebsd.org
Subject: small cleanup patch for e_pow.c
Message-ID: <20150510002910.GA82261@troutmask.apl.washington.edu>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.5.23 (2014-03-12)
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
 <freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics/>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 10 May 2015 00:38:35 -0000

In reading, e_pow.c I found a small piece of code that
can be remove.  Anyone object?

Index: src/e_pow.c
===================================================================
--- src/e_pow.c	(revision 1603)
+++ src/e_pow.c	(working copy)
@@ -187,10 +187,6 @@ __ieee754_pow(double x, double y)
 
     /* |y| is huge */
 	if(iy>0x41e00000) { /* if |y| > 2**31 */
-	    if(iy>0x43f00000){	/* if |y| > 2**64, must o/uflow */
-		if(ix<=0x3fefffff) return (hy<0)? huge*huge:tiny*tiny;
-		if(ix>=0x3ff00000) return (hy>0)? huge*huge:tiny*tiny;
-	    }
 	/* over/underflow if x is not close to one */
 	    if(ix<0x3fefffff) return (hy<0)? s*huge*huge:s*tiny*tiny;
 	    if(ix>0x3ff00000) return (hy>0)? s*huge*huge:s*tiny*tiny;
-- 
Steve

From owner-freebsd-numerics@FreeBSD.ORG  Sun May 10 02:24:06 2015
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 841CF5E4
 for <freebsd-numerics@freebsd.org>; Sun, 10 May 2015 02:24:06 +0000 (UTC)
Received: from mail110.syd.optusnet.com.au (mail110.syd.optusnet.com.au
 [211.29.132.97]) by mx1.freebsd.org (Postfix) with ESMTP id 4A7221FA2
 for <freebsd-numerics@freebsd.org>; Sun, 10 May 2015 02:24:05 +0000 (UTC)
Received: from c211-30-166-197.carlnfd1.nsw.optusnet.com.au
 (c211-30-166-197.carlnfd1.nsw.optusnet.com.au [211.30.166.197])
 by mail110.syd.optusnet.com.au (Postfix) with ESMTPS id EC8AF784016;
 Sun, 10 May 2015 12:00:00 +1000 (AEST)
Date: Sun, 10 May 2015 11:59:55 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Steve Kargl <sgk@troutmask.apl.washington.edu>
cc: freebsd-numerics@freebsd.org
Subject: Re: small cleanup patch for e_pow.c
In-Reply-To: <20150510002910.GA82261@troutmask.apl.washington.edu>
Message-ID: <20150510113454.O841@besplex.bde.org>
References: <20150510002910.GA82261@troutmask.apl.washington.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Optus-CM-Score: 0
X-Optus-CM-Analysis: v=2.1 cv=A5NVYcmG c=1 sm=1 tr=0
 a=KA6XNC2GZCFrdESI5ZmdjQ==:117 a=PO7r1zJSAAAA:8 a=kj9zAlcOel0A:10
 a=JzwRw_2MAAAA:8 a=LUJW0xp84oLmpHL8iQMA:9 a=CjuIK1q_8ugA:10
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
 <freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics/>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 10 May 2015 02:24:06 -0000

On Sat, 9 May 2015, Steve Kargl wrote:

> In reading, e_pow.c I found a small piece of code that
> can be remove.  Anyone object?
>
> Index: src/e_pow.c
> ===================================================================
> --- src/e_pow.c	(revision 1603)
> +++ src/e_pow.c	(working copy)
> @@ -187,10 +187,6 @@ __ieee754_pow(double x, double y)
>
>     /* |y| is huge */
> 	if(iy>0x41e00000) { /* if |y| > 2**31 */
> -	    if(iy>0x43f00000){	/* if |y| > 2**64, must o/uflow */
> -		if(ix<=0x3fefffff) return (hy<0)? huge*huge:tiny*tiny;
> -		if(ix>=0x3ff00000) return (hy>0)? huge*huge:tiny*tiny;
> -	    }
> 	/* over/underflow if x is not close to one */
> 	    if(ix<0x3fefffff) return (hy<0)? s*huge*huge:s*tiny*tiny;
> 	    if(ix>0x3ff00000) return (hy>0)? s*huge*huge:s*tiny*tiny;

It seems to be just an optimization.  It is a large optimization for
the huge args, but those are not common, and is at most a tiny
pessimization for non-huge args (just an extra branch which can be
predicted perfectly if non-huge args are never used).

My tests cover huge args uniformly in float space, so they benefit from
optimizations like this more than normal programs.  However, on some
CPUs the exceptions for calculating huge*huge and tiny*tiny when they
overflow/underflow are the main source of slowness, so it doesn't help
much to avoid large code that has to be executed for normal args, unless
the latter would generate multiple exceptions.  Here if you are correct
that the above code can be removed, then the exceptions generated by the
main path can't be too complicated or it would be hard to see that there
are no spurious ones.  We can see this easily for functions like exp(),
and pow() is not very different, because the algorithm is basically
to calculate a small result and then multiply it by 2**k.  The overflows
and underflows only occur in the scaling step.

Bruce

From owner-freebsd-numerics@FreeBSD.ORG  Sun May 10 06:15:00 2015
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 7E616CDE
 for <freebsd-numerics@freebsd.org>; Sun, 10 May 2015 06:15:00 +0000 (UTC)
Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu
 [128.95.76.21])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client CN "troutmask", Issuer "troutmask" (not verified))
 by mx1.freebsd.org (Postfix) with ESMTPS id 460E9140F
 for <freebsd-numerics@freebsd.org>; Sun, 10 May 2015 06:15:00 +0000 (UTC)
Received: from troutmask.apl.washington.edu (localhost [127.0.0.1])
 by troutmask.apl.washington.edu (8.14.9/8.14.9) with ESMTP id t4A6Ex4d083030
 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO);
 Sat, 9 May 2015 23:14:59 -0700 (PDT)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: (from sgk@localhost)
 by troutmask.apl.washington.edu (8.14.9/8.14.9/Submit) id t4A6Ewj2083029;
 Sat, 9 May 2015 23:14:58 -0700 (PDT) (envelope-from sgk)
Date: Sat, 9 May 2015 23:14:58 -0700
From: Steve Kargl <sgk@troutmask.apl.washington.edu>
To: Bruce Evans <brde@optusnet.com.au>
Cc: freebsd-numerics@freebsd.org
Subject: Re: small cleanup patch for e_pow.c
Message-ID: <20150510061458.GA82518@troutmask.apl.washington.edu>
References: <20150510002910.GA82261@troutmask.apl.washington.edu>
 <20150510113454.O841@besplex.bde.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20150510113454.O841@besplex.bde.org>
User-Agent: Mutt/1.5.23 (2014-03-12)
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
 <freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics/>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 10 May 2015 06:15:00 -0000

On Sun, May 10, 2015 at 11:59:55AM +1000, Bruce Evans wrote:
> On Sat, 9 May 2015, Steve Kargl wrote:
> 
> > In reading, e_pow.c I found a small piece of code that
> > can be remove.  Anyone object?
> >
> > Index: src/e_pow.c
> > ===================================================================
> > --- src/e_pow.c	(revision 1603)
> > +++ src/e_pow.c	(working copy)
> > @@ -187,10 +187,6 @@ __ieee754_pow(double x, double y)
> >
> >     /* |y| is huge */
> > 	if(iy>0x41e00000) { /* if |y| > 2**31 */
> > -	    if(iy>0x43f00000){	/* if |y| > 2**64, must o/uflow */
> > -		if(ix<=0x3fefffff) return (hy<0)? huge*huge:tiny*tiny;
> > -		if(ix>=0x3ff00000) return (hy>0)? huge*huge:tiny*tiny;
> > -	    }
> > 	/* over/underflow if x is not close to one */
> > 	    if(ix<0x3fefffff) return (hy<0)? s*huge*huge:s*tiny*tiny;
> > 	    if(ix>0x3ff00000) return (hy>0)? s*huge*huge:s*tiny*tiny;
> 
> It seems to be just an optimization.  It is a large optimization for
> the huge args, but those are not common, and is at most a tiny
> pessimization for non-huge args (just an extra branch which can be
> predicted perfectly if non-huge args are never used).
> 

I don't see how this can be an optimization.  The code has the form

  if (|y| > 2**31) {
     if (|y| > 2**64) {
         if (a) return
         if (b) return
     }
     if (a') return
     if (b') return
     ...
   }

The difference between a and a' is <= instead of <, and similar for
b and b' with >= and >.  If either a or b would have return, so will
a' and b'.  If neither a nor b return, then neither a' nor b' will
return.  The a' and b' lines were added by das in r141296, where the
commit message says the change is for fdlibm 5.3 compatibility.

I'll also note that block like 'if (|y|>2**64) { }' is not present
in e_powf.c


-- 
Steve

From owner-freebsd-numerics@FreeBSD.ORG  Sun May 10 10:44:12 2015
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 0BF6533D
 for <freebsd-numerics@freebsd.org>; Sun, 10 May 2015 10:44:12 +0000 (UTC)
Received: from mail106.syd.optusnet.com.au (mail106.syd.optusnet.com.au
 [211.29.132.42]) by mx1.freebsd.org (Postfix) with ESMTP id C48F81E11
 for <freebsd-numerics@freebsd.org>; Sun, 10 May 2015 10:44:11 +0000 (UTC)
Received: from c211-30-166-197.carlnfd1.nsw.optusnet.com.au
 (c211-30-166-197.carlnfd1.nsw.optusnet.com.au [211.30.166.197])
 by mail106.syd.optusnet.com.au (Postfix) with ESMTPS id 7D29D3C50AE;
 Sun, 10 May 2015 20:16:15 +1000 (AEST)
Date: Sun, 10 May 2015 20:16:14 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Steve Kargl <sgk@troutmask.apl.washington.edu>
cc: freebsd-numerics@freebsd.org
Subject: Re: small cleanup patch for e_pow.c
In-Reply-To: <20150510061458.GA82518@troutmask.apl.washington.edu>
Message-ID: <20150510172810.E1812@besplex.bde.org>
References: <20150510002910.GA82261@troutmask.apl.washington.edu>
 <20150510113454.O841@besplex.bde.org>
 <20150510061458.GA82518@troutmask.apl.washington.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Optus-CM-Score: 0
X-Optus-CM-Analysis: v=2.1 cv=A5NVYcmG c=1 sm=1 tr=0
 a=KA6XNC2GZCFrdESI5ZmdjQ==:117 a=PO7r1zJSAAAA:8 a=kj9zAlcOel0A:10
 a=JzwRw_2MAAAA:8 a=1_pA0lLU-LACsFVV59MA:9 a=CjuIK1q_8ugA:10
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
 <freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics/>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 10 May 2015 10:44:12 -0000

On Sat, 9 May 2015, Steve Kargl wrote:

> On Sun, May 10, 2015 at 11:59:55AM +1000, Bruce Evans wrote:
>> On Sat, 9 May 2015, Steve Kargl wrote:
>>
>>> In reading, e_pow.c I found a small piece of code that
>>> can be remove.  Anyone object?
>>>
>>> Index: src/e_pow.c
>>> ===================================================================
>>> --- src/e_pow.c	(revision 1603)
>>> +++ src/e_pow.c	(working copy)
>>> @@ -187,10 +187,6 @@ __ieee754_pow(double x, double y)
>>>
>>>     /* |y| is huge */
>>> 	if(iy>0x41e00000) { /* if |y| > 2**31 */
>>> -	    if(iy>0x43f00000){	/* if |y| > 2**64, must o/uflow */
>>> -		if(ix<=0x3fefffff) return (hy<0)? huge*huge:tiny*tiny;
>>> -		if(ix>=0x3ff00000) return (hy>0)? huge*huge:tiny*tiny;
>>> -	    }
>>> 	/* over/underflow if x is not close to one */
>>> 	    if(ix<0x3fefffff) return (hy<0)? s*huge*huge:s*tiny*tiny;
>>> 	    if(ix>0x3ff00000) return (hy>0)? s*huge*huge:s*tiny*tiny;
>>
>> It seems to be just an optimization.  It is a large optimization for
>> the huge args, but those are not common, and is at most a tiny
>> pessimization for non-huge args (just an extra branch which can be
>> predicted perfectly if non-huge args are never used).
>
> I don't see how this can be an optimization.  The code has the form
>
>  if (|y| > 2**31) {
>     if (|y| > 2**64) {
>         if (a) return
>         if (b) return
>     }
>     if (a') return
>     if (b') return
>     ...
>   }
>
> The difference between a and a' is <= instead of <, and similar for
> b and b' with >= and >.  If either a or b would have return, so will
> a' and b'.  If neither a nor b return, then neither a' nor b' will
> return.  The a' and b' lines were added by das in r141296, where the
> commit message says the change is for fdlibm 5.3 compatibility.

Hmm, I read the comment and not the code.  According to the comment,
there are no tests a and b.  But these are apparently needed.
Obviously, we must return 1 when x is exactly 1, and it is apparently
possible for overflow/underflow to occur for values near 1 even when
y is huge.

> I'll also note that block like 'if (|y|>2**64) { }' is not present
> in e_powf.c

Apparently just another translation error.

The tests for the double precision case are probably fuzzy just for
efficiency -- we want to look at only the high word.  In float
precision, we can look at all the bits.  Since even 1 bit away from 1
is relatively large in float precision, overflow/underflow is more
likely to happen automatically.  Indeed, (1+2**-23)**(2**30) is a large
multiple of FLT_MAX although (1+2**-23)**(2**29) is a small fraction
of FLT_MAX; N >= 2**30 and above also gives underflow for (1-2**-24)**N;
N >= 2**62 gives overflow in double precision for (1+2**-52)**N, ...

So 1 is the only numbers near 1 that doesn't give overflow.

Here is the code again, with explanations:

X 	if(iy>0x41e00000) { /* if |y| > 2**31 */
X 	    if(iy>0x43f00000){	/* if |y| > 2**64, must o/uflow */

The correct fuzzy (power of 2) threshold seems to be >= 2**62.

X 		if(ix<=0x3fefffff) return (hy<0)? huge*huge:tiny*tiny;
X 		if(ix>=0x3ff00000) return (hy>0)? huge*huge:tiny*tiny;

Then we must determine on which side of 1 |x| lies. ix <= 0x3fefffff
clearly imples that |x| < 1.  Otherwise, ix >= 0x3ff00000 and |x| >= 1.
The last inequality is strict since |x| = 1 has already been handled.
The check that ix >= 0x3ff00000 is redundant and should be omitted.
The first check should be written more naturally as ix < 0x3ff00000.

In float precision, the first check can be written even more easily as
ix < 0x3f800000.

X 	    }
X 	/* over/underflow if x is not close to one */
X 	    if(ix<0x3fefffff) return (hy<0)? s*huge*huge:s*tiny*tiny;
X 	    if(ix>0x3ff00000) return (hy>0)? s*huge*huge:s*tiny*tiny;

Removing the equal signs from the inequalities makes a signficant
difference.  There are 2**32 values with ix == 0x3fefffff and another
2**32 values with ix == 0x3ff000000.  All of these are close to 1.
Since |iy| is not so large now, |x| has to be signficantly different
from 1 for overflow/underflow to occur.  It probably occurs for many
of the 2 * 2**32 values not found by these tests; the tests are just
quick fuzzy ones using ix.

In float precision, we can find the exact thresholds for overflow
and underflow.  Perhaps the Cygnus thresholds are exact.

das didn't actually add a' and b', but just changed the spelling of sn
in them:

X @@ -192,9 +191,9 @@
X  	    }
X  	/* over/underflow if x is not close to one */
X -	    if(ix<0x3fefffff) return (hy<0)? sn*huge*huge:sn*tiny*tiny;
X -	    if(ix>0x3ff00000) return (hy>0)? sn*huge*huge:sn*tiny*tiny;
X -	/* now |1-x| is tiny <= 2**-20, suffice to compute
X +	    if(ix<0x3fefffff) return (hy<0)? s*huge*huge:s*tiny*tiny;
X +	    if(ix>0x3ff00000) return (hy>0)? s*huge*huge:s*tiny*tiny;
X +	/* now |1-x| is tiny <= 2**-20, suffice to compute 
X  	   log(x) by x-x^2/2+x^3/3-x^4/4 */
X -	    t = ax-1;		/* t has 20 trailing zeros */
X +	    t = ax-one;		/* t has 20 trailing zeros */
X  	    w = (t*t)*(0.5-t*(0.3333333333333333333333-t*0.25));
X  	    u = ivln2_h*t;	/* ivln2_h has 21 sig. bits */

Bruce

From owner-freebsd-numerics@FreeBSD.ORG  Sun May 10 19:01:17 2015
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 2344E1D2
 for <freebsd-numerics@freebsd.org>; Sun, 10 May 2015 19:01:17 +0000 (UTC)
Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu
 [128.95.76.21])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client CN "troutmask", Issuer "troutmask" (not verified))
 by mx1.freebsd.org (Postfix) with ESMTPS id DD8331E70
 for <freebsd-numerics@freebsd.org>; Sun, 10 May 2015 19:01:16 +0000 (UTC)
Received: from troutmask.apl.washington.edu (localhost [127.0.0.1])
 by troutmask.apl.washington.edu (8.14.9/8.14.9) with ESMTP id t4AJ1FI4085423
 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO);
 Sun, 10 May 2015 12:01:15 -0700 (PDT)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: (from sgk@localhost)
 by troutmask.apl.washington.edu (8.14.9/8.14.9/Submit) id t4AJ1FQt085422;
 Sun, 10 May 2015 12:01:15 -0700 (PDT) (envelope-from sgk)
Date: Sun, 10 May 2015 12:01:14 -0700
From: Steve Kargl <sgk@troutmask.apl.washington.edu>
To: Bruce Evans <brde@optusnet.com.au>
Cc: freebsd-numerics@freebsd.org
Subject: Re: small cleanup patch for e_pow.c
Message-ID: <20150510190114.GA85376@troutmask.apl.washington.edu>
References: <20150510002910.GA82261@troutmask.apl.washington.edu>
 <20150510113454.O841@besplex.bde.org>
 <20150510061458.GA82518@troutmask.apl.washington.edu>
 <20150510172810.E1812@besplex.bde.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20150510172810.E1812@besplex.bde.org>
User-Agent: Mutt/1.5.23 (2014-03-12)
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
 <freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics/>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 10 May 2015 19:01:17 -0000

On Sun, May 10, 2015 at 08:16:14PM +1000, Bruce Evans wrote:
> On Sat, 9 May 2015, Steve Kargl wrote:
> >
> > I don't see how this can be an optimization.  The code has the form
> >
> >  if (|y| > 2**31) {
> >     if (|y| > 2**64) {
> >         if (a) return
> >         if (b) return
> >     }
> >     if (a') return
> >     if (b') return
> >     ...
> >   }
> >
> > The difference between a and a' is <= instead of <, and similar for
> > b and b' with >= and >.  If either a or b would have return, so will
> > a' and b'.  If neither a nor b return, then neither a' nor b' will
> > return.  The a' and b' lines were added by das in r141296, where the
> > commit message says the change is for fdlibm 5.3 compatibility.
> 
> Hmm, I read the comment and not the code.  According to the comment,
> there are no tests a and b.  But these are apparently needed.
> Obviously, we must return 1 when x is exactly 1, and it is apparently
> possible for overflow/underflow to occur for values near 1 even when
> y is huge.
> 
> > I'll also note that block like 'if (|y|>2**64) { }' is not present
> > in e_powf.c
> 
> Apparently just another translation error.
> 
> The tests for the double precision case are probably fuzzy just for
> efficiency -- we want to look at only the high word.  In float
> precision, we can look at all the bits.  Since even 1 bit away from 1
> is relatively large in float precision, overflow/underflow is more
> likely to happen automatically.  Indeed, (1+2**-23)**(2**30) is a large
> multiple of FLT_MAX although (1+2**-23)**(2**29) is a small fraction
> of FLT_MAX; N >= 2**30 and above also gives underflow for (1-2**-24)**N;
> N >= 2**62 gives overflow in double precision for (1+2**-52)**N, ...
> 
> So 1 is the only numbers near 1 that doesn't give overflow.
> 

Thanks for the explanation!  That help dislodge a mental block.

To find the magic numbers, it seems I need to consider 

(1-2**(-p))**(2**N) = 2**(emin-p)          for underflow
(1+2**(-p))**(2**N) = (1-2**(-p))*2**emax  for overflow

With p = [24, 53, 64, 113], emin = [-125, -1021, -16381, -16381],
emax = [128, 1024, 16384, 16384], and the use of log(1+z) = z for
|z| << 1, I find

underflow:  N = [30.7, 62.5, 77.5, 126.5]
overflow:   N = [30.5, 62.5, 77.5, 126.5] 

PS: Yes, I'm slowly poking away at powl().  I have the 19 special cases
working for ld80.  Now, comes the hard part.
-- 
Steve

From owner-freebsd-numerics@FreeBSD.ORG  Sun May 10 20:37:00 2015
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 268B026D
 for <freebsd-numerics@freebsd.org>; Sun, 10 May 2015 20:37:00 +0000 (UTC)
Received: from mail110.syd.optusnet.com.au (mail110.syd.optusnet.com.au
 [211.29.132.97]) by mx1.freebsd.org (Postfix) with ESMTP id DFB1517F8
 for <freebsd-numerics@freebsd.org>; Sun, 10 May 2015 20:36:59 +0000 (UTC)
Received: from c211-30-166-197.carlnfd1.nsw.optusnet.com.au
 (c211-30-166-197.carlnfd1.nsw.optusnet.com.au [211.30.166.197])
 by mail110.syd.optusnet.com.au (Postfix) with ESMTPS id 5C76478419D;
 Mon, 11 May 2015 06:36:55 +1000 (AEST)
Date: Mon, 11 May 2015 06:36:51 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Steve Kargl <sgk@troutmask.apl.washington.edu>
cc: freebsd-numerics@freebsd.org
Subject: Re: small cleanup patch for e_pow.c
In-Reply-To: <20150510190114.GA85376@troutmask.apl.washington.edu>
Message-ID: <20150511054753.L1656@besplex.bde.org>
References: <20150510002910.GA82261@troutmask.apl.washington.edu>
 <20150510113454.O841@besplex.bde.org>
 <20150510061458.GA82518@troutmask.apl.washington.edu>
 <20150510172810.E1812@besplex.bde.org>
 <20150510190114.GA85376@troutmask.apl.washington.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Optus-CM-Score: 0
X-Optus-CM-Analysis: v=2.1 cv=L/MkHYj8 c=1 sm=1 tr=0
 a=KA6XNC2GZCFrdESI5ZmdjQ==:117 a=PO7r1zJSAAAA:8 a=kj9zAlcOel0A:10
 a=JzwRw_2MAAAA:8 a=6QxUbEKU0hjXAd6ngnwA:9 a=CjuIK1q_8ugA:10
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
 <freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics/>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 10 May 2015 20:37:00 -0000

On Sun, 10 May 2015, Steve Kargl wrote:

> On Sun, May 10, 2015 at 08:16:14PM +1000, Bruce Evans wrote:
>> ...
>> So 1 is the only numbers near 1 that doesn't give overflow.
>
> Thanks for the explanation!  That help dislodge a mental block.
>
> To find the magic numbers, it seems I need to consider
>
> (1-2**(-p))**(2**N) = 2**(emin-p)          for underflow
> (1+2**(-p))**(2**N) = (1-2**(-p))*2**emax  for overflow
>
> With p = [24, 53, 64, 113], emin = [-125, -1021, -16381, -16381],
> emax = [128, 1024, 16384, 16384], and the use of log(1+z) = z for
> |z| << 1, I find
>
> underflow:  N = [30.7, 62.5, 77.5, 126.5]
> overflow:   N = [30.5, 62.5, 77.5, 126.5]

I plugged some numbers into pari to get sloppy estimates.  E.g.,
(1 + 2^-23.0)^(2^N) / 2^128; then bump up N until the result is > 1.
I get this at N = 29.5, so 30.7 seems too high.

> PS: Yes, I'm slowly poking away at powl().  I have the 19 special cases
> working for ld80.  Now, comes the hard part.

Excellent.

The only thing I would change immediately in the fdlibm implementation is
its manually inlined exp().  Use the exp*() kernel instead.  Hopefully
the kernel used for hyperbolic functions is also suitable for pow().
This kernel has only been committed for long double precision.  Look at
the old BSD libm for use of other exp() kernels and their use by pow().
It has 2 kernels: exp__D(), which is used by pow(), erf() and [t]gamma(),
and exp__E(), which is used by cosh() and expm1().  We still have
exp__D() (renamed with more underscores) in bsdsrc for use in tgamma().
A general kernel would take an extra-precision arg and return an extra-
precision result, but that would be inefficient and these kernels only
support an extra-precision arg.  I don't see how that is useful for
expm1(), and my kernel for exp*() only gives extra precision for the
result, as needed for hyperbolic functions.

I didn't notice before that old BSD libm used an exp() kernel for
cosh().  Ng (or other Sun hackers?) apparently dumbed down his
implementation of cosh() a bit for the version in fdlibm, so FreeBSD's
cosh() is probably still less accurate than the old (1985) version
:-(.  This is fixed in my version, but only the long double version
of the fixes has been committed.  The accuracy of cosh() was presumably
better than 1 ulp in 1985; it is 2 ulps (maybe 1.5) in FreeBSD, and
about 0.9 ulps in my version (the kernel for expl() is more accurate,
so coshl( has an accuracy of closer to 0.5 ulps).  OTOH, pow() may
have been improved over the 1985 version in fdlibm.  It manually inlines
kernels for both exp() and log() (named exp__D() and log__L(), and
these kernels seem to be accurate enough -- I haven't found any errors
larger than 1 ulp in pow().  The manual inlining makes the code harder
to read (about 50% larger, with many more special constants for the
kernels).

Bruce

From owner-freebsd-numerics@FreeBSD.ORG  Sun May 10 20:59:48 2015
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 7BD238D6
 for <freebsd-numerics@freebsd.org>; Sun, 10 May 2015 20:59:48 +0000 (UTC)
Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu
 [128.95.76.21])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client CN "troutmask", Issuer "troutmask" (not verified))
 by mx1.freebsd.org (Postfix) with ESMTPS id 5AF901A4B
 for <freebsd-numerics@freebsd.org>; Sun, 10 May 2015 20:59:48 +0000 (UTC)
Received: from troutmask.apl.washington.edu (localhost [127.0.0.1])
 by troutmask.apl.washington.edu (8.14.9/8.14.9) with ESMTP id t4AKxkpl085971
 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO);
 Sun, 10 May 2015 13:59:46 -0700 (PDT)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: (from sgk@localhost)
 by troutmask.apl.washington.edu (8.14.9/8.14.9/Submit) id t4AKxkQS085970;
 Sun, 10 May 2015 13:59:46 -0700 (PDT) (envelope-from sgk)
Date: Sun, 10 May 2015 13:59:46 -0700
From: Steve Kargl <sgk@troutmask.apl.washington.edu>
To: Bruce Evans <brde@optusnet.com.au>
Cc: freebsd-numerics@freebsd.org
Subject: Re: small cleanup patch for e_pow.c
Message-ID: <20150510205946.GA85935@troutmask.apl.washington.edu>
References: <20150510002910.GA82261@troutmask.apl.washington.edu>
 <20150510113454.O841@besplex.bde.org>
 <20150510061458.GA82518@troutmask.apl.washington.edu>
 <20150510172810.E1812@besplex.bde.org>
 <20150510190114.GA85376@troutmask.apl.washington.edu>
 <20150511054753.L1656@besplex.bde.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20150511054753.L1656@besplex.bde.org>
User-Agent: Mutt/1.5.23 (2014-03-12)
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
 <freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics/>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 10 May 2015 20:59:48 -0000

On Mon, May 11, 2015 at 06:36:51AM +1000, Bruce Evans wrote:
> On Sun, 10 May 2015, Steve Kargl wrote:
> 
> > On Sun, May 10, 2015 at 08:16:14PM +1000, Bruce Evans wrote:
> >> ...
> >> So 1 is the only numbers near 1 that doesn't give overflow.
> >
> > Thanks for the explanation!  That help dislodge a mental block.
> >
> > To find the magic numbers, it seems I need to consider
> >
> > (1-2**(-p))**(2**N) = 2**(emin-p)          for underflow
> > (1+2**(-p))**(2**N) = (1-2**(-p))*2**emax  for overflow
> >
> > With p = [24, 53, 64, 113], emin = [-125, -1021, -16381, -16381],
> > emax = [128, 1024, 16384, 16384], and the use of log(1+z) = z for
> > |z| << 1, I find
> >
> > underflow:  N = [30.7, 62.5, 77.5, 126.5]
> > overflow:   N = [30.5, 62.5, 77.5, 126.5]
> 
> I plugged some numbers into pari to get sloppy estimates.  E.g.,
> (1 + 2^-23.0)^(2^N) / 2^128; then bump up N until the result is > 1.
> I get this at N = 29.5, so 30.7 seems too high.
> 

I used p = 24 in my (1+-2^(-p)).  It seems that you're using p-1=23.

(1-2**(-p))**(2**N) = 2**(emin-p)

Using log() and log(1+z) = z for |z| << 1, the above equation gives

N = p + log((p-emin) * log(2)) / log(2)

where the equality is of course approximation.

p = 24 --> N = 30.690
p = 23 --> N = 29.681

-- 
Steve

From owner-freebsd-numerics@FreeBSD.ORG  Sun May 10 23:09:55 2015
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 18B28829
 for <freebsd-numerics@freebsd.org>; Sun, 10 May 2015 23:09:55 +0000 (UTC)
Received: from mail106.syd.optusnet.com.au (mail106.syd.optusnet.com.au
 [211.29.132.42]) by mx1.freebsd.org (Postfix) with ESMTP id CFA0D177D
 for <freebsd-numerics@freebsd.org>; Sun, 10 May 2015 23:09:53 +0000 (UTC)
Received: from c211-30-166-197.carlnfd1.nsw.optusnet.com.au
 (c211-30-166-197.carlnfd1.nsw.optusnet.com.au [211.30.166.197])
 by mail106.syd.optusnet.com.au (Postfix) with ESMTPS id DA60C3C72E0;
 Mon, 11 May 2015 09:09:50 +1000 (AEST)
Date: Mon, 11 May 2015 09:09:46 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Steve Kargl <sgk@troutmask.apl.washington.edu>
cc: freebsd-numerics@freebsd.org
Subject: Re: small cleanup patch for e_pow.c
In-Reply-To: <20150510205946.GA85935@troutmask.apl.washington.edu>
Message-ID: <20150511084012.R2314@besplex.bde.org>
References: <20150510002910.GA82261@troutmask.apl.washington.edu>
 <20150510113454.O841@besplex.bde.org>
 <20150510061458.GA82518@troutmask.apl.washington.edu>
 <20150510172810.E1812@besplex.bde.org>
 <20150510190114.GA85376@troutmask.apl.washington.edu>
 <20150511054753.L1656@besplex.bde.org>
 <20150510205946.GA85935@troutmask.apl.washington.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Optus-CM-Score: 0
X-Optus-CM-Analysis: v=2.1 cv=A5NVYcmG c=1 sm=1 tr=0
 a=KA6XNC2GZCFrdESI5ZmdjQ==:117 a=PO7r1zJSAAAA:8 a=kj9zAlcOel0A:10
 a=JzwRw_2MAAAA:8 a=MDOUD4c6SDqqUN_XlSwA:9 a=CjuIK1q_8ugA:10
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
 <freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics/>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 10 May 2015 23:09:55 -0000

On Sun, 10 May 2015, Steve Kargl wrote:

> On Mon, May 11, 2015 at 06:36:51AM +1000, Bruce Evans wrote:
>> On Sun, 10 May 2015, Steve Kargl wrote:
>>>
>>> To find the magic numbers, it seems I need to consider
>>>
>>> (1-2**(-p))**(2**N) = 2**(emin-p)          for underflow
>>> (1+2**(-p))**(2**N) = (1-2**(-p))*2**emax  for overflow
>>>
>>> With p = [24, 53, 64, 113], emin = [-125, -1021, -16381, -16381],
>>> emax = [128, 1024, 16384, 16384], and the use of log(1+z) = z for
>>> |z| << 1, I find
>>>
>>> underflow:  N = [30.7, 62.5, 77.5, 126.5]
>>> overflow:   N = [30.5, 62.5, 77.5, 126.5]
>>
>> I plugged some numbers into pari to get sloppy estimates.  E.g.,
>> (1 + 2^-23.0)^(2^N) / 2^128; then bump up N until the result is > 1.
>> I get this at N = 29.5, so 30.7 seems too high.
>
> I used p = 24 in my (1+-2^(-p)).  It seems that you're using p-1=23.

It should be 1+2**-(p-1) and 1-2**-p, but the 1-epsilon side dominates;
also 2**128 is sloppy with gradual underflow; in the scale factor for
the 1-epsilon side, 128 must be replaced by something like 150.  So
30.7 is correct if we don't bother to treat the sides differently.

There is a related problem with extra precision.  Expressions like
huge*huge and tiny*tiny are always wrong in the presence of extra
precision/exponent range, since they don't overflow or underflow
but just give a wrong result if extra exponent range actually works.
But extra precision/exponent is often broken:
- C11 broke it and also broke efficiency by requiring extra precision/
   exponent range to be destroyed on function return.  clang doesn't
   implement this breakage, but gcc-4.8 does (with -std=c99 but not
   with -std=gnu99)
- at least old versions of clang miscompile expressions like huge*huge
   unless huge is declared with the volatile hack.  The C11 breakage
   makes returning the value INFINITY correct and then the only error
   in evaluating this at compile time is not raising the overflow
   exception.
- at least old versons of both clang and gcc miscompile expressions
   like tiny*tiny, except...

Functions like exp() are especially affected by this bug.  cexp() is
most noticeably broken by it, since it uses expressions like
exp(x) * cos(y), written otherwise carefully by scaling exp(x) so
as to avoid spurious overflow, so some non-overflowing cases return
garbage related to huge*huge in extra precision.

We mostly don't worry about this bug, except I fixed it for cexp*().
My test don't see it, since they assign function results to a variable
without the extra precision, so as to destroy it like the C11 breakage
requires.  The bug should be fixed someday by replacing code like
'return huge*huge;' with 'return Inf_with_OE();'.  Cases where the
expression is actually x*x where x is known to be >= huge are harder
to fix.

Bruce