From owner-freebsd-numerics@freebsd.org  Mon Apr 17 00:18:23 2017
Return-Path: <owner-freebsd-numerics@freebsd.org>
Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id B0AEAD41E55;
 Mon, 17 Apr 2017 00:18:23 +0000 (UTC)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu
 [128.95.76.21])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client CN "troutmask", Issuer "troutmask" (not verified))
 by mx1.freebsd.org (Postfix) with ESMTPS id 9D64B1EFA;
 Mon, 17 Apr 2017 00:18:23 +0000 (UTC)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: from troutmask.apl.washington.edu (localhost [127.0.0.1])
 by troutmask.apl.washington.edu (8.15.2/8.15.2) with ESMTPS id v3H0IG66010285
 (version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO);
 Sun, 16 Apr 2017 17:18:16 -0700 (PDT)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: (from sgk@localhost)
 by troutmask.apl.washington.edu (8.15.2/8.15.2/Submit) id v3H0IGwt010284;
 Sun, 16 Apr 2017 17:18:16 -0700 (PDT) (envelope-from sgk)
Date: Sun, 16 Apr 2017 17:18:16 -0700
From: Steve Kargl <sgk@troutmask.apl.washington.edu>
To: freebsd-numerics@freebsd.org, freebsd-hackers@freebsd.org,
 freebsd-standards@freebsd.org
Subject: [PATCH] Fixes for asin[fl].
Message-ID: <20170417001816.GA10277@troutmask.apl.washington.edu>
Reply-To: sgk@troutmask.apl.washington.edu
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.7.2 (2016-11-26)
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
 <freebsd-numerics.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics/>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 17 Apr 2017 00:18:23 -0000

Please commit.

* libm/msun/src/e_asin.c:
  . Use integer literal constants where possible.  This reduces diffs
    between e_asin[fl].c.
  . Remove a spurious tab in front of a single line comment.
  . Sort declaration.
  . Remove initialization of 't' in declaration statement.  It's not needed.

* libm/msun/src/e_asinf.c:
  . Use integer literal constants where possible.  This reduces diffs
    between e_asin[fl].c.
  . Remove a spurious tab in front of a single line comment.
  . Sort declaration.
  . Use sqrtf(x) instead of sqrt(x).  The additional precision in the
    result of sqrt(x) is not required.

* libm/msun/src/e_asinl.c:
  . Remove trailing space in Sunsoft copyright statement.
  . Use integer literal constants where possible.  This reduces diffs
    between e_asin[fl].c.
  . Remove a spurious tab in front of a single line comment.
  . Sort declaration.
  . Remove initialization of 't' in declaration statement.  It's not needed.

Reviewed by: md5

Index: src/e_asin.c
===================================================================
--- src/e_asin.c	(revision 1904)
+++ src/e_asin.c	(working copy)
@@ -50,12 +50,13 @@ __FBSDID("$FreeBSD: head/lib/msun/src/e_
 #include "math_private.h"
 
 static const double
-one =  1.00000000000000000000e+00, /* 0x3FF00000, 0x00000000 */
 huge =  1.000e+300,
 pio2_hi =  1.57079632679489655800e+00, /* 0x3FF921FB, 0x54442D18 */
 pio2_lo =  6.12323399573676603587e-17, /* 0x3C91A626, 0x33145C07 */
-pio4_hi =  7.85398163397448278999e-01, /* 0x3FE921FB, 0x54442D18 */
-	/* coefficient for R(x^2) */
+pio4_hi =  7.85398163397448278999e-01; /* 0x3FE921FB, 0x54442D18 */
+
+/* coefficient for R(x^2) */
+static const double
 pS0 =  1.66666666666666657415e-01, /* 0x3FC55555, 0x55555555 */
 pS1 = -3.25565818622400915405e-01, /* 0xBFD4D612, 0x03EB6F7D */
 pS2 =  2.01212532134862925881e-01, /* 0x3FC9C155, 0x0E884455 */
@@ -70,7 +71,7 @@ qS4 =  7.70381505559019352791e-02; /* 0x
 double
 __ieee754_asin(double x)
 {
-	double t=0.0,w,p,q,c,r,s;
+	double c, p, q, r, s, t, w;
 	int32_t hx,ix;
 	GET_HIGH_WORD(hx,x);
 	ix = hx&0x7fffffff;
@@ -83,30 +84,30 @@ __ieee754_asin(double x)
 	    return (x-x)/(x-x);		/* asin(|x|>1) is NaN */   
 	} else if (ix<0x3fe00000) {	/* |x|<0.5 */
 	    if(ix<0x3e500000) {		/* if |x| < 2**-26 */
-		if(huge+x>one) return x;/* return x with inexact if x!=0*/
+		if(huge+x>1) return x;/* return x with inexact if x!=0*/
 	    }
 	    t = x*x;
 	    p = t*(pS0+t*(pS1+t*(pS2+t*(pS3+t*(pS4+t*pS5)))));
-	    q = one+t*(qS1+t*(qS2+t*(qS3+t*qS4)));
+	    q = 1+t*(qS1+t*(qS2+t*(qS3+t*qS4)));
 	    w = p/q;
 	    return x+x*w;
 	}
 	/* 1> |x|>= 0.5 */
-	w = one-fabs(x);
-	t = w*0.5;
+	w = 1-fabs(x);
+	t = w/2;
 	p = t*(pS0+t*(pS1+t*(pS2+t*(pS3+t*(pS4+t*pS5)))));
-	q = one+t*(qS1+t*(qS2+t*(qS3+t*qS4)));
+	q = 1+t*(qS1+t*(qS2+t*(qS3+t*qS4)));
 	s = sqrt(t);
 	if(ix>=0x3FEF3333) { 	/* if |x| > 0.975 */
 	    w = p/q;
-	    t = pio2_hi-(2.0*(s+s*w)-pio2_lo);
+	    t = pio2_hi-(2*(s+s*w)-pio2_lo);
 	} else {
 	    w  = s;
 	    SET_LOW_WORD(w,0);
 	    c  = (t-w*w)/(s+w);
 	    r  = p/q;
-	    p  = 2.0*s*r-(pio2_lo-2.0*c);
-	    q  = pio4_hi-2.0*w;
+	    p  = 2*s*r-(pio2_lo-2*c);
+	    q  = pio4_hi-2*w;
 	    t  = pio4_hi-(p-q);
 	}    
 	if(hx>0) return t; else return -t;    
Index: src/e_asinf.c
===================================================================
--- src/e_asinf.c	(revision 1904)
+++ src/e_asinf.c	(working copy)
@@ -20,9 +20,10 @@ __FBSDID("$FreeBSD: head/lib/msun/src/e_
 #include "math_private.h"
 
 static const float
-one =  1.0000000000e+00, /* 0x3F800000 */
-huge =  1.000e+30,
-	/* coefficient for R(x^2) */
+huge =  1.000e+30;
+
+/* coefficient for R(x^2) */
+static const float
 pS0 =  1.6666586697e-01,
 pS1 = -4.2743422091e-02,
 pS2 = -8.6563630030e-03,
@@ -35,7 +36,7 @@ float
 __ieee754_asinf(float x)
 {
 	double s;
-	float t,w,p,q;
+	float p, q, t, w;
 	int32_t hx,ix;
 	GET_FLOAT_WORD(hx,x);
 	ix = hx&0x7fffffff;
@@ -45,21 +46,21 @@ __ieee754_asinf(float x)
 	    return (x-x)/(x-x);		/* asin(|x|>1) is NaN */
 	} else if (ix<0x3f000000) {	/* |x|<0.5 */
 	    if(ix<0x39800000) {		/* |x| < 2**-12 */
-		if(huge+x>one) return x;/* return x with inexact if x!=0*/
+		if(huge+x>1) return x;/* return x with inexact if x!=0*/
 	    }
 	    t = x*x;
 	    p = t*(pS0+t*(pS1+t*pS2));
-	    q = one+t*qS1;
+	    q = 1+t*qS1;
 	    w = p/q;
 	    return x+x*w;
 	}
 	/* 1> |x|>= 0.5 */
-	w = one-fabsf(x);
-	t = w*(float)0.5;
+	w = 1-fabsf(x);
+	t = w/2;
 	p = t*(pS0+t*(pS1+t*pS2));
-	q = one+t*qS1;
-	s = sqrt(t);
+	q = 1+t*qS1;
+	s = sqrtf(t);
 	w = p/q;
-	t = pio2-2.0*(s+s*w);
+	t = pio2-2*(s+s*w);
 	if(hx>0) return t; else return -t;
 }
Index: src/e_asinl.c
===================================================================
--- src/e_asinl.c	(revision 1904)
+++ src/e_asinl.c	(working copy)
@@ -1,4 +1,3 @@
-
 /* @(#)e_asin.c 1.3 95/01/18 */
 /* FreeBSD: head/lib/msun/src/e_asin.c 176451 2008-02-22 02:30:36Z das */
 /*
@@ -7,7 +6,7 @@
  *
  * Developed at SunSoft, a Sun Microsystems, Inc. business.
  * Permission to use, copy, modify, and distribute this
- * software is freely granted, provided that this notice 
+ * software is freely granted, provided that this notice
  * is preserved.
  * ====================================================
  */
@@ -27,14 +26,13 @@ __FBSDID("$FreeBSD: head/lib/msun/src/e_
 #include "math_private.h"
 
 static const long double
-one =  1.00000000000000000000e+00,
 huge = 1.000e+300;
 
 long double
 asinl(long double x)
 {
 	union IEEEl2bits u;
-	long double t=0.0,w,p,q,c,r,s;
+	long double c, p, q, r, s, t, w;
 	int16_t expsign, expt;
 	u.e = x;
 	expsign = u.xbits.expsign;
@@ -46,7 +44,7 @@ asinl(long double x)
 	    return (x-x)/(x-x);		/* asin(|x|>1) is NaN */   
 	} else if (expt<BIAS-1) {	/* |x|<0.5 */
 	    if(expt<ASIN_LINEAR) {	/* if |x| is small, asinl(x)=x */
-		if(huge+x>one) return x;/* return x with inexact if x!=0*/
+		if(huge+x>1) return x;/* return x with inexact if x!=0*/
 	    }
 	    t = x*x;
 	    p = P(t);
@@ -55,22 +53,22 @@ asinl(long double x)
 	    return x+x*w;
 	}
 	/* 1> |x|>= 0.5 */
-	w = one-fabsl(x);
-	t = w*0.5;
+	w = 1-fabsl(x);
+	t = w/2;
 	p = P(t);
 	q = Q(t);
 	s = sqrtl(t);
 	if(u.bits.manh>=THRESH) { 	/* if |x| is close to 1 */
 	    w = p/q;
-	    t = pio2_hi-(2.0*(s+s*w)-pio2_lo);
+	    t = pio2_hi-(2*(s+s*w)-pio2_lo);
 	} else {
 	    u.e = s;
 	    u.bits.manl = 0;
 	    w = u.e;
 	    c  = (t-w*w)/(s+w);
 	    r  = p/q;
-	    p  = 2.0*s*r-(pio2_lo-2.0*c);
-	    q  = pio4_hi-2.0*w;
+	    p  = 2*s*r-(pio2_lo-2*c);
+	    q  = pio4_hi-2*w;
 	    t  = pio4_hi-(p-q);
 	}    
 	if(expsign>0) return t; else return -t;    

-- 
Steve
20161221 https://www.youtube.com/watch?v=IbCHE-hONow

From owner-freebsd-numerics@freebsd.org  Mon Apr 17 04:08:12 2017
Return-Path: <owner-freebsd-numerics@freebsd.org>
Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id F1CC8D41623;
 Mon, 17 Apr 2017 04:08:11 +0000 (UTC)
 (envelope-from brde@optusnet.com.au)
Received: from mail110.syd.optusnet.com.au (mail110.syd.optusnet.com.au
 [211.29.132.97]) by mx1.freebsd.org (Postfix) with ESMTP id 767A487B;
 Mon, 17 Apr 2017 04:08:10 +0000 (UTC)
 (envelope-from brde@optusnet.com.au)
Received: from besplex.bde.org (c122-106-153-191.carlnfd1.nsw.optusnet.com.au
 [122.106.153.191])
 by mail110.syd.optusnet.com.au (Postfix) with ESMTPS id C1FE010507A;
 Mon, 17 Apr 2017 14:08:02 +1000 (AEST)
Date: Mon, 17 Apr 2017 14:07:57 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Steve Kargl <sgk@troutmask.apl.washington.edu>
cc: freebsd-numerics@freebsd.org, freebsd-hackers@freebsd.org, 
 freebsd-standards@freebsd.org
Subject: Re: [PATCH] Fixes for asin[fl].
In-Reply-To: <20170417001816.GA10277@troutmask.apl.washington.edu>
Message-ID: <20170417134317.E931@besplex.bde.org>
References: <20170417001816.GA10277@troutmask.apl.washington.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Optus-CM-Score: 0
X-Optus-CM-Analysis: v=2.2 cv=AYLBJzfG c=1 sm=1 tr=0
 a=Tj3pCpwHnMupdyZSltBt7Q==:117 a=Tj3pCpwHnMupdyZSltBt7Q==:17
 a=kj9zAlcOel0A:10 a=bSSs6WrcAUF_AD4-SooA:9 a=CjuIK1q_8ugA:10
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
 <freebsd-numerics.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics/>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 17 Apr 2017 04:08:12 -0000

On Sun, 16 Apr 2017, Steve Kargl wrote:

> Please commit.
>
> * libm/msun/src/e_asin.c:
>  . Use integer literal constants where possible.  This reduces diffs
>    between e_asin[fl].c.
>  . Remove a spurious tab in front of a single line comment.
>  . Sort declaration.
>  . Remove initialization of 't' in declaration statement.  It's not needed.

This is mostly backwards.  Don't make style changes away from fdlibm.
fdlibm only has e_asin.c, and cygnus' e_asinf.c might have gratuitous
differences, and our e_asinl.c is too different due to its hiding of
the polynomial coeeficients and evaluation in invtrig.[ch] (there are
many style bugs and small errors in the coefficients hidden in the
latter).

> Index: src/e_asin.c
> ===================================================================
> --- src/e_asin.c	(revision 1904)
> +++ src/e_asin.c	(working copy)
> @@ -50,12 +50,13 @@ __FBSDID("$FreeBSD: head/lib/msun/src/e_
> #include "math_private.h"
>
> static const double
> -one =  1.00000000000000000000e+00, /* 0x3FF00000, 0x00000000 */

Spelling 1 as "one" is standard style in fdlibm.  If fdlibm supported
float precision, then it whould do this even more, since this allows
spelling all the float constants without an F, as needed to avoid them
being double constants.

> @@ -70,7 +71,7 @@ qS4 =  7.70381505559019352791e-02; /* 0x
> double
> __ieee754_asin(double x)
> {
> -	double t=0.0,w,p,q,c,r,s;
> +	double c, p, q, r, s, t, w;
> 	int32_t hx,ix;
> 	GET_HIGH_WORD(hx,x);
> 	ix = hx&0x7fffffff;

Removing t is correct.

> @@ -83,30 +84,30 @@ __ieee754_asin(double x)
> 	    return (x-x)/(x-x);		/* asin(|x|>1) is NaN */
> 	} else if (ix<0x3fe00000) {	/* |x|<0.5 */
> 	    if(ix<0x3e500000) {		/* if |x| < 2**-26 */
> -		if(huge+x>one) return x;/* return x with inexact if x!=0*/
> +		if(huge+x>1) return x;/* return x with inexact if x!=0*/
> 	    }
> 	    t = x*x;
> 	    p = t*(pS0+t*(pS1+t*(pS2+t*(pS3+t*(pS4+t*pS5)))));
> -	    q = one+t*(qS1+t*(qS2+t*(qS3+t*qS4)));
> +	    q = 1+t*(qS1+t*(qS2+t*(qS3+t*qS4)));
> 	    w = p/q;
> 	    return x+x*w;
> 	}
> 	/* 1> |x|>= 0.5 */
> -	w = one-fabs(x);
> -	t = w*0.5;
> +	w = 1-fabs(x);
> +	t = w/2;

It was bad to spell 1 as one, but 2 as 2.

> 	p = t*(pS0+t*(pS1+t*(pS2+t*(pS3+t*(pS4+t*pS5)))));
> -	q = one+t*(qS1+t*(qS2+t*(qS3+t*qS4)));
> +	q = 1+t*(qS1+t*(qS2+t*(qS3+t*qS4)));

This still uses a slow rational function approximation and evaluates it
slowly using Horner's method.  I have only fixed this in my float version.
The polynomial has terms out to S11, and would need too many terms for
long doubles.  But so does the rational function.

> 	s = sqrt(t);
> 	if(ix>=0x3FEF3333) { 	/* if |x| > 0.975 */
> 	    w = p/q;
> -	    t = pio2_hi-(2.0*(s+s*w)-pio2_lo);
> +	    t = pio2_hi-(2*(s+s*w)-pio2_lo);
> 	} else {
> 	    w  = s;
> 	    SET_LOW_WORD(w,0);
> 	    c  = (t-w*w)/(s+w);
> 	    r  = p/q;

The rational function is not all that slow, since there is a sqrt() as
well as a division in some cases, but the division limits accuracy.  So
does the sqrt().  The float version cheats by using double for sqrt().

Bruce

From owner-freebsd-numerics@freebsd.org  Mon Apr 17 04:57:28 2017
Return-Path: <owner-freebsd-numerics@freebsd.org>
Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id B15A9D40956;
 Mon, 17 Apr 2017 04:57:28 +0000 (UTC)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu
 [128.95.76.21])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client CN "troutmask", Issuer "troutmask" (not verified))
 by mx1.freebsd.org (Postfix) with ESMTPS id 9B94E998;
 Mon, 17 Apr 2017 04:57:28 +0000 (UTC)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: from troutmask.apl.washington.edu (localhost [127.0.0.1])
 by troutmask.apl.washington.edu (8.15.2/8.15.2) with ESMTPS id v3H4vRAr011314
 (version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO);
 Sun, 16 Apr 2017 21:57:27 -0700 (PDT)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: (from sgk@localhost)
 by troutmask.apl.washington.edu (8.15.2/8.15.2/Submit) id v3H4vPFk011313;
 Sun, 16 Apr 2017 21:57:25 -0700 (PDT) (envelope-from sgk)
Date: Sun, 16 Apr 2017 21:57:25 -0700
From: Steve Kargl <sgk@troutmask.apl.washington.edu>
To: Bruce Evans <brde@optusnet.com.au>
Cc: freebsd-numerics@freebsd.org, freebsd-hackers@freebsd.org,
 freebsd-standards@freebsd.org
Subject: Re: [PATCH] Fixes for asin[fl].
Message-ID: <20170417045725.GA11263@troutmask.apl.washington.edu>
Reply-To: sgk@troutmask.apl.washington.edu
References: <20170417001816.GA10277@troutmask.apl.washington.edu>
 <20170417134317.E931@besplex.bde.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20170417134317.E931@besplex.bde.org>
User-Agent: Mutt/1.7.2 (2016-11-26)
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
 <freebsd-numerics.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics/>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 17 Apr 2017 04:57:28 -0000

On Mon, Apr 17, 2017 at 02:07:57PM +1000, Bruce Evans wrote:
> On Sun, 16 Apr 2017, Steve Kargl wrote:
> 
> > Please commit.
> >
> > * libm/msun/src/e_asin.c:
> >  . Use integer literal constants where possible.  This reduces diffs
> >    between e_asin[fl].c.
> >  . Remove a spurious tab in front of a single line comment.
> >  . Sort declaration.
> >  . Remove initialization of 't' in declaration statement.  It's not needed.
> 
> This is mostly backwards.  Don't make style changes away from fdlibm.
> fdlibm only has e_asin.c, and cygnus' e_asinf.c might have gratuitous
> differences, and our e_asinl.c is too different due to its hiding of
> the polynomial coeeficients and evaluation in invtrig.[ch] (there are
> many style bugs and small errors in the coefficients hidden in the
> latter).

fdlibm will never be updated.  So whatever fdlibm does/did 
is irrelevant.  For e_asinf.c, like most cygnus attributed
code, the conversion from double to float is, ahem, rather
questionable.  As for e_asinl.c, I have doubts to its 
suitability to ld128.

> > Index: src/e_asin.c
> > ===================================================================
> > --- src/e_asin.c	(revision 1904)
> > +++ src/e_asin.c	(working copy)
> > @@ -50,12 +50,13 @@ __FBSDID("$FreeBSD: head/lib/msun/src/e_
> > #include "math_private.h"
> >
> > static const double
> > -one =  1.00000000000000000000e+00, /* 0x3FF00000, 0x00000000 */
> 
> Spelling 1 as "one" is standard style in fdlibm.  If fdlibm supported
> float precision, then it whould do this even more, since this allows
> spelling all the float constants without an F, as needed to avoid them
> being double constants.

Most compilers for the last 20 years know how to convert 1 to
float, double, and long double.  This is in line with getting
rid of (float)0.5 type code. 


> > @@ -70,7 +71,7 @@ qS4 =  7.70381505559019352791e-02; /* 0x
> > double
> > __ieee754_asin(double x)
> > {
> > -	double t=0.0,w,p,q,c,r,s;
> > +	double c, p, q, r, s, t, w;
> > 	int32_t hx,ix;
> > 	GET_HIGH_WORD(hx,x);
> > 	ix = hx&0x7fffffff;
> 
> Removing t is correct.
> 
> > @@ -83,30 +84,30 @@ __ieee754_asin(double x)
> > 	    return (x-x)/(x-x);		/* asin(|x|>1) is NaN */
> > 	} else if (ix<0x3fe00000) {	/* |x|<0.5 */
> > 	    if(ix<0x3e500000) {		/* if |x| < 2**-26 */
> > -		if(huge+x>one) return x;/* return x with inexact if x!=0*/
> > +		if(huge+x>1) return x;/* return x with inexact if x!=0*/
> > 	    }
> > 	    t = x*x;
> > 	    p = t*(pS0+t*(pS1+t*(pS2+t*(pS3+t*(pS4+t*pS5)))));
> > -	    q = one+t*(qS1+t*(qS2+t*(qS3+t*qS4)));
> > +	    q = 1+t*(qS1+t*(qS2+t*(qS3+t*qS4)));
> > 	    w = p/q;
> > 	    return x+x*w;
> > 	}
> > 	/* 1> |x|>= 0.5 */
> > -	w = one-fabs(x);
> > -	t = w*0.5;
> > +	w = 1-fabs(x);
> > +	t = w/2;
> 
> It was bad to spell 1 as one, but 2 as 2.

?

> > 	p = t*(pS0+t*(pS1+t*(pS2+t*(pS3+t*(pS4+t*pS5)))));
> > -	q = one+t*(qS1+t*(qS2+t*(qS3+t*qS4)));
> > +	q = 1+t*(qS1+t*(qS2+t*(qS3+t*qS4)));
> 
> This still uses a slow rational function approximation and evaluates it
> slowly using Horner's method.  I have only fixed this in my float version.
> The polynomial has terms out to S11, and would need too many terms for
> long doubles.  But so does the rational function.

I'll get to that problem later.  See my sinpi[fl] inplementations.
One does not need a rational approximation (at least with sinpi).

> > 	s = sqrt(t);
> > 	if(ix>=0x3FEF3333) { 	/* if |x| > 0.975 */
> > 	    w = p/q;
> > -	    t = pio2_hi-(2.0*(s+s*w)-pio2_lo);
> > +	    t = pio2_hi-(2*(s+s*w)-pio2_lo);
> > 	} else {
> > 	    w  = s;
> > 	    SET_LOW_WORD(w,0);
> > 	    c  = (t-w*w)/(s+w);
> > 	    r  = p/q;
> 
> The rational function is not all that slow, since there is a sqrt() as
> well as a division in some cases, but the division limits accuracy.  So
> does the sqrt().  The float version cheats by using double for sqrt().

If you look at my patch, you'll see that the sqrt() in the 
float version has been replaced with sqrtf().  The float version
cheats by using double in a few places.  I think I know how to
fix that issue.

I also think that sqrt[fl] can be replaced by an in-line specialized
sqrt to avoid the overhead of sqrt[fl].

-- 
Steve
20161221 https://www.youtube.com/watch?v=IbCHE-hONow

From owner-freebsd-numerics@freebsd.org  Mon Apr 17 09:24:58 2017
Return-Path: <owner-freebsd-numerics@freebsd.org>
Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 4B081D41F8F;
 Mon, 17 Apr 2017 09:24:58 +0000 (UTC)
 (envelope-from brde@optusnet.com.au)
Received: from mail110.syd.optusnet.com.au (mail110.syd.optusnet.com.au
 [211.29.132.97]) by mx1.freebsd.org (Postfix) with ESMTP id EDE20A7D;
 Mon, 17 Apr 2017 09:24:57 +0000 (UTC)
 (envelope-from brde@optusnet.com.au)
Received: from besplex.bde.org (c122-106-153-191.carlnfd1.nsw.optusnet.com.au
 [122.106.153.191])
 by mail110.syd.optusnet.com.au (Postfix) with ESMTPS id 1C963101DD7;
 Mon, 17 Apr 2017 19:24:53 +1000 (AEST)
Date: Mon, 17 Apr 2017 19:24:50 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Steve Kargl <sgk@troutmask.apl.washington.edu>
cc: Bruce Evans <brde@optusnet.com.au>, freebsd-numerics@freebsd.org, 
 freebsd-hackers@freebsd.org, freebsd-standards@freebsd.org
Subject: Re: [PATCH] Fixes for asin[fl].
In-Reply-To: <20170417045725.GA11263@troutmask.apl.washington.edu>
Message-ID: <20170417182217.V1740@besplex.bde.org>
References: <20170417001816.GA10277@troutmask.apl.washington.edu>
 <20170417134317.E931@besplex.bde.org>
 <20170417045725.GA11263@troutmask.apl.washington.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Optus-CM-Score: 0
X-Optus-CM-Analysis: v=2.2 cv=KeqiiUQD c=1 sm=1 tr=0
 a=Tj3pCpwHnMupdyZSltBt7Q==:117 a=Tj3pCpwHnMupdyZSltBt7Q==:17
 a=kj9zAlcOel0A:10 a=-nA7janlvo-EyaNZGOsA:9 a=CjuIK1q_8ugA:10
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
 <freebsd-numerics.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics/>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 17 Apr 2017 09:24:58 -0000

On Sun, 16 Apr 2017, Steve Kargl wrote:

> On Mon, Apr 17, 2017 at 02:07:57PM +1000, Bruce Evans wrote:
>> On Sun, 16 Apr 2017, Steve Kargl wrote:
>>
>>> Please commit.
>>>
>>> * libm/msun/src/e_asin.c:
>>>  . Use integer literal constants where possible.  This reduces diffs
>>>    between e_asin[fl].c.
>>>  . Remove a spurious tab in front of a single line comment.
>>>  . Sort declaration.
>>>  . Remove initialization of 't' in declaration statement.  It's not needed.
>>
>> This is mostly backwards.  Don't make style changes away from fdlibm.
>> fdlibm only has e_asin.c, and cygnus' e_asinf.c might have gratuitous
>> differences, and our e_asinl.c is too different due to its hiding of
>> the polynomial coeeficients and evaluation in invtrig.[ch] (there are
>> many style bugs and small errors in the coefficients hidden in the
>> latter).
>
> fdlibm will never be updated.  So whatever fdlibm does/did
> is irrelevant.  For e_asinf.c, like most cygnus attributed
> code, the conversion from double to float is, ahem, rather
> questionable.  As for e_asinl.c, I have doubts to its
> suitability to ld128.

Predicting the future is not easy.  Consider my version as the
reference if not fdlibm.  Cygnus did a not so bad job in the
apparently short time available.  They didn't increase the
maintainence problem with cosmetic changes, perhaps for the smae
portability reasons.  I haven't tested e_asinl.c much.  e_atanl.c is
more of a problem.  This fails some accuracy tests even with ld80.
Otherwise, the inverse trig (non-hyperbolic) functions seem to be
easier to evaluate accurately than most functions -- errors for
float and double precision are somehow less than 1 ulp.

>>> Index: src/e_asin.c
>>> ===================================================================
>>> --- src/e_asin.c	(revision 1904)
>>> +++ src/e_asin.c	(working copy)
>>> @@ -50,12 +50,13 @@ __FBSDID("$FreeBSD: head/lib/msun/src/e_
>>> #include "math_private.h"
>>>
>>> static const double
>>> -one =  1.00000000000000000000e+00, /* 0x3FF00000, 0x00000000 */
>>
>> Spelling 1 as "one" is standard style in fdlibm.  If fdlibm supported
>> float precision, then it whould do this even more, since this allows
>> spelling all the float constants without an F, as needed to avoid them
>> being double constants.
>
> Most compilers for the last 20 years know how to convert 1 to
> float, double, and long double.  This is in line with getting
> rid of (float)0.5 type code.

I think that is my idea :-).  I only use it in new code.

>>> 	/* 1> |x|>= 0.5 */
>>> -	w = one-fabs(x);
>>> -	t = w*0.5;
>>> +	w = 1-fabs(x);
>>> +	t = w/2;
>>
>> It was bad to spell 1 as one, but 2 as 2.
>
> ?

'one' was used for some technical or stylistic reason.  Perhaps just
to ensure that the compiler doesn't mistranslate it.  Here 2 was used
inverted as 0.5.  There are even more reasons to spell this 0.5 as
'half' as is done in some places.

It takes knowing a lot about the C standard and compilers to know that
the compiler will first promote 2 to (type)2.0 with the correct type,
and then strength-reduce the multiplication to division iff that is
good.  I think the unambiguous promotion rule was new in C90.  Too
new for fdlibm a couple of years later.  The strength reduction is
only correct if either the floating point is fuzzy or if it is as
in Annex F or some other extension, and the result doesn't depend
on the method in any rounding mode or if a pragma allows fuzziness.
I think the result doesn't depend on the method in any rounding mode
in IEEE754 arithmetic.

We use this simplication in new code though it isn't portable.  Anyone
wanting portable code would have to change back to multiplying by 0.5.

>>> 	p = t*(pS0+t*(pS1+t*(pS2+t*(pS3+t*(pS4+t*pS5)))));
>>> -	q = one+t*(qS1+t*(qS2+t*(qS3+t*qS4)));
>>> +	q = 1+t*(qS1+t*(qS2+t*(qS3+t*qS4)));
>>
>> This still uses a slow rational function approximation and evaluates it
>> slowly using Horner's method.  I have only fixed this in my float version.
>> The polynomial has terms out to S11, and would need too many terms for
>> long doubles.  But so does the rational function.
>
> I'll get to that problem later.  See my sinpi[fl] inplementations.
> One does not need a rational approximation (at least with sinpi).

sin() is fundamentally different.  Its power series of course converges
everywhere, and deeper magic lets it converge fairly fast out to pi/4.
This also allows good polynomial approximations (no worse than some
leading terms of the power series).  Other functions benefit more from
using rational functions.  tan() does, and is actually implemented as
a disguised rational function.

>>> 	s = sqrt(t);
>>> 	if(ix>=0x3FEF3333) { 	/* if |x| > 0.975 */
>>> 	    w = p/q;
>>> -	    t = pio2_hi-(2.0*(s+s*w)-pio2_lo);
>>> +	    t = pio2_hi-(2*(s+s*w)-pio2_lo);
>>> 	} else {
>>> 	    w  = s;
>>> 	    SET_LOW_WORD(w,0);
>>> 	    c  = (t-w*w)/(s+w);
>>> 	    r  = p/q;
>>
>> The rational function is not all that slow, since there is a sqrt() as
>> well as a division in some cases, but the division limits accuracy.  So
>> does the sqrt().  The float version cheats by using double for sqrt().
>
> If you look at my patch, you'll see that the sqrt() in the
> float version has been replaced with sqrtf().  The float version
> cheats by using double in a few places.  I think I know how to
> fix that issue.

It's difficult without losing accuracy or efficiency.  That is why we
switched to using sqrt().

My accuracy tests:

float precision:
acos:  max_er = 0x1cbc92d0 0.8980, avg_er = 0.170, #>=1:0.5 = 0:5422147 (old)
acos:  max_er = 0x1cc99656 0.8996, avg_er = 0.170, #>=1:0.5 = 0:5584360
asin:  max_er = 0x1d9cc0b9 0.9254, avg_er = 0.013, #>=1:0.5 = 0:2357938 (old)
asin:  max_er = 0x1619e27c 0.6907, avg_er = 0.012, #>=1:0.5 = 0:3315992
atan:  max_er = 0x1b44773c 0.8521, avg_er = 0.186, #>=1:0.5 = 0:6406812 (old)
atan:  max_er = 0x1b3b6262 0.8510, avg_er = 0.186, #>=1:0.5 = 0:6759430

double precision:
acos:  max_er =      0x708 0.8789, avg_er = 0.137, #>=1:0.5 = 0:766994
asin:  max_er =      0x6ca 0.8486, avg_er = 0.003, #>=1:0.5 = 0:354026
atan:  max_er =      0x618 0.7617, avg_er = 0.140, #>=1:0.5 = 0:1142804

Here '(old)' is the committed version and the other float precision one
is the development version (5-10 years old).  The development version
doesn't use rational functions (except possibly disguised).  It is
slightly less accurate, except for asinf() it is much more accurate.
asinf() still uses sqrt(), but acosf() always used sqrt().

Note that the double precision versions are accurate enough despite not
using sqrtl().  I think they sacrifice efficiency to be so accurate.

> I also think that sqrt[fl] can be replaced by an in-line specialized
> sqrt to avoid the overhead of sqrt[fl].

No way.  sqrt[fl] is already inlined by the compiler at -O1 on most or
all arches.  It is also in libm as a function, but the function is
rarely called.  The C versions are too slow to be usable.  On Haswell
with gcc-3.3:

fdlsqrtf: cycles per call: 101-168
fdlsqrt: cycles per call: 596-680
fdlsqrtl: cycles per call: 1119-1121

Most of the time is for perfect rounding which is not needed.  Take that
out and the software sqrtl might be only 10 times slower than the hardware
version instead of 50 times slower.  It would be about as slow as cbrtl:

fdlcbrtf: cycles per call: 38-38
fdlcbrt: cycles per call: 166-166
fdlcbrtl: cycles per call: 77-81

This is with my optimized cbrtl (also cbrtf?) and committed cbrt.
Haswell handles standard optimizations especially well but does badly
with the old method in cbrt.  Someone in comp.arch optimized cbrtl
much more using methods that are too unportable for me.  It is possible
to get cbrt down to 20-30 cycles on Ivy Bridge.  I can only do this for
cbrtf using more portable methods and I still need SSE1.

Bruce