From owner-freebsd-numerics@FreeBSD.ORG  Sun Sep 16 02:06:31 2012
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 4D0D0106564A
	for <freebsd-numerics@freebsd.org>;
	Sun, 16 Sep 2012 02:06:31 +0000 (UTC)
	(envelope-from stephen@missouri.edu)
Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu
	[128.206.184.213])
	by mx1.freebsd.org (Postfix) with ESMTP id 07CB78FC12
	for <freebsd-numerics@freebsd.org>;
	Sun, 16 Sep 2012 02:06:30 +0000 (UTC)
Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213])
	by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id
	q8G26S1B091225; Sat, 15 Sep 2012 21:06:29 -0500 (CDT)
	(envelope-from stephen@missouri.edu)
Message-ID: <50553424.2080902@missouri.edu>
Date: Sat, 15 Sep 2012 21:06:28 -0500
From: Stephen Montgomery-Smith <stephen@missouri.edu>
User-Agent: Mozilla/5.0 (X11; Linux i686;
	rv:15.0) Gecko/20120827 Thunderbird/15.0
MIME-Version: 1.0
To: Bruce Evans <brde@optusnet.com.au>
References: <5017111E.6060003@missouri.edu>
	<20120814003614.H3692@besplex.bde.org>
	<50295F5C.6010800@missouri.edu>
	<20120814072946.S5260@besplex.bde.org>
	<50297CA5.5010900@missouri.edu> <50297E43.7090309@missouri.edu>
	<20120814201105.T934@besplex.bde.org>
	<502A780B.2010106@missouri.edu>
	<20120815223631.N1751@besplex.bde.org>
	<502C0CF8.8040003@missouri.edu>
	<20120906221028.O1542@besplex.bde.org>
	<5048D00B.8010401@missouri.edu> <504D3CCD.2050006@missouri.edu>
	<504FF726.9060001@missouri.edu>
	<20120912191556.F1078@besplex.bde.org>
	<20120912225847.J1771@besplex.bde.org>
	<50511B40.3070009@missouri.edu>
	<20120913204808.T1964@besplex.bde.org>
	<5051F59C.6000603@missouri.edu>
	<20120914014208.I2862@besplex.bde.org>
	<50526050.2070303@missouri.edu>
	<20120914212403.H1983@besplex.bde.org>
	<50538E28.6050400@missouri.edu>
	<20120915231032.C2669@besplex.bde.org>
	<50548E15.3010405@missouri.edu> <5054C027.2040008@missouri.edu>
	<5054C200.7090307@missouri.edu>
	<20120916041132.D6344@besplex.bde.org>
In-Reply-To: <20120916041132.D6344@besplex.bde.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-numerics@freebsd.org
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
	<freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 16 Sep 2012 02:06:31 -0000

One more thing I would like an opinion on.

In my code I check for |z| being small, and then use the approximations:
casinh(z) = z
cacos(z) = Pi - z
catanh(z) = z

However these approximations are not used in the papers by Hull et al, 
and the code works just fine if I don't include these in the code.

The only reason I put this code in is because I thought it would go a 
little faster in the cases that |z| is small.  Checking |z| is small 
takes no time at all.

So what do you think?  Should I keep these in the code or not?

Thanks, Stephen


From owner-freebsd-numerics@FreeBSD.ORG  Sun Sep 16 04:42:19 2012
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id CC31D106564A
	for <freebsd-numerics@freebsd.org>;
	Sun, 16 Sep 2012 04:42:18 +0000 (UTC)
	(envelope-from stephen@missouri.edu)
Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu
	[128.206.184.213])
	by mx1.freebsd.org (Postfix) with ESMTP id 4AA558FC08
	for <freebsd-numerics@freebsd.org>;
	Sun, 16 Sep 2012 04:42:17 +0000 (UTC)
Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213])
	by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id
	q8G4gGmn001583 for <freebsd-numerics@freebsd.org>;
	Sat, 15 Sep 2012 23:42:17 -0500 (CDT)
	(envelope-from stephen@missouri.edu)
Message-ID: <505558A8.6040600@missouri.edu>
Date: Sat, 15 Sep 2012 23:42:16 -0500
From: Stephen Montgomery-Smith <stephen@missouri.edu>
User-Agent: Mozilla/5.0 (X11; Linux i686;
	rv:15.0) Gecko/20120827 Thunderbird/15.0
MIME-Version: 1.0
To: freebsd-numerics@freebsd.org
References: <5017111E.6060003@missouri.edu> <50295F5C.6010800@missouri.edu>
	<20120814072946.S5260@besplex.bde.org>
	<50297CA5.5010900@missouri.edu> <50297E43.7090309@missouri.edu>
	<20120814201105.T934@besplex.bde.org>
	<502A780B.2010106@missouri.edu>
	<20120815223631.N1751@besplex.bde.org>
	<502C0CF8.8040003@missouri.edu>
	<20120906221028.O1542@besplex.bde.org>
	<5048D00B.8010401@missouri.edu> <504D3CCD.2050006@missouri.edu>
	<504FF726.9060001@missouri.edu>
	<20120912191556.F1078@besplex.bde.org>
	<20120912225847.J1771@besplex.bde.org>
	<50511B40.3070009@missouri.edu>
	<20120913204808.T1964@besplex.bde.org>
	<5051F59C.6000603@missouri.edu>
	<20120914014208.I2862@besplex.bde.org>
	<50526050.2070303@missouri.edu>
	<20120914212403.H1983@besplex.bde.org>
	<50538E28.6050400@missouri.edu>
	<20120915231032.C2669@besplex.bde.org>
	<50548E15.3010405@missouri.edu> <5054C027.2040008@missouri.edu>
	<5054C200.7090307@missouri.edu>
	<20120916041132.D6344@besplex.bde.org>
	<50553424.2080902@missouri.edu>
In-Reply-To: <50553424.2080902@missouri.edu>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
	<freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 16 Sep 2012 04:42:19 -0000

Hey guys - I have a piece of code like this:

if (ax < DBL_EPSILON && ay < DBL_EPSILON)
	if ((int)ax==0 && (int)ay==0) { /* raise inexact */
		if (sy == 0)
			return (cpack(m_pi_2 - x, copysign(ay, -1)));
		return (cpack(m_pi_2 - x, ay));
	}

Is there a good reason I didn't code it like this?

if (ax < DBL_EPSILON && ay < DBL_EPSILON)
	if ((int)ax==0 && (int)ay==0) /* raise inexact */
		return (cpack(m_pi_2 - x, -y));


I'm trying to remember if I coded it the second way, and one of you told 
me to code it the first way.  Or maybe I came up with the first way 
myself - maybe I wasn't sure what would happen if y was 0 or -0.

Thanks, Stephen

From owner-freebsd-numerics@FreeBSD.ORG  Sun Sep 16 05:14:54 2012
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id BC6BA1065672
	for <freebsd-numerics@FreeBSD.org>;
	Sun, 16 Sep 2012 05:14:54 +0000 (UTC)
	(envelope-from brde@optusnet.com.au)
Received: from mail07.syd.optusnet.com.au (mail07.syd.optusnet.com.au
	[211.29.132.188])
	by mx1.freebsd.org (Postfix) with ESMTP id 34BF58FC08
	for <freebsd-numerics@FreeBSD.org>;
	Sun, 16 Sep 2012 05:14:53 +0000 (UTC)
Received: from c122-106-157-84.carlnfd1.nsw.optusnet.com.au
	(c122-106-157-84.carlnfd1.nsw.optusnet.com.au [122.106.157.84])
	by mail07.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id
	q8G5EoHD024205
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Sun, 16 Sep 2012 15:14:52 +1000
Date: Sun, 16 Sep 2012 15:14:50 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Stephen Montgomery-Smith <stephen@missouri.edu>
In-Reply-To: <50553424.2080902@missouri.edu>
Message-ID: <20120916134730.Y957@besplex.bde.org>
References: <5017111E.6060003@missouri.edu>
	<20120814072946.S5260@besplex.bde.org>
	<50297CA5.5010900@missouri.edu> <50297E43.7090309@missouri.edu>
	<20120814201105.T934@besplex.bde.org> <502A780B.2010106@missouri.edu>
	<20120815223631.N1751@besplex.bde.org> <502C0CF8.8040003@missouri.edu>
	<20120906221028.O1542@besplex.bde.org> <5048D00B.8010401@missouri.edu>
	<504D3CCD.2050006@missouri.edu> <504FF726.9060001@missouri.edu>
	<20120912191556.F1078@besplex.bde.org>
	<20120912225847.J1771@besplex.bde.org>
	<50511B40.3070009@missouri.edu> <20120913204808.T1964@besplex.bde.org>
	<5051F59C.6000603@missouri.edu> <20120914014208.I2862@besplex.bde.org>
	<50526050.2070303@missouri.edu> <20120914212403.H1983@besplex.bde.org>
	<50538E28.6050400@missouri.edu> <20120915231032.C2669@besplex.bde.org>
	<50548E15.3010405@missouri.edu> <5054C027.2040008@missouri.edu>
	<5054C200.7090307@missouri.edu> <20120916041132.D6344@besplex.bde.org>
	<50553424.2080902@missouri.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-numerics@FreeBSD.org
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
	<freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 16 Sep 2012 05:14:55 -0000

On Sat, 15 Sep 2012, Stephen Montgomery-Smith wrote:

> One more thing I would like an opinion on.
>
> In my code I check for |z| being small, and then use the approximations:
> casinh(z) = z
> cacos(z) = Pi - z

Actually Pi/2 - z.

> catanh(z) = z
>
> However these approximations are not used in the papers by Hull et al, and 
> the code works just fine if I don't include these in the code.

Probably a bug in the papers.

casinh(z) = formula(z) would probably spuriously underflow for small z.
Avoiding underflow in the formula would probably reduce to returning z
with special code to raise inexact.  A formula like casinh(z) = z - z**3/6
would raise inexact as a side effect, at least for the same small z that
it underflows, but would also raise underflow and possibly denormal.  A
formula like z * (1 - z**2/6) would avoid underflow in more cases but
would probably be slower and less accurate when both are valid.

cacos(z) = Pi/2 - z - z**3/6 should be parenthesized as Pi/2 - (z + z**3/6)
for accuracy.  This gives the same underflow problem for the parenthesized
part.  You should actually use more like Pi/2 than Pi/2 - z.  See below.

The corresponding real functions in fdlibm of course use the approximations,
with a threshold higher than needed to avoid underflow so as to get a
free optimization when the approximation applies.

Similarly for any function represented by a power series about 0:

     f(z) = <zero terms> + f<nth deriv>(0) * z**n/n! + o(z**n)

The first term would underflow for small z.  When the first term is a
nonzero constant, it won't underflow, but higher terms would, and
higher terms should be added to each other first for accuracy.  For
expansion about z0 != 0, (z - z0) won't underflow, but the first term
involving it might.

C didn't support FP exceptions when the paper was written, but fdlibm did.

I just noticed minor bugs and pessimizations in your code for Pi/2 - z:
from cacos():

% 	if (ax < DBL_EPSILON && ay < DBL_EPSILON)
% 		if ((int)ax==0 && (int)ay==0) { /* raise inexact */
% 			if (sy == 0)
% 				return (cpack(m_pi_2 - x, copysign(ay, -1)));
% 			return (cpack(m_pi_2 - x, ay));
% 		}

(1) The real result of m_pi_2 is inexact even when z = 0, so inexact should
     be raised in all cases and the tricky extra code to avoid setting it
     when z = 0 is just a bug.

(2) At least if ax is a little smaller than DBL_EPSILON and the rounding mode
     is to nearest, m_pi_2 - x is just m_pi_2.  I think subtracting x raises
     underflow, but inexact is already raised for x != 0 in another way.

(3) The other way is slower, so subtracting x should be preferred.

(4) The corresponding fdlibm code for real acos() essentially adds a
     constant to Pi/2 instead of x.  It is 'return pio2_hi + pio2_lo;'
     where pio2_lo is volatile so that the addition is hopefully done
     at run time.  This gives subtle differences in the result in nonstandard
     rounding modes.  Mostly we don't support nonstandard rounding modes,
     but this method is better for them.  Your method is sensitive to the
     sign of x, but should not be.  With perfect rounding in all modes,
     the result should be Pi/2 rounded according to the mode, and not
     depend on the sign of x.  I don't know if the fdlibm constants are
     magic enough for this to work in all modes.  Normally, a 'hi' term
     is the result rounded to nearest in the ambient precision, and the
     'lo' term is the residual (rounded to nearest...), but here we
     want the final rounding to depend on the mode and it isn't clear
     that this can be expressed with a pair of constants each rounded in
     a single mode.

(5) fdlibm real acos() uses a threshold of DBL_EPSILON / 32 (2**-57) for
     returning pio2_hi + pio2_lo, I think just because it isn't clear where
     the exact threshold for this approximation being valid is.  The general
     formula works (doesn't underflow) for DBL_EPSILON / 32 <= |x| < 0.5,
     since it is a rational approximation written in a form that doesn't
     involve any terms smaller than Const*x**2.  Raising x to a higher
     power requires more care.  I happen to have rewritten this
     approximation in the float case to use a polynomial written more
     efficiently using higher powers, and just noticed that I wasn't
     careful enough.  I have an 11th power, and in float precision
     the threshold is 2**-26 and raising 2**-26 to just the 5th power
     underflows in float precision.

Complex acos() still has to avoid underflow in the code following
the above when only one of ax and ay is small, so perhaps a special
case for this isn't actually optimal.

Bruce

From owner-freebsd-numerics@FreeBSD.ORG  Sun Sep 16 08:23:22 2012
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id C51F8106564A
	for <freebsd-numerics@freebsd.org>;
	Sun, 16 Sep 2012 08:23:22 +0000 (UTC)
	(envelope-from brde@optusnet.com.au)
Received: from mail30.syd.optusnet.com.au (mail30.syd.optusnet.com.au
	[211.29.133.193])
	by mx1.freebsd.org (Postfix) with ESMTP id 3CA538FC0A
	for <freebsd-numerics@freebsd.org>;
	Sun, 16 Sep 2012 08:23:21 +0000 (UTC)
Received: from c122-106-157-84.carlnfd1.nsw.optusnet.com.au
	(c122-106-157-84.carlnfd1.nsw.optusnet.com.au [122.106.157.84])
	by mail30.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id
	q8G8N64n000482
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Sun, 16 Sep 2012 18:23:08 +1000
Date: Sun, 16 Sep 2012 18:23:06 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Stephen Montgomery-Smith <stephen@missouri.edu>
In-Reply-To: <505558A8.6040600@missouri.edu>
Message-ID: <20120916174306.H1527@besplex.bde.org>
References: <5017111E.6060003@missouri.edu> <50297CA5.5010900@missouri.edu>
	<50297E43.7090309@missouri.edu> <20120814201105.T934@besplex.bde.org>
	<502A780B.2010106@missouri.edu> <20120815223631.N1751@besplex.bde.org>
	<502C0CF8.8040003@missouri.edu> <20120906221028.O1542@besplex.bde.org>
	<5048D00B.8010401@missouri.edu> <504D3CCD.2050006@missouri.edu>
	<504FF726.9060001@missouri.edu> <20120912191556.F1078@besplex.bde.org>
	<20120912225847.J1771@besplex.bde.org> <50511B40.3070009@missouri.edu>
	<20120913204808.T1964@besplex.bde.org> <5051F59C.6000603@missouri.edu>
	<20120914014208.I2862@besplex.bde.org> <50526050.2070303@missouri.edu>
	<20120914212403.H1983@besplex.bde.org> <50538E28.6050400@missouri.edu>
	<20120915231032.C2669@besplex.bde.org> <50548E15.3010405@missouri.edu>
	<5054C027.2040008@missouri.edu> <5054C200.7090307@missouri.edu>
	<20120916041132.D6344@besplex.bde.org> <50553424.2080902@missouri.edu>
	<505558A8.6040600@missouri.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-numerics@freebsd.org
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
	<freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 16 Sep 2012 08:23:23 -0000

On Sat, 15 Sep 2012, Stephen Montgomery-Smith wrote:

> Hey guys - I have a piece of code like this:
>
> if (ax < DBL_EPSILON && ay < DBL_EPSILON)
> 	if ((int)ax==0 && (int)ay==0) { /* raise inexact */
> 		if (sy == 0)
> 			return (cpack(m_pi_2 - x, copysign(ay, -1)));
> 		return (cpack(m_pi_2 - x, ay));
> 	}
>
> Is there a good reason I didn't code it like this?
>
> if (ax < DBL_EPSILON && ay < DBL_EPSILON)
> 	if ((int)ax==0 && (int)ay==0) /* raise inexact */
> 		return (cpack(m_pi_2 - x, -y));
>
>
> I'm trying to remember if I coded it the second way, and one of you told me 
> to code it the first way.  Or maybe I came up with the first way myself - 
> maybe I wasn't sure what would happen if y was 0 or -0.

I can only think of [fear of] -y not working right on +-0.
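
For reference, a small sketch of why plain negation is safe here, assuming
IEEE 754 arithmetic (where unary minus flips the sign bit, including on
zeros).  Both function names are hypothetical; the first mirrors the
two-branch form quoted above and the second the proposed -y form:

```c
#include <math.h>

/* Two-branch form: copysign(ay, -1) when y is non-negative, else ay. */
static double
neg_imag_branches(double y)
{
	double ay = fabs(y);
	int sy = signbit(y);

	if (sy == 0)
		return (copysign(ay, -1));
	return (ay);
}

/* Proposed form: plain negation, which also maps +0 to -0 and -0 to +0. */
static double
neg_imag_plain(double y)
{
	return (-y);
}
```

For every non-NaN y, including +0 and -0, the two forms should return the
same value with the same sign bit.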

Combined with previous optimizations and fixes, this gives:

 	if (ax < DBL_EPSILON && ay < DBL_EPSILON)
 		return (cpack(m_pi_2 + tiny, -y)); /* PI/2 with inexact...*/

cacos(0 + I*NaN) and several cases for catanh() should similarly add to
m_pi_2 to raise inexact when they return a part with an inexact PI/2.
Otherwise, catrig*.c is remarkably careful about raising inexact.

Refinement: be more careful with the rounding direction (as in fdlibm?):
(1) make sure that m_pi_2 is PI/2 rounded down for the above use (but
     round to nearest for other uses).  Or maybe, if rounding to nearest
     happens to round up, use m_pi_2 - tiny instead of m_pi_2 + tiny so
     that the runtime rounding goes in the right direction in hopefully
     all rounding modes.
(2) add (or subtract) more than `tiny' to m_pi_2 if necessary to bump it
     to the correct side of the infinite-precision PI/2, so that the runtime
     rounding goes in the right direction.  I'm not sure if this is necessary
     or even possible.

Copying the values of PI/2 from the real functions should give both of these,
to the same extent that it gives them for the real functions.  The spelling
of the variables should be copied too.  The latter is pio2_hi + pio2_lo.
Using pio2_lo instead of `tiny' may be unnecessary and pessimal.  pio2_lo
is declared volatile so that it is runtime here, but it is also used in
code where it doesn't need to be volatile.  The real functions don't have
a `tiny' variable, and just re-use the general pio2_lo to get inexact here.
So it looks like (2) is unnecessary, with the real functions using pio2_lo
just because it is good enough.

Note that when you need to control the rounding direction or just have
a hi+lo decomposition, it is critical that the constant for the hi
part have a particular value in binary.  When it is declared in decimal,
the decimal value should be rounded to match the desired binary value,
so its higher digits will be quite different from the ones of the
infinite-precision full value, even when the hi value is the best
approximation to the full value (and doesn't have bits in it killed
for technical reasons).  I noticed this in the opposite direction when
I calculated the decimal and binary values to put in the constant
tables recently.  Normally I round in binary and then print the rounded
value in decimal.  This looked strange for m_e, so I switched to
printing a value with it rounded in decimal, with the binary rounding
only in a comment.  The strangeness is largest when there are many
extra guard digits in the decimal value, like you had originally.  It
is unclear whether these digits should match the infinite-precision
value or the expected rounded binary value.
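
The decimal versus binary point can be seen with Pi/2 itself.  In this
sketch (names are hypothetical) the first literal carries digits of the
infinite-precision Pi/2, the second is a decimal expansion of the nearest
double, and the third is the same double written as a C99 hex float:

```c
#include <string.h>

/* Digits of the infinite-precision Pi/2 (more than a double can hold). */
static const double pi_2_digits = 1.57079632679489661923;
/* Decimal expansion of the double nearest to Pi/2; differs after 16 digits. */
static const double pi_2_binary = 1.57079632679489655800;
/* The same double written exactly as a hex float. */
static const double pi_2_hex = 0x1.921fb54442d18p+0;

/* Bitwise comparison, so that the check does not depend on FP compares. */
static int
same_bits(double a, double b)
{
	return (memcmp(&a, &b, sizeof(a)) == 0);
}
```

All three literals should convert to the same double, even though the
decimal digits of the first two diverge from the 17th significant digit on.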

Bruce

From owner-freebsd-numerics@FreeBSD.ORG  Sun Sep 16 15:13:46 2012
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id DECFE106564A
	for <freebsd-numerics@FreeBSD.org>;
	Sun, 16 Sep 2012 15:13:46 +0000 (UTC)
	(envelope-from stephen@missouri.edu)
Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu
	[128.206.184.213])
	by mx1.freebsd.org (Postfix) with ESMTP id 984BE8FC0C
	for <freebsd-numerics@FreeBSD.org>;
	Sun, 16 Sep 2012 15:13:46 +0000 (UTC)
Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213])
	by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id
	q8GFDida042643; Sun, 16 Sep 2012 10:13:45 -0500 (CDT)
	(envelope-from stephen@missouri.edu)
Message-ID: <5055ECA8.2080008@missouri.edu>
Date: Sun, 16 Sep 2012 10:13:44 -0500
From: Stephen Montgomery-Smith <stephen@missouri.edu>
User-Agent: Mozilla/5.0 (X11; Linux i686;
	rv:15.0) Gecko/20120827 Thunderbird/15.0
MIME-Version: 1.0
To: Bruce Evans <brde@optusnet.com.au>
References: <5017111E.6060003@missouri.edu>
	<20120814072946.S5260@besplex.bde.org>
	<50297CA5.5010900@missouri.edu> <50297E43.7090309@missouri.edu>
	<20120814201105.T934@besplex.bde.org>
	<502A780B.2010106@missouri.edu>
	<20120815223631.N1751@besplex.bde.org>
	<502C0CF8.8040003@missouri.edu>
	<20120906221028.O1542@besplex.bde.org>
	<5048D00B.8010401@missouri.edu> <504D3CCD.2050006@missouri.edu>
	<504FF726.9060001@missouri.edu>
	<20120912191556.F1078@besplex.bde.org>
	<20120912225847.J1771@besplex.bde.org>
	<50511B40.3070009@missouri.edu>
	<20120913204808.T1964@besplex.bde.org>
	<5051F59C.6000603@missouri.edu>
	<20120914014208.I2862@besplex.bde.org>
	<50526050.2070303@missouri.edu>
	<20120914212403.H1983@besplex.bde.org>
	<50538E28.6050400@missouri.edu>
	<20120915231032.C2669@besplex.bde.org>
	<50548E15.3010405@missouri.edu> <5054C027.2040008@missouri.edu>
	<5054C200.7090307@missouri.edu>
	<20120916041132.D6344@besplex.bde.org>
	<50553424.2080902@missouri.edu>
	<20120916134730.Y957@besplex.bde.org>
In-Reply-To: <20120916134730.Y957@besplex.bde.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-numerics@FreeBSD.org
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
	<freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 16 Sep 2012 15:13:47 -0000

On 09/16/2012 12:14 AM, Bruce Evans wrote:
> On Sat, 15 Sep 2012, Stephen Montgomery-Smith wrote:
>
>> One more thing I would like an opinion on.
>>
>> In my code I check for |z| being small, and then use the approximations:
>> casinh(z) = z
>> cacos(z) = Pi - z
>
> Actually Pi/2 - z.
>
>> catanh(z) = z
>>
>> However these approximations are not used in the papers by Hull et al,
>> and the code works just fine if I don't include these in the code.
>
> Probably a bug in the papers.

It is not a bug in the papers.  The algorithms they provide really do 
work when |z| is small.  In fact, you have to deal separately with the 
cases |x| is small and |y| is small (z=x+I*y), so dealing with both of 
them being small is not any additional problem.

And now I see your other post, that using PI/2 is problematic especially 
when rounding is not to nearest.  (Then the problem of rounding PI/2 
properly is relegated to the acos function, and so it is someone else's 
problem.)

So all things being said and done, I am going to remove the use of these 
approximations.

(And also, my comments describing them had a silly mistake in them as well.)

From owner-freebsd-numerics@FreeBSD.ORG  Sun Sep 16 15:20:21 2012
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 43827106574B
	for <freebsd-numerics@freebsd.org>;
	Sun, 16 Sep 2012 15:20:21 +0000 (UTC)
	(envelope-from stephen@missouri.edu)
Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu
	[128.206.184.213])
	by mx1.freebsd.org (Postfix) with ESMTP id F17F08FC17
	for <freebsd-numerics@freebsd.org>;
	Sun, 16 Sep 2012 15:20:20 +0000 (UTC)
Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213])
	by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id
	q8GFKJ0S043146 for <freebsd-numerics@freebsd.org>;
	Sun, 16 Sep 2012 10:20:20 -0500 (CDT)
	(envelope-from stephen@missouri.edu)
Message-ID: <5055EE33.2090400@missouri.edu>
Date: Sun, 16 Sep 2012 10:20:19 -0500
From: Stephen Montgomery-Smith <stephen@missouri.edu>
User-Agent: Mozilla/5.0 (X11; Linux i686;
	rv:15.0) Gecko/20120827 Thunderbird/15.0
MIME-Version: 1.0
To: freebsd-numerics@freebsd.org
References: <5017111E.6060003@missouri.edu> <50297CA5.5010900@missouri.edu>
	<50297E43.7090309@missouri.edu>
	<20120814201105.T934@besplex.bde.org>
	<502A780B.2010106@missouri.edu>
	<20120815223631.N1751@besplex.bde.org>
	<502C0CF8.8040003@missouri.edu>
	<20120906221028.O1542@besplex.bde.org>
	<5048D00B.8010401@missouri.edu> <504D3CCD.2050006@missouri.edu>
	<504FF726.9060001@missouri.edu>
	<20120912191556.F1078@besplex.bde.org>
	<20120912225847.J1771@besplex.bde.org>
	<50511B40.3070009@missouri.edu>
	<20120913204808.T1964@besplex.bde.org>
	<5051F59C.6000603@missouri.edu>
	<20120914014208.I2862@besplex.bde.org>
	<50526050.2070303@missouri.edu>
	<20120914212403.H1983@besplex.bde.org>
	<50538E28.6050400@missouri.edu>
	<20120915231032.C2669@besplex.bde.org>
	<50548E15.3010405@missouri.edu> <5054C027.2040008@missouri.edu>
	<5054C200.7090307@missouri.edu>
	<20120916041132.D6344@besplex.bde.org>
	<50553424.2080902@missouri.edu>
	<20120916134730.Y957@besplex.bde.org>
	<5055ECA8.2080008@missouri.edu>
In-Reply-To: <5055ECA8.2080008@missouri.edu>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
	<freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 16 Sep 2012 15:20:21 -0000

A style question: do you mind this

if (sy==0) ry = copysign(ry, -1);
if (A < 1) A = 1;

or do you prefer

if (sy==0)
	ry = copysign(ry, -1);
if (A < 1)
	A = 1;


From owner-freebsd-numerics@FreeBSD.ORG  Sun Sep 16 16:51:29 2012
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 93594106566B
	for <freebsd-numerics@FreeBSD.org>;
	Sun, 16 Sep 2012 16:51:29 +0000 (UTC)
	(envelope-from brde@optusnet.com.au)
Received: from mail30.syd.optusnet.com.au (mail30.syd.optusnet.com.au
	[211.29.133.193])
	by mx1.freebsd.org (Postfix) with ESMTP id 219218FC16
	for <freebsd-numerics@FreeBSD.org>;
	Sun, 16 Sep 2012 16:51:28 +0000 (UTC)
Received: from c122-106-157-84.carlnfd1.nsw.optusnet.com.au
	(c122-106-157-84.carlnfd1.nsw.optusnet.com.au [122.106.157.84])
	by mail30.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id
	q8GGpO9J023821
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Mon, 17 Sep 2012 02:51:25 +1000
Date: Mon, 17 Sep 2012 02:51:24 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Stephen Montgomery-Smith <stephen@missouri.edu>
In-Reply-To: <5055ECA8.2080008@missouri.edu>
Message-ID: <20120917022614.R2943@besplex.bde.org>
References: <5017111E.6060003@missouri.edu> <50297E43.7090309@missouri.edu>
	<20120814201105.T934@besplex.bde.org> <502A780B.2010106@missouri.edu>
	<20120815223631.N1751@besplex.bde.org> <502C0CF8.8040003@missouri.edu>
	<20120906221028.O1542@besplex.bde.org> <5048D00B.8010401@missouri.edu>
	<504D3CCD.2050006@missouri.edu> <504FF726.9060001@missouri.edu>
	<20120912191556.F1078@besplex.bde.org>
	<20120912225847.J1771@besplex.bde.org>
	<50511B40.3070009@missouri.edu> <20120913204808.T1964@besplex.bde.org>
	<5051F59C.6000603@missouri.edu> <20120914014208.I2862@besplex.bde.org>
	<50526050.2070303@missouri.edu> <20120914212403.H1983@besplex.bde.org>
	<50538E28.6050400@missouri.edu> <20120915231032.C2669@besplex.bde.org>
	<50548E15.3010405@missouri.edu> <5054C027.2040008@missouri.edu>
	<5054C200.7090307@missouri.edu> <20120916041132.D6344@besplex.bde.org>
	<50553424.2080902@missouri.edu> <20120916134730.Y957@besplex.bde.org>
	<5055ECA8.2080008@missouri.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-numerics@FreeBSD.org
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
	<freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 16 Sep 2012 16:51:29 -0000

On Sun, 16 Sep 2012, Stephen Montgomery-Smith wrote:

> On 09/16/2012 12:14 AM, Bruce Evans wrote:
>> On Sat, 15 Sep 2012, Stephen Montgomery-Smith wrote:
>> 
>>> One more thing I would like an opinion on.
>>> 
>>> In my code I check for |z| being small, and then use the approximations:
>>> casinh(z) = z
>>> cacos(z) = Pi - z
>> 
>> Actually Pi/2 - z.
>> 
>>> catanh(z) = z
>>> 
>>> However these approximations are not used in the papers by Hull et al,
>>> and the code works just fine if I don't include these in the code.
>> 
>> Probably a bug in the papers.
>
> It is not a bug in the papers.  The algorithms they provide really do work 
> when |z| is small.  In fact, you have to deal separately with the cases |x| 
> is small and |y| is small (z=x+I*y), so dealing with both of them being small 
> is not any additional problem.
>
> And now I see your other post, that using PI/2 is problematic especially when 
> rounding is not to nearest.  (Then the problem of rounding PI/2 properly is 
> relegated to the acos function, and so it is someone else's problem.)
>
> So all things being said and done, I am going to remove the use of these 
> approximations.

I don't like that.  It will be much slower on almost 1/4 of arg space.
The only reason to consider not doing it is that the args that it
applies to are not very likely, and optimizing for them may pessimize
the usual case.

I just found a related optimization for atan2().  For x > 0 and
|y|/x < 2**-(MANT_DIG+afew), atan2(y, x) is evaluated as essentially
sign(y) * atan(|y|/x).  But in this case, its value is simply y/x
with inexact.  Again the optimization applies to almost 1/4 of arg
space.  It gains more than the normal overhead of an atan() call by
avoiding secondary underflows when y/x underflows.

Bruce

From owner-freebsd-numerics@FreeBSD.ORG  Sun Sep 16 17:12:15 2012
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 67E6A106564A
	for <freebsd-numerics@FreeBSD.org>;
	Sun, 16 Sep 2012 17:12:15 +0000 (UTC)
	(envelope-from brde@optusnet.com.au)
Received: from mail03.syd.optusnet.com.au (mail03.syd.optusnet.com.au
	[211.29.132.184])
	by mx1.freebsd.org (Postfix) with ESMTP id E908D8FC0C
	for <freebsd-numerics@FreeBSD.org>;
	Sun, 16 Sep 2012 17:12:14 +0000 (UTC)
Received: from c122-106-157-84.carlnfd1.nsw.optusnet.com.au
	(c122-106-157-84.carlnfd1.nsw.optusnet.com.au [122.106.157.84])
	by mail03.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id
	q8GHC7nQ019965
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Mon, 17 Sep 2012 03:12:07 +1000
Date: Mon, 17 Sep 2012 03:12:07 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Stephen Montgomery-Smith <stephen@missouri.edu>
In-Reply-To: <5055EE33.2090400@missouri.edu>
Message-ID: <20120917025148.X2943@besplex.bde.org>
References: <5017111E.6060003@missouri.edu>
	<20120814201105.T934@besplex.bde.org>
	<502A780B.2010106@missouri.edu> <20120815223631.N1751@besplex.bde.org>
	<502C0CF8.8040003@missouri.edu> <20120906221028.O1542@besplex.bde.org>
	<5048D00B.8010401@missouri.edu> <504D3CCD.2050006@missouri.edu>
	<504FF726.9060001@missouri.edu> <20120912191556.F1078@besplex.bde.org>
	<20120912225847.J1771@besplex.bde.org> <50511B40.3070009@missouri.edu>
	<20120913204808.T1964@besplex.bde.org> <5051F59C.6000603@missouri.edu>
	<20120914014208.I2862@besplex.bde.org> <50526050.2070303@missouri.edu>
	<20120914212403.H1983@besplex.bde.org> <50538E28.6050400@missouri.edu>
	<20120915231032.C2669@besplex.bde.org> <50548E15.3010405@missouri.edu>
	<5054C027.2040008@missouri.edu> <5054C200.7090307@missouri.edu>
	<20120916041132.D6344@besplex.bde.org> <50553424.2080902@missouri.edu>
	<20120916134730.Y957@besplex.bde.org> <5055ECA8.2080008@missouri.edu>
	<5055EE33.2090400@missouri.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-numerics@FreeBSD.org
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
	<freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 16 Sep 2012 17:12:15 -0000

On Sun, 16 Sep 2012, Stephen Montgomery-Smith wrote:

> A style question: do you mind this
>
> if (sy==0) ry = copysign(ry, -1);
> if (A < 1) A = 1;
>
> or do you prefer
>
> if (sy==0)
> 	ry = copysign(ry, -1);
> if (A < 1)
> 	A = 1;

Multiple statements per line are large style bugs, as are missing spaces
around == operators (I might agree only to omitting spaces around most
multiplication operators and some addition operators).

Apart from being less readable, multiple statements per line break debugging
using line-based debuggers.

BTW, copysign() is builtin in gcc-4.2 and not broken by a macro in <math.h>.
Otherwise it would be very slow.

BTW2, fdlibm avoids using copysign() internally, but often sets sign
bits by a direct bit access which does the equivalent of what copysign()
does semantically.  This can be slow, since at best it takes a
read-modify-write of the target, with all 3 steps in this being
non-parallelizable.  Another not so good way to set sign bits is to use
an array with entries +-1 and do `ry *= array[sy];'.  Branchy code for
setting or clearing the sign bit may be better than either of these
methods, at least if the branches are predictable.  If the builtin is
very smart, then it will treat the copysign() call as a hint and
select the best alternative, and it can do this more easily than it
can rewrite manually optimized sequences for setting sign bits.  I
think it is not very smart.
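
The alternatives compared above can be sketched side by side.  All
function names and sign_tab are hypothetical, sy is assumed to be 0 or
1 (0 meaning "make the result negative", as in the quoted snippet), and
the memcpy() version is only a portable stand-in for fdlibm's word
access macros on IEEE 754 doubles.  The table and branch variants flip
the sign rather than set it, so they match copysign() only when the
sign bit of ry is known to be clear.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* 1. Library call, normally expanded inline as a builtin. */
static double
neg_copysign(double ry, int sy)
{
	if (sy == 0)
		ry = copysign(ry, -1);
	return (ry);
}

/* 2. Direct bit access: read, modify, write the sign bit. */
static double
neg_bits(double ry, int sy)
{
	uint64_t bits;

	if (sy == 0) {
		memcpy(&bits, &ry, sizeof(bits));
		bits |= UINT64_C(1) << 63;
		memcpy(&ry, &bits, sizeof(ry));
	}
	return (ry);
}

/* 3. Table of +-1 and a multiplication; no branch. */
static const double sign_tab[2] = { -1.0, 1.0 };

static double
neg_table(double ry, int sy)
{
	return (ry * sign_tab[sy]);
}

/* 4. Branch and negate. */
static double
neg_branch(double ry, int sy)
{
	if (sy == 0)
		ry = -ry;
	return (ry);
}
```

Which of these is fastest depends on the compiler and CPU, so the
sketch is only useful as a starting point for measurement.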

Bruce

From owner-freebsd-numerics@FreeBSD.ORG  Sun Sep 16 18:26:50 2012
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 75BC01065820
	for <freebsd-numerics@FreeBSD.org>;
	Sun, 16 Sep 2012 18:26:50 +0000 (UTC)
	(envelope-from stephen@missouri.edu)
Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu
	[128.206.184.213])
	by mx1.freebsd.org (Postfix) with ESMTP id 2EFF38FC18
	for <freebsd-numerics@FreeBSD.org>;
	Sun, 16 Sep 2012 18:26:49 +0000 (UTC)
Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213])
	by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id
	q8GIQl1D055245; Sun, 16 Sep 2012 13:26:48 -0500 (CDT)
	(envelope-from stephen@missouri.edu)
Message-ID: <505619E7.8080804@missouri.edu>
Date: Sun, 16 Sep 2012 13:26:47 -0500
From: Stephen Montgomery-Smith <stephen@missouri.edu>
User-Agent: Mozilla/5.0 (X11; Linux i686;
	rv:15.0) Gecko/20120827 Thunderbird/15.0
MIME-Version: 1.0
To: Bruce Evans <brde@optusnet.com.au>
References: <5017111E.6060003@missouri.edu>
	<20120814201105.T934@besplex.bde.org>
	<502A780B.2010106@missouri.edu>
	<20120815223631.N1751@besplex.bde.org>
	<502C0CF8.8040003@missouri.edu>
	<20120906221028.O1542@besplex.bde.org>
	<5048D00B.8010401@missouri.edu> <504D3CCD.2050006@missouri.edu>
	<504FF726.9060001@missouri.edu>
	<20120912191556.F1078@besplex.bde.org>
	<20120912225847.J1771@besplex.bde.org>
	<50511B40.3070009@missouri.edu>
	<20120913204808.T1964@besplex.bde.org>
	<5051F59C.6000603@missouri.edu>
	<20120914014208.I2862@besplex.bde.org>
	<50526050.2070303@missouri.edu>
	<20120914212403.H1983@besplex.bde.org>
	<50538E28.6050400@missouri.edu>
	<20120915231032.C2669@besplex.bde.org>
	<50548E15.3010405@missouri.edu> <5054C027.2040008@missouri.edu>
	<5054C200.7090307@missouri.edu>
	<20120916041132.D6344@besplex.bde.org>
	<50553424.2080902@missouri.edu>
	<20120916134730.Y957@besplex.bde.org>
	<5055ECA8.2080008@missouri.edu> <5055EE33.2090400@missouri.edu>
	<20120917025148.X2943@besplex.bde.org>
In-Reply-To: <20120917025148.X2943@besplex.bde.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-numerics@FreeBSD.org
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
	<freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 16 Sep 2012 18:26:50 -0000

On 09/16/2012 12:12 PM, Bruce Evans wrote:
> On Sun, 16 Sep 2012, Stephen Montgomery-Smith wrote:
>
>> A style question: do you mind this
>>
>> if (sy==0) ry = copysign(ry, -1);
>> if (A < 1) A = 1;
>>
>> or do you prefer
>>
>> if (sy==0)
>>     ry = copysign(ry, -1);
>> if (A < 1)
>>     A = 1;
>
> Multiple statements per line are large style bugs, as are missing spaces
> around == operators (I might agree only to omitting spaces around most
> multiplication operators and some addition operators).
>
> Apart from being less readable, multiple statements per line break
> debugging
> using line-based debuggers.
>
> BTW, copysign() is builtin in gcc-4.2 and not broken by a macro in
> <math.h>.
> Otherwise it would be very slow.

I changed it to:

if (sy==0)
	ry = -ry;

I happen to know that ry is always positive.

From owner-freebsd-numerics@FreeBSD.ORG  Sun Sep 16 19:01:40 2012
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 8AC1F106566C
	for <freebsd-numerics@FreeBSD.org>;
	Sun, 16 Sep 2012 19:01:40 +0000 (UTC)
	(envelope-from stephen@missouri.edu)
Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu
	[128.206.184.213])
	by mx1.freebsd.org (Postfix) with ESMTP id 486608FC17
	for <freebsd-numerics@FreeBSD.org>;
	Sun, 16 Sep 2012 19:01:40 +0000 (UTC)
Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213])
	by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id
	q8GJ1c7N057483; Sun, 16 Sep 2012 14:01:39 -0500 (CDT)
	(envelope-from stephen@missouri.edu)
Message-ID: <50562213.9020400@missouri.edu>
Date: Sun, 16 Sep 2012 14:01:39 -0500
From: Stephen Montgomery-Smith <stephen@missouri.edu>
User-Agent: Mozilla/5.0 (X11; Linux i686;
	rv:15.0) Gecko/20120827 Thunderbird/15.0
MIME-Version: 1.0
To: Bruce Evans <brde@optusnet.com.au>
References: <5017111E.6060003@missouri.edu> <50297E43.7090309@missouri.edu>
	<20120814201105.T934@besplex.bde.org>
	<502A780B.2010106@missouri.edu>
	<20120815223631.N1751@besplex.bde.org>
	<502C0CF8.8040003@missouri.edu>
	<20120906221028.O1542@besplex.bde.org>
	<5048D00B.8010401@missouri.edu> <504D3CCD.2050006@missouri.edu>
	<504FF726.9060001@missouri.edu>
	<20120912191556.F1078@besplex.bde.org>
	<20120912225847.J1771@besplex.bde.org>
	<50511B40.3070009@missouri.edu>
	<20120913204808.T1964@besplex.bde.org>
	<5051F59C.6000603@missouri.edu>
	<20120914014208.I2862@besplex.bde.org>
	<50526050.2070303@missouri.edu>
	<20120914212403.H1983@besplex.bde.org>
	<50538E28.6050400@missouri.edu>
	<20120915231032.C2669@besplex.bde.org>
	<50548E15.3010405@missouri.edu> <5054C027.2040008@missouri.edu>
	<5054C200.7090307@missouri.edu>
	<20120916041132.D6344@besplex.bde.org>
	<50553424.2080902@missouri.edu>
	<20120916134730.Y957@besplex.bde.org>
	<5055ECA8.2080008@missouri.edu>
	<20120917022614.R2943@besplex.bde.org>
In-Reply-To: <20120917022614.R2943@besplex.bde.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-numerics@FreeBSD.org
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
	<freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 16 Sep 2012 19:01:40 -0000

On 09/16/2012 11:51 AM, Bruce Evans wrote:

>
> I don't like that.  It will be much slower on almost 1/4 of arg space.
> The only reason to consider not doing it is that the args that it
> applies to are not very likely, and optimizing for them may pessimize
> the usual case.

The pessimization when |z| is not small is tiny.  It takes no time at 
all to check that |z| is small.

On the other hand let me go through the code and see what happens when 
|x| is small or |y| is small.  There are actually specific formulas that 
work well in these two cases, and they are probably not that much slower 
than the formulas I decided to remove.  And when you chase through all 
the logic and "if" statements, you may find that you didn't use up a 
whole bunch of time for these very special cases of |z| small - most of 
the extra time merely being the decisions invoked by the "if" statements.

> I just found a related optimization for atan2().  For x > 0 and
> |y|/x < 2**-(MANT_DIG+afew), atan2(y, x) is evaluated as essentially
> sign(y) * atan(|y|/x).  But in this case, its value is simply y/x
> with inexact.  Again the optimization applies to almost 1/4 of arg
> space.  It gains more than the normal overhead of an atan() call by
> avoiding secondary underflows when y/x underflows.

You see, that is exactly where I don't want to do special optimization 
in my code.  In my opinion, it is the atan function itself that should 
realize that |y|/x is small, and hence it is that function that should 
simply return |y|/x.  Or if you want to implement it at a higher level, 
atan2 should make this realization, and simply return y/x.

Similarly, I would expect log1p(x) to simply return x (inexactly) for x 
small.  And if the compiler is really good, I would hope that the two codes:
log1p(x);
(fabs(x) < DBL_EPSILON) ? x + set_tiny() : log1p(x);
would be equivalent.  (But I am rather sure that gcc isn't that good.)

Furthermore, casinh etc are not commonly used functions.  Putting huge 
amounts of effort looking at special cases to speed it up a little 
somehow feels wrong to me.  In fact, if the programmer knows that he 
will be wanting casinh, and evaluated very fast, then he should be 
motivated enough to try out using z in the case when |z| is small, and 
see if that really speeds things up.

Stephen


From owner-freebsd-numerics@FreeBSD.ORG  Sun Sep 16 19:53:27 2012
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 73817106564A
	for <freebsd-numerics@freebsd.org>;
	Sun, 16 Sep 2012 19:53:27 +0000 (UTC)
	(envelope-from brde@optusnet.com.au)
Received: from mail02.syd.optusnet.com.au (mail02.syd.optusnet.com.au
	[211.29.132.183])
	by mx1.freebsd.org (Postfix) with ESMTP id E9EDC8FC0A
	for <freebsd-numerics@freebsd.org>;
	Sun, 16 Sep 2012 19:53:26 +0000 (UTC)
Received: from c122-106-157-84.carlnfd1.nsw.optusnet.com.au
	(c122-106-157-84.carlnfd1.nsw.optusnet.com.au [122.106.157.84])
	by mail02.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id
	q8GJrIne020374
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Mon, 17 Sep 2012 05:53:19 +1000
Date: Mon, 17 Sep 2012 05:53:18 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Bruce Evans <brde@optusnet.com.au>
In-Reply-To: <20120917022614.R2943@besplex.bde.org>
Message-ID: <20120917041848.F3504@besplex.bde.org>
References: <5017111E.6060003@missouri.edu>
	<20120814201105.T934@besplex.bde.org>
	<502A780B.2010106@missouri.edu> <20120815223631.N1751@besplex.bde.org>
	<502C0CF8.8040003@missouri.edu> <20120906221028.O1542@besplex.bde.org>
	<5048D00B.8010401@missouri.edu> <504D3CCD.2050006@missouri.edu>
	<504FF726.9060001@missouri.edu> <20120912191556.F1078@besplex.bde.org>
	<20120912225847.J1771@besplex.bde.org> <50511B40.3070009@missouri.edu>
	<20120913204808.T1964@besplex.bde.org> <5051F59C.6000603@missouri.edu>
	<20120914014208.I2862@besplex.bde.org> <50526050.2070303@missouri.edu>
	<20120914212403.H1983@besplex.bde.org> <50538E28.6050400@missouri.edu>
	<20120915231032.C2669@besplex.bde.org> <50548E15.3010405@missouri.edu>
	<5054C027.2040008@missouri.edu> <5054C200.7090307@missouri.edu>
	<20120916041132.D6344@besplex.bde.org> <50553424.2080902@missouri.edu>
	<20120916134730.Y957@besplex.bde.org> <5055ECA8.2080008@missouri.edu>
	<20120917022614.R2943@besplex.bde.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: Stephen Montgomery-Smith <stephen@missouri.edu>,
	freebsd-numerics@freebsd.org
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
	<freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 16 Sep 2012 19:53:27 -0000

On Mon, 17 Sep 2012, Bruce Evans wrote:

> On Sun, 16 Sep 2012, Stephen Montgomery-Smith wrote:
>
>> On 09/16/2012 12:14 AM, Bruce Evans wrote:
>>> On Sat, 15 Sep 2012, Stephen Montgomery-Smith wrote:
>>> 
>>>> One more thing I would like an opinion on.
>>>> 
>>>> In my code I check for |z| being small, and then use the approximations:
>>>> casinh(z) = z
>>>> cacos(z) = Pi - z
>>> 
>>> Actually Pi/2 - z.
>>> 
>>>> catanh(z) = z
>>>> 
>>>> However these approximations are not used in the papers by Hull et al,
>>>> and the code works just fine if I don't include these in the code.
>>> 
>>> Probably a bug in the papers.
>> 
>> It is not a bug in the papers.  The algorithms they provide really do work 
>> when |z| is small.  In fact, you have to deal separately with the cases |x| 
>> is small and |y| is small (z=x+I*y), so dealing with both of them being 
>> small is not any additional problem.
>> 
>> And now I see your other post, that using PI/2 is problematic especially 
>> when rounding is not to nearest.  (Then the problem of rounding PI/2 
>> properly is relegated to the acos function, and so it is someone else's 
>> problem.)
>> 
>> So all things being said and done, I am going to remove the use of these 
>> approximations.
>
> I don't like that.  It will be much slower on almost 1/4 of arg space.
> The only reason to consider not doing it is that the args that it
> applies to are not very likely, and optimizing for them may pessimize
> the usual case.

It gives the expected pessimizations, and unexpected accuracy improvements
and unimprovements.  On amd64:

% 9,10c9,10
% < rcacos:max_er = 0x6947ecac 3.2900, avg_er = 0.317, #>=1:0.5 = 30489:303638
% <         5.47 real         5.47 user         0.00 sys
% ---
% > rcacos:max_er = 0x6947ecac 3.2900, avg_er = 0.317, #>=1:0.5 = 30489:268862
% >         5.87 real         5.86 user         0.00 sys

Only float functions were updated for this test, and results are only
shown for float functions (comparing them with double functions).
'<' in the diff is for an old result and '>' for a new result.  The above
shows:

- accuracy improvement.  Apparently the thresholds were too large.
- slowdown of 0.39 seconds (user time).  The test program mainly calls
   cacos() and cacosf(), but has some overheads.  Say 1.47 seconds for
   the overheads and 2 seconds for each of the functions.  0.39/2 is
   almost 20%.

% 21,22c21,22
% < rcacosh:max_er = 0x51e70742 2.5595, avg_er = 0.257, #>=1:0.5 = 25766:3286888
% <         5.81 real         5.79 user         0.00 sys
% ---
% > rcacosh:max_er = 0x51e70742 2.5595, avg_er = 0.258, #>=1:0.5 = 26034:3313256
% >         6.06 real         6.05 user         0.00 sys

Similar slowdown, but now the old version is more accurate.  Apparently
the general code doesn't reduce to simply Pi/2.

% 34c34
% <         5.98 real         5.98 user         0.00 sys
% ---
% >         6.30 real         6.28 user         0.00 sys

This is for rcasin.  Similar slowdown, but no change in values.

% 45,46c45,46
% < rcasinh:max_er = 0x51e70742 2.5595, avg_er = 0.257, #>=1:0.5 = 25766:3286888
% <         5.57 real         5.56 user         0.00 sys
% ---
% > rcasinh:max_er = 0x51e70742 2.5595, avg_er = 0.258, #>=1:0.5 = 26034:3313256
% >         5.82 real         5.81 user         0.00 sys

Like rcacosh (lose speed and accuracy).

% 57,58c57,58
% < rcatan:max_er = 0x51d7c47a 2.5576, avg_er = 0.295, #>=1:0.5 = 77670:443246
% <         3.69 real         3.68 user         0.00 sys
% ---
% > rcatan:max_er = 0x51d7c47a 2.5576, avg_er = 0.296, #>=1:0.5 = 77874:469678
% >         3.64 real         3.64 user         0.00 sys

No slowdown, but accuracy loss.

% 69,70c69,70
% < rcatanh:max_er = 0x5304b263 2.5943, avg_er = 0.201, #>=1:0.5 = 185298:1337156
% <         3.88 real         3.86 user         0.00 sys
% ---
% > rcatanh:max_er = 0x5304b263 2.5943, avg_er = 0.203, #>=1:0.5 = 204986:1370276
% >         3.84 real         3.83 user         0.00 sys

Like rcatan, but the accuracy loss is smaller.

% [... unrelated functions]

% [... imaginary parts have similar behaviour (various symmetries)]

On i386:

% 9,10c9,10
% < rcacos:max_er = 0x4517ee94 2.1592, avg_er = 0.315, #>=1:0.5 = 4607:246852
% <         8.18 real         7.48 user         0.02 sys
% ---
% > rcacos:max_er = 0x4517ee94 2.1592, avg_er = 0.314, #>=1:0.5 = 4607:212076
% >         8.48 real         7.82 user         0.03 sys
% ...
% 201,202c201,202
% < icacosh:max_er = 0x4517ee94 2.1592, avg_er = 0.315, #>=1:0.5 = 4607:246852
% <         8.62 real         8.50 user         0.01 sys
% ---
% > icacosh:max_er = 0x4517ee94 2.1592, avg_er = 0.314, #>=1:0.5 = 4607:212076
% >         9.88 real         9.05 user         0.03 sys

Similar slowdowns (only look at the user time), but only changes in
accuracy for these two.  Now the accuracy is never reduced.  I think
you just have to reduce the threshold a little in the old version to
get this improvement.  Indeed, the following works well for me (only
edited the float version):

@ --- /home/stephen/public_html/catrigf.c	2012-09-16 15:14:05.000000000 +0000
@ +++ catrigf.c	2012-09-16 19:23:18.559723000 +0000
@ @@ -165,4 +165,9 @@
@  		}
@ 
@ +	/* XXX the numbers are related to sqrt(6 * FLT_EPSILON). */
@ +	if (ax < 2048 * FLT_EPSILON && ay < 2048 *FLT_EPSILON)
@ +		if ((int)ax==0 && (int)ay==0)
@ +			return (z);
@ +
@  	do_hard_work(ax, ay, &rx, &B_is_usable, &B, &sqrt_A2my2, &new_y);
@  	if (B_is_usable)
@ @@ -200,5 +205,6 @@
@  		if (isinf(y))
@  			return (cpackf(x+x, -y));
@ -		if (x == 0) return (cpackf(m_pi_2, y+y));
@ +		if (x == 0)
@ +			return (cpackf(m_pi_2+tiny, y+y));
@  		return (cpackf(x+0.0L+(y+0), x+0.0L+(y+0)));
@  	}

Also fix NaN cases.

@ @@ -214,4 +220,8 @@
@  		}
@ 
@ +	/* XXX the number for ay is related to sqrt(6 * FLT_EPSILON). */
@ +	if (ax < FLT_EPSILON / 8 && ay < 2048 * FLT_EPSILON)
@ +		return (cpackf(m_pi_2 + tiny, -y));
@ +
@  	do_hard_work(ay, ax, &ry, &B_is_usable, &B, &sqrt_A2mx2, &new_x);
@  	if (B_is_usable) {

The thresholds should be asymmetric since the real part uses the
approximation Pi/2 + O(x) while the imaginary part uses the approximation
-y + O(y**3).  If we used the approximation Pi/2 - x for the real part
as before, then the thresholds could be more symmetric and more cases
could be handled here.  But then the expression with m_pi_2 would be
(m_pi_2 - x) again, so it wouldn't necessarily set inexact, and the
special code for setting inexact would be needed again.  The old
thresholds were too conservative.  I'm being sloppy with the xy product
terms and nonstandard rounding modes.

The magic numbers were whatever minimised the number of incorrectly
rounded cases in my tests.  sqrt(6 * FLT_EPSILON) is obviously not
conservative enough.  The divisor of 8 gives 3 guard digits which would
fix some cases (iff the cases go through here).  The magic 2048 gives
about 4.5 guard "digits".  A couple more guard digits make little
difference, but 1 fewer gives observably more errors.

@ @@ -313,5 +323,6 @@
@  			return (cpackf(copysignf(0, x), y+y));
@  		if (isinf(y))
@ -			return (cpackf(copysignf(0, x), copysignf(m_pi_2, y)));
@ +			return (cpackf(copysignf(0, x),
@ +			    copysignf(m_pi_2 + tiny, y)));
@  		if (x == 0)
@  			return (cpackf(x, y+y));
@ @@ -320,9 +331,16 @@
@ 
@  	if (isinf(x) || isinf(y))
@ -		return (cpackf(copysignf(0, x), copysignf(m_pi_2, y)));
@ +		return (cpackf(copysignf(0, x), copysignf(m_pi_2 + tiny, y)));
@ 
@  	if (ax > RECIP_EPSILON || ay > RECIP_EPSILON)
@  		if ((int)(1+tiny)==1)
@ -			return (cpackf(copysignf(real_part_reciprocal(ax, ay), x), copysignf(m_pi_2, y)));
@ +			return (cpackf(
@ +			    copysignf(real_part_reciprocal(ax, ay), x),
@ +			    copysignf(m_pi_2 + tiny, y)));
@ +
@ +	/* XXX the numbers are related to sqrt(6 * FLT_EPSILON). */
@ +	if (ax < 2048 * FLT_EPSILON && ay < 2048 * FLT_EPSILON)
@ +		if ((int)ax==0 && (int)ay==0)
@ +			return (z);
@ 
@  	if (ax == 1 && ay < FLT_EPSILON) {

Bruce

From owner-freebsd-numerics@FreeBSD.ORG  Sun Sep 16 20:29:30 2012
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id D4EDC106564A
	for <freebsd-numerics@freebsd.org>;
	Sun, 16 Sep 2012 20:29:30 +0000 (UTC)
	(envelope-from brde@optusnet.com.au)
Received: from mail07.syd.optusnet.com.au (mail07.syd.optusnet.com.au
	[211.29.132.188])
	by mx1.freebsd.org (Postfix) with ESMTP id 632E68FC08
	for <freebsd-numerics@freebsd.org>;
	Sun, 16 Sep 2012 20:29:29 +0000 (UTC)
Received: from c122-106-157-84.carlnfd1.nsw.optusnet.com.au
	(c122-106-157-84.carlnfd1.nsw.optusnet.com.au [122.106.157.84])
	by mail07.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id
	q8GKTK4M019234
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Mon, 17 Sep 2012 06:29:21 +1000
Date: Mon, 17 Sep 2012 06:29:20 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Stephen Montgomery-Smith <stephen@missouri.edu>
In-Reply-To: <50562213.9020400@missouri.edu>
Message-ID: <20120917060116.G3825@besplex.bde.org>
References: <5017111E.6060003@missouri.edu> <502A780B.2010106@missouri.edu>
	<20120815223631.N1751@besplex.bde.org> <502C0CF8.8040003@missouri.edu>
	<20120906221028.O1542@besplex.bde.org> <5048D00B.8010401@missouri.edu>
	<504D3CCD.2050006@missouri.edu> <504FF726.9060001@missouri.edu>
	<20120912191556.F1078@besplex.bde.org>
	<20120912225847.J1771@besplex.bde.org>
	<50511B40.3070009@missouri.edu> <20120913204808.T1964@besplex.bde.org>
	<5051F59C.6000603@missouri.edu> <20120914014208.I2862@besplex.bde.org>
	<50526050.2070303@missouri.edu> <20120914212403.H1983@besplex.bde.org>
	<50538E28.6050400@missouri.edu> <20120915231032.C2669@besplex.bde.org>
	<50548E15.3010405@missouri.edu> <5054C027.2040008@missouri.edu>
	<5054C200.7090307@missouri.edu> <20120916041132.D6344@besplex.bde.org>
	<50553424.2080902@missouri.edu> <20120916134730.Y957@besplex.bde.org>
	<5055ECA8.2080008@missouri.edu> <20120917022614.R2943@besplex.bde.org>
	<50562213.9020400@missouri.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-numerics@freebsd.org
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
	<freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 16 Sep 2012 20:29:30 -0000

On Sun, 16 Sep 2012, Stephen Montgomery-Smith wrote:

> On 09/16/2012 11:51 AM, Bruce Evans wrote:
>> 
>> I don't like that.  It will be much slower on almost 1/4 of arg space.
>> The only reason to consider not doing it is that the args that it
>> applies to are not very likely, and optimizing for them may pessimize
>> the usual case.
>
> The pessimization when |z| is not small is tiny.  It takes no time at all to 
> check that |z| is small.

Not necessarily on out-of-order machines (most x86).  The CPU executes
multiple paths speculatively and concurrently.  If it does more on an
unused path, then it might do less on the used path.  It may mispredict
the branch on the size of |z| and thus misguess which path to do more
on.  (I don't know many details of this.  For example, does it do
anything at all on paths predicted to be not taken?)  Losses from this
are usually described as branch mispredictions.  They might cost 20
(50? 100?) cycles after taking about 2 cycles to actually check |z|
(2 cycles pipelined but more like <length of pipe> + 8 in real time,
and it is the latter time that you lose by backing out).

The only sure way to avoid branch mispredictions is to not have any,
and catrig is too complicated for that.

> On the other hand let me go through the code and see what happens when |x| is 
> small or |y| is small.  There are actually specific formulas that work well 
> in these two cases, and they are probably not that much slower than the 
> formulas I decided to remove.  And when you chase through all the logic and 
> "if" statements, you may find that you didn't use up a whole bunch of time 
> for these very special cases of |z| small - most of the extra time merely 
> being the decisions invoked by the "if" statements.

But all general cases end up going through an extern function like
acos() or atan2(), and just calling another function is a significant
overhead.  When |z| is small, the arg(s) to the other function will
probably be a special case for it (e.g., acos(small)).  The other
function should optimize this and not take as long as an average call.
However, since it is special, it may cause branch mispredictions for
other uses of the function.

>> I just found a related optimization for atan2().  For x > 0 and
>> |y|/x < 2**-(MANT_DIG+afew), atan2(y, x) is evaluated as essentially
>> sign(y) * atan(|y|/x).  But in this case, its value is simply y/x
>> with inexact.  Again the optimization applies to almost 1/4 of arg
>> space.  It gains more than the normal overhead of an atan() call by
>> avoiding secondary underflows when y/x underflows.
>
> You see, that is exactly where I don't want to do special optimization in my 
> code.  In my opinion, it is the atan function itself that should realize that 
> |y|/x is small, and hence it is that function that should simply return |y|/x.  Or 
> if you want to implement it at a higher level, atan2 should make this 
> realization, and simply return y/x.

I'm thinking of going the other way and using atan(y/x) instead of atan2()
:-).  This is safe iff we know that y/x is not very special.

> Similarly, I would expect log1p(x) to simply return x (inexactly) for x 
> small.  And if the compiler is really good, I would hope that the two codes:
> log1p(x);
> (fabs(x) < DBL_EPSILON) ? x + set_tiny() : log1p(x);
> would be equivalent.  (But I am rather sure that gcc isn't that good.)
>
> Furthermore, casinh etc are not commonly used functions.  Putting huge 
> amounts of effort looking at special cases to speed it up a little somehow 
> feels wrong to me.  In fact, if the programmer knows that he will be wanting 
> casinh, and evaluated very fast, then he should be motivated enough to try 
> out using z in the case when |z| is small, and see if that really speeds 
> things up.

True.  Now I mainly want it to be fast so that I can test more cases.

Bruce

From owner-freebsd-numerics@FreeBSD.ORG  Sun Sep 16 20:49:36 2012
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 16064106574A
	for <freebsd-numerics@freebsd.org>;
	Sun, 16 Sep 2012 20:49:36 +0000 (UTC)
	(envelope-from stephen@missouri.edu)
Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu
	[128.206.184.213])
	by mx1.freebsd.org (Postfix) with ESMTP id C490A8FC08
	for <freebsd-numerics@freebsd.org>;
	Sun, 16 Sep 2012 20:49:35 +0000 (UTC)
Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213])
	by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id
	q8GKnXtC064408; Sun, 16 Sep 2012 15:49:34 -0500 (CDT)
	(envelope-from stephen@missouri.edu)
Message-ID: <50563B5E.3090301@missouri.edu>
Date: Sun, 16 Sep 2012 15:49:34 -0500
From: Stephen Montgomery-Smith <stephen@missouri.edu>
User-Agent: Mozilla/5.0 (X11; Linux i686;
	rv:15.0) Gecko/20120827 Thunderbird/15.0
MIME-Version: 1.0
To: Bruce Evans <brde@optusnet.com.au>
References: <5017111E.6060003@missouri.edu> <502A780B.2010106@missouri.edu>
	<20120815223631.N1751@besplex.bde.org>
	<502C0CF8.8040003@missouri.edu>
	<20120906221028.O1542@besplex.bde.org>
	<5048D00B.8010401@missouri.edu> <504D3CCD.2050006@missouri.edu>
	<504FF726.9060001@missouri.edu>
	<20120912191556.F1078@besplex.bde.org>
	<20120912225847.J1771@besplex.bde.org>
	<50511B40.3070009@missouri.edu>
	<20120913204808.T1964@besplex.bde.org>
	<5051F59C.6000603@missouri.edu>
	<20120914014208.I2862@besplex.bde.org>
	<50526050.2070303@missouri.edu>
	<20120914212403.H1983@besplex.bde.org>
	<50538E28.6050400@missouri.edu>
	<20120915231032.C2669@besplex.bde.org>
	<50548E15.3010405@missouri.edu> <5054C027.2040008@missouri.edu>
	<5054C200.7090307@missouri.edu>
	<20120916041132.D6344@besplex.bde.org>
	<50553424.2080902@missouri.edu>
	<20120916134730.Y957@besplex.bde.org>
	<5055ECA8.2080008@missouri.edu>
	<20120917022614.R2943@besplex.bde.org>
	<50562213.9020400@missouri.edu>
	<20120917060116.G3825@besplex.bde.org>
In-Reply-To: <20120917060116.G3825@besplex.bde.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-numerics@freebsd.org
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
	<freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 16 Sep 2012 20:49:36 -0000

On 09/16/2012 03:29 PM, Bruce Evans wrote:

> I'm thinking of going the other way and using atan(y/x) instead of atan2()
> :-).  This is safe iff we know that y/x is not very special.

This was, in fact, how it was presented in the original paper.  The 
Boost libraries also used atan instead of atan2.

In fact, when I first heard of the "atan2" function (perhaps way back 
when PL/1 was a programming language), I naively thought that atan(x) 
was implemented as atan2(1,x).


From owner-freebsd-numerics@FreeBSD.ORG  Sun Sep 16 20:53:45 2012
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id F3447106566C
	for <freebsd-numerics@freebsd.org>;
	Sun, 16 Sep 2012 20:53:44 +0000 (UTC)
	(envelope-from stephen@missouri.edu)
Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu
	[128.206.184.213])
	by mx1.freebsd.org (Postfix) with ESMTP id BC1708FC08
	for <freebsd-numerics@freebsd.org>;
	Sun, 16 Sep 2012 20:53:44 +0000 (UTC)
Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213])
	by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id
	q8GKrhpE064673; Sun, 16 Sep 2012 15:53:43 -0500 (CDT)
	(envelope-from stephen@missouri.edu)
Message-ID: <50563C57.60806@missouri.edu>
Date: Sun, 16 Sep 2012 15:53:43 -0500
From: Stephen Montgomery-Smith <stephen@missouri.edu>
User-Agent: Mozilla/5.0 (X11; Linux i686;
	rv:15.0) Gecko/20120827 Thunderbird/15.0
MIME-Version: 1.0
To: Bruce Evans <brde@optusnet.com.au>
References: <5017111E.6060003@missouri.edu> <502A780B.2010106@missouri.edu>
	<20120815223631.N1751@besplex.bde.org>
	<502C0CF8.8040003@missouri.edu>
	<20120906221028.O1542@besplex.bde.org>
	<5048D00B.8010401@missouri.edu> <504D3CCD.2050006@missouri.edu>
	<504FF726.9060001@missouri.edu>
	<20120912191556.F1078@besplex.bde.org>
	<20120912225847.J1771@besplex.bde.org>
	<50511B40.3070009@missouri.edu>
	<20120913204808.T1964@besplex.bde.org>
	<5051F59C.6000603@missouri.edu>
	<20120914014208.I2862@besplex.bde.org>
	<50526050.2070303@missouri.edu>
	<20120914212403.H1983@besplex.bde.org>
	<50538E28.6050400@missouri.edu>
	<20120915231032.C2669@besplex.bde.org>
	<50548E15.3010405@missouri.edu> <5054C027.2040008@missouri.edu>
	<5054C200.7090307@missouri.edu>
	<20120916041132.D6344@besplex.bde.org>
	<50553424.2080902@missouri.edu>
	<20120916134730.Y957@besplex.bde.org>
	<5055ECA8.2080008@missouri.edu>
	<20120917022614.R2943@besplex.bde.org>
	<50562213.9020400@missouri.edu>
	<20120917060116.G3825@besplex.bde.org>
In-Reply-To: <20120917060116.G3825@besplex.bde.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-numerics@freebsd.org
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
	<freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 16 Sep 2012 20:53:45 -0000

On 09/16/2012 03:29 PM, Bruce Evans wrote:
> On Sun, 16 Sep 2012, Stephen Montgomery-Smith wrote:
>
>> On 09/16/2012 11:51 AM, Bruce Evans wrote:
>>>
>>> I don't like that.  It will be much slower on almost 1/4 of arg space.
>>> The only reason to consider not doing it is that the args that it
>>> applies to are not very likely, and optimizing for them may pessimize
>>> the usual case.
>>
>> The pessimization when |z| is not small is tiny.  It takes no time at
>> all to check that |z| is small.
>
> Not necessarily on out-of-order machines (most x86).  The CPU executes
> multiple paths speculatively and concurrently.  If it does more on an
> unused path, then it might do less on the used path.  It may mispredict
> the branch on the size of |z| and thus misguess which path to do more
> on.  (I don't know many details of this.  For example, does it do
> anything at all on paths predicted to be not taken?)  Losses from this
> are usually described as branch mispredictions.  They might cost 20
> (50? 100?) cycles after taking about 2 cycles to actually check |z|
> (2 cycles pipelined but more like <length of pipe> + 8 in real time,
> and it is the latter time that you lose by backing out).
>
> The only sure way to avoid branch mispredictions is to not have any,
> and catrig is too complicated for that.

Yes, but I did a time test.  And in my case the test was almost always 
failing.

>
>> On the other hand let me go through the code and see what happens when
>> |x| is small or |y| is small.  There are actually specific formulas
>> that work well in these two cases, and they are probably not that much
>> slower than the formulas I decided to remove.  And when you chase
>> through all the logic and "if" statements, you may find that you
>> didn't use up a whole bunch of time for these very special cases of
>> |z| small - most of the extra time merely being the decisions invoked
>> by the "if" statements.
>
> But all general cases end up going through an extern function like
> acos() or atan2(), and just calling another function is a significant
> overhead.  When |z| is small, the arg(s) to the other function will
> probably be a special case for it (e.g., acos(small)).  The other
> function should optimize this and not take as long as an average call.
> However, since it is special, it may cause branch mispredictions for
> other uses of the function.

I understand what you are saying.  I guess it just seems to me that the 
"proper" way to do it is to make the C compiler really awesome and do 
this for you.  (Doesn't the Intel compiler try to embed functions inline 
if it knows it will speed things up?)

>> Furthermore, casinh etc are not commonly used functions.  Putting huge
>> amounts of effort looking at special cases to speed it up a little
>> somehow feels wrong to me.  In fact, if the programmer knows that he
>> will be wanting casinh, and evaluated very fast, then he should be
>> motivated enough to try out using z in the case when |z| is small, and
>> see if that really speeds things up.

Well, if casinh goes 20% slower, you're not going to be testing too many 
fewer cases.

> True.  Now I mainly want it to be fast so that I can test more cases.

I understand.  But putting those special cases into casinh offends my 
sense of taste.


From owner-freebsd-numerics@FreeBSD.ORG  Sun Sep 16 21:00:19 2012
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 9A7E7106564A
	for <freebsd-numerics@freebsd.org>;
	Sun, 16 Sep 2012 21:00:19 +0000 (UTC)
	(envelope-from stephen@missouri.edu)
Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu
	[128.206.184.213])
	by mx1.freebsd.org (Postfix) with ESMTP id F33068FC0A
	for <freebsd-numerics@freebsd.org>;
	Sun, 16 Sep 2012 21:00:06 +0000 (UTC)
Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213])
	by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id
	q8GL05St065122; Sun, 16 Sep 2012 16:00:05 -0500 (CDT)
	(envelope-from stephen@missouri.edu)
Message-ID: <50563DD5.4060303@missouri.edu>
Date: Sun, 16 Sep 2012 16:00:05 -0500
From: Stephen Montgomery-Smith <stephen@missouri.edu>
User-Agent: Mozilla/5.0 (X11; Linux i686;
	rv:15.0) Gecko/20120827 Thunderbird/15.0
MIME-Version: 1.0
To: Bruce Evans <brde@optusnet.com.au>
References: <5017111E.6060003@missouri.edu>
	<20120814201105.T934@besplex.bde.org>
	<502A780B.2010106@missouri.edu>
	<20120815223631.N1751@besplex.bde.org>
	<502C0CF8.8040003@missouri.edu>
	<20120906221028.O1542@besplex.bde.org>
	<5048D00B.8010401@missouri.edu> <504D3CCD.2050006@missouri.edu>
	<504FF726.9060001@missouri.edu>
	<20120912191556.F1078@besplex.bde.org>
	<20120912225847.J1771@besplex.bde.org>
	<50511B40.3070009@missouri.edu>
	<20120913204808.T1964@besplex.bde.org>
	<5051F59C.6000603@missouri.edu>
	<20120914014208.I2862@besplex.bde.org>
	<50526050.2070303@missouri.edu>
	<20120914212403.H1983@besplex.bde.org>
	<50538E28.6050400@missouri.edu>
	<20120915231032.C2669@besplex.bde.org>
	<50548E15.3010405@missouri.edu> <5054C027.2040008@missouri.edu>
	<5054C200.7090307@missouri.edu>
	<20120916041132.D6344@besplex.bde.org>
	<50553424.2080902@missouri.edu>
	<20120916134730.Y957@besplex.bde.org>
	<5055ECA8.2080008@missouri.edu>
	<20120917022614.R2943@besplex.bde.org>
	<20120917041848.F3504@besplex.bde.org>
In-Reply-To: <20120917041848.F3504@besplex.bde.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-numerics@freebsd.org
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
	<freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 16 Sep 2012 21:00:19 -0000

On 09/16/2012 02:53 PM, Bruce Evans wrote:
> On Mon, 17 Sep 2012, Bruce Evans wrote:
>
>> On Sun, 16 Sep 2012, Stephen Montgomery-Smith wrote:
>>
>>> On 09/16/2012 12:14 AM, Bruce Evans wrote:
>>>> On Sat, 15 Sep 2012, Stephen Montgomery-Smith wrote:
>>>>
>>>>> One more thing I would like an opinion on.
>>>>>
>>>>> In my code I check for |z| being small, and then use the
>>>>> approximations:
>>>>> casinh(z) = z
>>>>> cacos(z) = Pi - z
>>>>
>>>> Actually Pi/2 - z.
>>>>
>>>>> catanh(z) = z

>>> So all things being said and done, I am going to remove the use of
>>> these approximations.

>
> It gives the expected pessimizations, and unexpected accuracy improvements
> and unimprovements.  On amd64:

I got unexpected accuracy improvements as well!  I thought it might just 
be a coincidence, so I ignored it.

> @ @@ -313,5 +323,6 @@
> @              return (cpackf(copysignf(0, x), y+y));
> @          if (isinf(y))
> @ -            return (cpackf(copysignf(0, x), copysignf(m_pi_2, y)));
> @ +            return (cpackf(copysignf(0, x),
> @ +                copysignf(m_pi_2 + tiny, y)));
> @          if (x == 0)
> @              return (cpackf(x, y+y));
> @ @@ -320,9 +331,16 @@
> @
> @      if (isinf(x) || isinf(y))
> @ -        return (cpackf(copysignf(0, x), copysignf(m_pi_2, y)));
> @ +        return (cpackf(copysignf(0, x), copysignf(m_pi_2 + tiny, y)));
> @
> @      if (ax > RECIP_EPSILON || ay > RECIP_EPSILON)
> @          if ((int)(1+tiny)==1)
> @ -            return (cpackf(copysignf(real_part_reciprocal(ax, ay), x), copysignf(m_pi_2, y)));
> @ +            return (cpackf(
> @ +                copysignf(real_part_reciprocal(ax, ay), x),
> @ +                copysignf(m_pi_2 + tiny, y)));
> @ +
> @ +    /* XXX the numbers are related to sqrt(6 * FLT_EPSILON). */
> @ +    if (ax < 2048 * FLT_EPSILON && ay < 2048 * FLT_EPSILON)
> @ +        if ((int)ax==0 && (int)ay==0)
> @ +            return (z);
> @
> @      if (ax == 1 && ay < FLT_EPSILON) {

I implemented all the m_pi_2 + tiny changes.

Let me still ponder the |z| being small issue.  Or you can put that code 
back in when it is committed.


From owner-freebsd-numerics@FreeBSD.ORG  Mon Sep 17 17:15:57 2012
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 2D5DE1065670
	for <freebsd-numerics@freebsd.org>;
	Mon, 17 Sep 2012 17:15:57 +0000 (UTC)
	(envelope-from brde@optusnet.com.au)
Received: from mail02.syd.optusnet.com.au (mail02.syd.optusnet.com.au
	[211.29.132.183])
	by mx1.freebsd.org (Postfix) with ESMTP id A52638FC17
	for <freebsd-numerics@freebsd.org>;
	Mon, 17 Sep 2012 17:15:56 +0000 (UTC)
Received: from c122-106-157-84.carlnfd1.nsw.optusnet.com.au
	(c122-106-157-84.carlnfd1.nsw.optusnet.com.au [122.106.157.84])
	by mail02.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id
	q8HH70PM016952
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Tue, 18 Sep 2012 03:15:53 +1000
Date: Tue, 18 Sep 2012 03:07:00 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Stephen Montgomery-Smith <stephen@missouri.edu>
In-Reply-To: <50563C57.60806@missouri.edu>
Message-ID: <20120918012459.V5094@besplex.bde.org>
References: <5017111E.6060003@missouri.edu> <502C0CF8.8040003@missouri.edu>
	<20120906221028.O1542@besplex.bde.org> <5048D00B.8010401@missouri.edu>
	<504D3CCD.2050006@missouri.edu> <504FF726.9060001@missouri.edu>
	<20120912191556.F1078@besplex.bde.org>
	<20120912225847.J1771@besplex.bde.org>
	<50511B40.3070009@missouri.edu> <20120913204808.T1964@besplex.bde.org>
	<5051F59C.6000603@missouri.edu> <20120914014208.I2862@besplex.bde.org>
	<50526050.2070303@missouri.edu> <20120914212403.H1983@besplex.bde.org>
	<50538E28.6050400@missouri.edu> <20120915231032.C2669@besplex.bde.org>
	<50548E15.3010405@missouri.edu> <5054C027.2040008@missouri.edu>
	<5054C200.7090307@missouri.edu> <20120916041132.D6344@besplex.bde.org>
	<50553424.2080902@missouri.edu> <20120916134730.Y957@besplex.bde.org>
	<5055ECA8.2080008@missouri.edu> <20120917022614.R2943@besplex.bde.org>
	<50562213.9020400@missouri.edu> <20120917060116.G3825@besplex.bde.org>
	<50563C57.60806@missouri.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-numerics@freebsd.org
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
	<freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 17 Sep 2012 17:15:57 -0000

On Sun, 16 Sep 2012, Stephen Montgomery-Smith wrote:

> On 09/16/2012 03:29 PM, Bruce Evans wrote:
>> On Sun, 16 Sep 2012, Stephen Montgomery-Smith wrote:
>> ...
>> The only sure way to avoid branch mispredictions is to not have any,
>> and catrig is too complicated for that.
>
> Yes, but I did a time test.  And in my case the test was almost always 
> failing.

I test different data, with an over-emphasis on exceptional cases :-).

>>> On the other hand let me go through the code and see what happens when
>>> |x| is small or |y| is small.  There are actually specific formulas
>>> that work well in these two cases, and they are probably not that much
>>> slower than the formulas I decided to remove.  And when you chase

I checked a few cases and didn't see any problems, but noticed some
more things that could be handled by general code, giving the following
minor optimizations (only done for float precision).

>>> through all the logic and "if" statements, you may find that you
>>> didn't use up a whole bunch of time for these very special cases of
>>> |z| small - most of the extra time merely being the decisions invoked
>>> by the "if" statements.

Branch prediction is working very well, but I would prefer not to stress
it unnecessarily.  The data in my tests is also too uniformly ordered to
stress the branch prediction.

@ --- catrigf.c~	2012-09-17 02:05:43.000000000 +0000
@ +++ catrigf.c	2012-09-17 15:21:59.560420000 +0000
@ @@ -157,12 +157,19 @@
@  	}
@ 
@ -	if (ax > RECIP_EPSILON || ay > RECIP_EPSILON)
@ -		if (isinf(x) || isinf(y) || (int)(1+tiny)==1) {
@ -			if (signbit(x) == 0)
@ -				w = clog_for_large_values(z) + m_ln2;
@ -			else
@ -				w = clog_for_large_values(-z) + m_ln2;
@ -			return (cpackf(copysignf(crealf(w), x), copysignf(cimagf(w), y)));
@ -		}
@ +	if (ax > RECIP_EPSILON || ay > RECIP_EPSILON) {
@ +		/* clog...() will raise inexact unless x or y is infinite */
@ +		if (signbit(x) == 0)
@ +			w = clog_for_large_values(z) + m_ln2;
@ +		else
@ +			w = clog_for_large_values(-z) + m_ln2;
@ +		return (cpackf(copysignf(crealf(w), x), copysignf(cimagf(w), y)));
@ +	}

Trust the general code (clog()) to raise inexact appropriately.

A previous version of this raised inexact by adding `tiny' to w in the
correct order.  crealf(w) is large or infinite, so the expression
(crealf(w) + tiny + m_ln2) has the same value as (crealf(w) + m_ln2) and
raises inexact iff crealf(w) != +Inf.  But this addition is unnecessary.

@ +
@ +#if 0
@ +	/* XXX the numbers are related to sqrt(6 * FLT_EPSILON). */
@ +	if (ax < 2048 * FLT_EPSILON && ay < 2048 * FLT_EPSILON)
@ +		if ((int)ax==0 && (int)ay==0)
@ +			return (z);
@ +#endif

Previous optimization turned off for debugging.

@ 
@  	do_hard_work(ax, ay, &rx, &B_is_usable, &B, &sqrt_A2my2, &new_y);
@ @@ -205,13 +212,19 @@
@  	}
@ 
@ -	if (ax > RECIP_EPSILON || ay > RECIP_EPSILON)
@ -		if (isinf(x) || isinf(y) || (int)(1+tiny)==1) {
@ -			w = clog_for_large_values(z);
@ -			rx = fabsf(cimagf(w));
@ -			ry = crealf(w) + m_ln2;
@ -			if (sy == 0)
@ -				ry = -ry;
@ -			return (cpackf(rx, ry));
@ -		}
@ +	if (ax > RECIP_EPSILON || ay > RECIP_EPSILON) {
@ +		/* clog...() will raise inexact unless x or y is infinite */
@ +		w = clog_for_large_values(z);
@ +		rx = fabsf(cimagf(w));
@ +		ry = crealf(w) + m_ln2;
@ +		if (sy == 0)
@ +			ry = -ry;
@ +		return (cpackf(rx, ry));
@ +	}

As above.

@ +
@ +#if 0
@ +	/* XXX the number for ay is related to sqrt(6 * FLT_EPSILON). */
@ +	if (ax < FLT_EPSILON / 8 && ay < 2048 * FLT_EPSILON)
@ +		return (cpackf(m_pi_2 + tiny, -y));
@ +#endif

Not quite the previous optimization turned off for debugging.  It now
raises inexact unconditionally by adding tiny to m_pi_2.  This seems
to actually be a minor pessimization, but I prefer it since it takes
less code.  The version using (int)(1+tiny) has the advantage that its
result is not normally used, while the above is often used; the above
does an extra operation in the often-used path.

@ 
@  	do_hard_work(ay, ax, &ry, &B_is_usable, &B, &sqrt_A2mx2, &new_x);
@ @@ -321,30 +334,28 @@
@  	}
@ 
@ -	if (isinf(x) || isinf(y))
@ -		return (cpackf(copysignf(0, x), copysignf(m_pi_2 + tiny, y)));
@ +	/* Raise inexact unless z == 0; return for z == 0 as a side effect. */
@ +	if ((x == 0 && y == 0) || (int)(1 + tiny) != 1)
@ +		return (z);

Larger optimizations only done for catanhf():

First, the above removes the special code for handling infinities.  These
will be handled by the "large" case later.

Second, it raises inexact for the one remaining case (z == 0) where the
result is exact (all the other exact cases involve NaNs.  Note that cases
involving Infs return m_pi_2 for the imaginary part, so they are never
exact).

This patch doesn't show the removal of the code for raising inexact in
sum_squares().

cacos*() and casin*() should benefit even more from an up-front raising
of inexact, since do_hard_work() has 7 magic statements to raise inexact
where sum_squares has only 1.

@ 
@  	if (ax > RECIP_EPSILON || ay > RECIP_EPSILON)
@ -		if ((int)(1+tiny)==1)
@ -			return (cpackf(copysignf(real_part_reciprocal(ax, ay), x), copysignf(m_pi_2, y)));
@ +		return (cpackf(copysignf(real_part_reciprocal(ax, ay), x), copysignf(m_pi_2, y)));

Depend on inexact being raised up-front.

There are no magic expressions (int)(1+tiny) left except the new up-front
one.  There are still not-so-magic expressions (m_pi_2 + tiny).  BTW,
most or all of the recent fixes to use the latter expressions don't
have a comment about raising inexact in catrig.c, while most or all
older expressions for setting inexact do have such a comment.

A previous version of this optimization raised inexact by adding tiny
to m_pi_2.

@ +
@ +	/* XXX the numbers are related to sqrt(6 * FLT_EPSILON). */
@ +	if (ax < 2048 * FLT_EPSILON && ay < 2048 * FLT_EPSILON)
@ +		return (z);

Previous optimization not turned off for debugging.  It is simpler now
that it can depend on inexact being raised up-front.

@ 
@  	if (ax == 1 && ay < FLT_EPSILON) {
@ -		if ((int)ay==0) {
@ -			if ( ilogbf(ay) > FLT_MIN_EXP)
@ -				rx = - logf(ay/2) / 2;
@ -			else
@ -				rx = - (logf(ay) - m_ln2) / 2;
@ -		}
@ +		if (ilogbf(ay) > FLT_MIN_EXP)
@ +			rx = - logf(ay/2) / 2;
@ +		else
@ +			rx = - (logf(ay) - m_ln2) / 2;

Depend on inexact being raised up-front.

A previous version of this optimization depended instead on logf() raising
inexact appropriately (since the arg is never 1, the result is always
inexact).

@  	} else
@  		rx = log1pf(4*ax / sum_squares(ax-1, ay)) / 4;
@ 
@ -	if (ax == 1) {
@ -		if (ay==0)
@ -			ry = 0;
@ -		else
@ -			ry = atan2f(2, -ay) / 2;
@ -	} else if (ay < FOUR_SQRT_MIN) {
@ -		if ((int)ay==0)
@ -			ry = atan2f(2*ay, (1-ax)*(1+ax)) / 2;
@ -	} else
@ +	if (ax == 1)
@ +		ry = atan2f(2, ay) / 2;
@ +	else if (ay < FOUR_SQRT_MIN)
@ +		ry = atan2f(2*ay, (1-ax)*(1+ax)) / 2;
@ +	else
@  		ry = atan2f(2*ay, (1-ax)*(1+ax) - ay*ay) / 2;
@

Remove the special case for (ax == 1, ay == 0).  The general case gives
the same result.  The correctness of this probably depends on the sign
considerations for the next change, and it pointed me to that change (here
the old code seems to use +0 when y == 0 but -ay otherwise).

Remove negation of ay for ax == 1.  The sign will be copied into the result
later for all cases, so it doesn't matter in the arg.  I didn't check the
branch cut details for this, but runtime tests passed.

Since the sign doesn't matter, we could pass y instead of ay.

I don't understand the threshold of FOUR_SQRT_MIN.  ay*ay starts
underflowing at SQRT_MIN.  FOUR_SQRT_MIN seems to work, and has
efficiency advantages.  But large multiples of FOUR_SQRT_MIN also
seem to work, and have larger efficiency advantages.

...  I now understand what the threshold should be.  You have
filtered out ax == 1.  This makes 1 - ax*ax at least ~2*EPSILON, so 
ay*ay can be dropped if ay is less than sqrt(2*EPSILON*EPSILON) *
2**-GUARD_DIGITS = EPSILON * 2**-5 say.  SQRT_MIN is way smaller
than that, so FOUR_SQRT_MIN works too.  We should use a larger
threshold for efficiency, or avoid the special case for ax == 1.
Testing shows that this analysis is off by a factor of about
sqrt(EPSILON), since a threshold of EPSILON * 2**7 is optimal.
The optimization made no difference to speed; it is just an
optimization for understanding.  Maybe the special case for ax == 1
can be avoided, or folded together with the same special case for
evaluation of the real part.  This special case is similar to the
one in clog(), but easier.
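
Whether ay*ay can be dropped for a given (ax, ay) is easy to test
directly.  The following is a sketch only; the helper name
ay2_is_negligible() is made up, and the volatile temporaries are there
to force each step to be rounded to float.

```c
#include <assert.h>
#include <float.h>

/*
 * Sketch only: returns 1 if subtracting ay*ay from (1-ax)*(1+ax)
 * leaves the float result unchanged, i.e. the term can be dropped.
 */
static int
ay2_is_negligible(float ax, float ay)
{
	volatile float prod, full;

	assert(ax >= 0 && ax != 1 && ay >= 0);
	prod = (1 - ax) * (1 + ax);
	full = prod - ay * ay;
	return (full == prod);
}
```

Scanning ay over a range of thresholds for ax close to 1 would be one
way to compare the EPSILON * 2**-5 estimate with what testing shows.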

Further optimization: in sum_squares(), y is always ay >= 0, so there
is no need to apply fabs*() to it.  I think the compiler does this
optimization.  It can see that y == ay via the inline.

BTW, do_hard_work() is usually not inlined, so the compiler wouldn't
be able to do such optimizations.  However, declaring it as
__always_inline didn't improve the speed.

Bruce

From owner-freebsd-numerics@FreeBSD.ORG  Mon Sep 17 22:50:28 2012
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id C951F106564A
	for <freebsd-numerics@freebsd.org>;
	Mon, 17 Sep 2012 22:50:28 +0000 (UTC)
	(envelope-from stephen@missouri.edu)
Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu
	[128.206.184.213])
	by mx1.freebsd.org (Postfix) with ESMTP id 57FDC8FC0A
	for <freebsd-numerics@freebsd.org>;
	Mon, 17 Sep 2012 22:50:28 +0000 (UTC)
Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213])
	by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id
	q8HMoQkw083746; Mon, 17 Sep 2012 17:50:26 -0500 (CDT)
	(envelope-from stephen@missouri.edu)
Message-ID: <5057A932.3000603@missouri.edu>
Date: Mon, 17 Sep 2012 17:50:26 -0500
From: Stephen Montgomery-Smith <stephen@missouri.edu>
User-Agent: Mozilla/5.0 (X11; Linux i686;
	rv:15.0) Gecko/20120827 Thunderbird/15.0
MIME-Version: 1.0
To: Bruce Evans <brde@optusnet.com.au>
References: <5017111E.6060003@missouri.edu> <502C0CF8.8040003@missouri.edu>
	<20120906221028.O1542@besplex.bde.org>
	<5048D00B.8010401@missouri.edu> <504D3CCD.2050006@missouri.edu>
	<504FF726.9060001@missouri.edu>
	<20120912191556.F1078@besplex.bde.org>
	<20120912225847.J1771@besplex.bde.org>
	<50511B40.3070009@missouri.edu>
	<20120913204808.T1964@besplex.bde.org>
	<5051F59C.6000603@missouri.edu>
	<20120914014208.I2862@besplex.bde.org>
	<50526050.2070303@missouri.edu>
	<20120914212403.H1983@besplex.bde.org>
	<50538E28.6050400@missouri.edu>
	<20120915231032.C2669@besplex.bde.org>
	<50548E15.3010405@missouri.edu> <5054C027.2040008@missouri.edu>
	<5054C200.7090307@missouri.edu>
	<20120916041132.D6344@besplex.bde.org>
	<50553424.2080902@missouri.edu>
	<20120916134730.Y957@besplex.bde.org>
	<5055ECA8.2080008@missouri.edu>
	<20120917022614.R2943@besplex.bde.org>
	<50562213.9020400@missouri.edu>
	<20120917060116.G3825@besplex.bde.org>
	<50563C57.60806@missouri.edu>
	<20120918012459.V5094@besplex.bde.org>
In-Reply-To: <20120918012459.V5094@besplex.bde.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-numerics@freebsd.org
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
	<freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 17 Sep 2012 22:50:28 -0000

OK, I am struggling a bit with the latest suggestions.

First, I have completely removed all the code related to when |z| is 
small.  I have just lost it all.  So I didn't perform any changes 
related to that code.  If you want me to put it back with appropriate 
"#if 0", can you email those code segments back to me?

On 09/17/2012 12:07 PM, Bruce Evans wrote:

> @ @@ -321,30 +334,28 @@
> @      }
> @ @ -    if (isinf(x) || isinf(y))
> @ -        return (cpackf(copysignf(0, x), copysignf(m_pi_2 + tiny, y)));
> @ +    /* Raise inexact unless z == 0; return for z == 0 as a side effect. */
> @ +    if ((x == 0 && y == 0) || (int)(1 + tiny) != 1)
> @ +        return (z);

I'm not too sure where this code is meant to be.  It looks like it 
should be part of testing |z| small, but it seems to be placed where |z| 
is large.  When |z| is large, z=0 will never happen.

> cacos*() and casin*() should benefit even more from an up-front raising
> of inexact, since do_hard_work() has 7 magic statements to raise inexact
> where sum_squares has only 1.

Where is the code that raises inexact up-front?

> There are no magic expressions (int)(1+tiny) left except the new up-front
> one.  There are still not-so- magic expressions (m_pi_2 + tiny).  BTW,
> most or all of the recent fixes to use the latter expressions don't
> have a comment about raising inexact in catrig.c, while most or all
> older expressions for setting inexact do have such a comment.

I put the comments in.


> Previous optimization not turned off for debugging.  It is simpler now
> that it can depend on inexact being raised up-front.

Ditto.  Which code turns on inexact up front?

> @      } else
> @          rx = log1pf(4*ax / sum_squares(ax-1, ay)) / 4;
> @ @ -    if (ax == 1) {
> @ -        if (ay==0)
> @ -            ry = 0;
> @ -        else
> @ -            ry = atan2f(2, -ay) / 2;
> @ -    } else if (ay < FOUR_SQRT_MIN) {
> @ -        if ((int)ay==0)
> @ -            ry = atan2f(2*ay, (1-ax)*(1+ax)) / 2;
> @ -    } else
> @ +    if (ax == 1)
> @ +        ry = atan2f(2, ay) / 2;
> @ +    else if (ay < FOUR_SQRT_MIN)
> @ +        ry = atan2f(2*ay, (1-ax)*(1+ax)) / 2;
> @ +    else
> @          ry = atan2f(2*ay, (1-ax)*(1+ax) - ay*ay) / 2;
> @
>
> Remove the special case for (ax == 1, ay == 0).  The general case gives
> the same result.

I don't think your code works.  It should be ry = atan2f(2, -ay) / 2, 
not ry = atan2f(2, ay) / 2.

In your tests, you should include cases where x or y is equal or close 
to 1.  These are important special cases that I think your test code is 
very unlikely to hit.  These are difficult edge cases for all the 
arc-trig functions.

> Remove negation of ay for ax == 1.  The sign will be copied into the result
> later for all cases, so it doesn't matter in the arg.  I didn't check the
> branch cut details for this, but runtime tests passed.

See above.

> ...  I now understand what the threshold should be.  You have
> filtered out ax == 1.  This makes 1 - ax*ax at least ~2*EPSILON, so
> ay*ay can be dropped if ay is less than sqrt(2*EPSILON*EPSILON) *
> 2**-GUARD_DIGITS = EPSILON * 2**-5 say.  SQRT_MIN is way smaller
> than that, so FOUR_SQRT_MIN works too.  We should use a larger
> threshold for efficiency, or avoid the special case for ax == 1.
> Testing shows that this analysis is off by a factor of about
> sqrt(EPSILON), since a threshold of EPSILON * 2**7 is optimal.
> The optimization made no difference to speed; it is just an
> optimization for understanding.  Maybe the special case for ax == 1
> can be avoided, or folded together with the same special case for
> evaluation of the real part.  This special case is similar to the
> one in clog(), but easier.

This was one of the clever ideas in the paper by Hull et al, which I 
only understood recently.  Their code was closer to your approach, I think.

Let me think about what you wrote some more.

>
> Further optimization: in sum_squares(), y is always ay >= 0, so there
> is no need to apply fabs*() to it.  I think the compiler does this
> optimization.  It can see that y == ay via the inline.

Well spotted.


From owner-freebsd-numerics@FreeBSD.ORG  Mon Sep 17 22:59:52 2012
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 5BB3A10657E6
	for <freebsd-numerics@freebsd.org>;
	Mon, 17 Sep 2012 22:59:50 +0000 (UTC)
	(envelope-from stephen@missouri.edu)
Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu
	[128.206.184.213])
	by mx1.freebsd.org (Postfix) with ESMTP id 7934F8FC18
	for <freebsd-numerics@freebsd.org>;
	Mon, 17 Sep 2012 22:59:48 +0000 (UTC)
Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213])
	by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id
	q8HMxlQ8084614 for <freebsd-numerics@freebsd.org>;
	Mon, 17 Sep 2012 17:59:47 -0500 (CDT)
	(envelope-from stephen@missouri.edu)
Message-ID: <5057AB63.7040606@missouri.edu>
Date: Mon, 17 Sep 2012 17:59:47 -0500
From: Stephen Montgomery-Smith <stephen@missouri.edu>
User-Agent: Mozilla/5.0 (X11; Linux i686;
	rv:15.0) Gecko/20120827 Thunderbird/15.0
MIME-Version: 1.0
To: freebsd-numerics@freebsd.org
References: <5017111E.6060003@missouri.edu>
	<20120906221028.O1542@besplex.bde.org>
	<5048D00B.8010401@missouri.edu> <504D3CCD.2050006@missouri.edu>
	<504FF726.9060001@missouri.edu>
	<20120912191556.F1078@besplex.bde.org>
	<20120912225847.J1771@besplex.bde.org>
	<50511B40.3070009@missouri.edu>
	<20120913204808.T1964@besplex.bde.org>
	<5051F59C.6000603@missouri.edu>
	<20120914014208.I2862@besplex.bde.org>
	<50526050.2070303@missouri.edu>
	<20120914212403.H1983@besplex.bde.org>
	<50538E28.6050400@missouri.edu>
	<20120915231032.C2669@besplex.bde.org>
	<50548E15.3010405@missouri.edu> <5054C027.2040008@missouri.edu>
	<5054C200.7090307@missouri.edu>
	<20120916041132.D6344@besplex.bde.org>
	<50553424.2080902@missouri.edu>
	<20120916134730.Y957@besplex.bde.org>
	<5055ECA8.2080008@missouri.edu>
	<20120917022614.R2943@besplex.bde.org>
	<50562213.9020400@missouri.edu>
	<20120917060116.G3825@besplex.bde.org>
	<50563C57.60806@missouri.edu>
	<20120918012459.V5094@besplex.bde.org>
	<5057A932.3000603@missouri.edu>
In-Reply-To: <5057A932.3000603@missouri.edu>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
	<freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 17 Sep 2012 22:59:52 -0000

On 09/17/2012 05:50 PM, Stephen Montgomery-Smith wrote:

> In your tests, you should include cases where x or y is equal or close
> to 1.  These are important special cases that I think your test code is
> very unlikely to hit.  These are difficult edge cases for all the
> arc-trig functions.

And just to be sure, x or y is equal or close to -1 as well.


From owner-freebsd-numerics@FreeBSD.ORG  Tue Sep 18 04:02:21 2012
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 935B41065672
	for <freebsd-numerics@freebsd.org>;
	Tue, 18 Sep 2012 04:02:21 +0000 (UTC)
	(envelope-from stephen@missouri.edu)
Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu
	[128.206.184.213])
	by mx1.freebsd.org (Postfix) with ESMTP id 3335E8FC12
	for <freebsd-numerics@freebsd.org>;
	Tue, 18 Sep 2012 04:02:20 +0000 (UTC)
Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213])
	by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id
	q8I42JSt005318 for <freebsd-numerics@freebsd.org>;
	Mon, 17 Sep 2012 23:02:19 -0500 (CDT)
	(envelope-from stephen@missouri.edu)
Message-ID: <5057F24B.7020605@missouri.edu>
Date: Mon, 17 Sep 2012 23:02:19 -0500
From: Stephen Montgomery-Smith <stephen@missouri.edu>
User-Agent: Mozilla/5.0 (X11; Linux i686;
	rv:15.0) Gecko/20120827 Thunderbird/15.0
MIME-Version: 1.0
To: freebsd-numerics@freebsd.org
References: <5017111E.6060003@missouri.edu>
	<20120906221028.O1542@besplex.bde.org>
	<5048D00B.8010401@missouri.edu> <504D3CCD.2050006@missouri.edu>
	<504FF726.9060001@missouri.edu>
	<20120912191556.F1078@besplex.bde.org>
	<20120912225847.J1771@besplex.bde.org>
	<50511B40.3070009@missouri.edu>
	<20120913204808.T1964@besplex.bde.org>
	<5051F59C.6000603@missouri.edu>
	<20120914014208.I2862@besplex.bde.org>
	<50526050.2070303@missouri.edu>
	<20120914212403.H1983@besplex.bde.org>
	<50538E28.6050400@missouri.edu>
	<20120915231032.C2669@besplex.bde.org>
	<50548E15.3010405@missouri.edu> <5054C027.2040008@missouri.edu>
	<5054C200.7090307@missouri.edu>
	<20120916041132.D6344@besplex.bde.org>
	<50553424.2080902@missouri.edu>
	<20120916134730.Y957@besplex.bde.org>
	<5055ECA8.2080008@missouri.edu>
	<20120917022614.R2943@besplex.bde.org>
	<50562213.9020400@missouri.edu>
	<20120917060116.G3825@besplex.bde.org>
	<50563C57.60806@missouri.edu>
	<20120918012459.V5094@besplex.bde.org>
	<5057A932.3000603@missouri.edu>
In-Reply-To: <5057A932.3000603@missouri.edu>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
	<freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 18 Sep 2012 04:02:21 -0000

On 09/17/2012 05:50 PM, Stephen Montgomery-Smith wrote:

>> cacos*() and casin*() should benefit even more from an up-front raising
>> of inexact, since do_hard_work() has 7 magic statements to raise inexact
>> where sum_squares has only 1.
>
> Where is the code that raises inexact up-front?

I don't see why having code upfront will make it much more efficient. 
Out of these 7 magic statements, at most two of them will be called.

But I could put something like

if ((x == 0 && y == 0) || (x == 0 && y == 1) || (int)(1+tiny) == 1) {
........
at the beginning of do_hard_work and catanh.


>> ...  I now understand what the threshold should be.  You have
>> filtered out ax == 1.  This makes 1 - ax*ax at least ~2*EPSILON, so
>> ay*ay can be dropped if ay is less than sqrt(2*EPSILON*EPSILON) *
>> 2**-GUARD_DIGITS = EPSILON * 2**-5 say.  SQRT_MIN is way smaller
>> than that, so FOUR_SQRT_MIN works too.  We should use a larger
>> threshold for efficiency, or avoid the special case for ax == 1.
>> Testing shows that this analysis is off by a factor of about
>> sqrt(EPSILON), since a threshold of EPSILON * 2**7 is optimal.
>> The optimization made no difference to speed; it is just an
>> optimization for understanding.  Maybe the special case for ax == 1
>> can be avoided, or folded together with the same special case for
>> evaluation of the real part.  This special case is similar to the
>> one in clog(), but easier.

OK, I think I made changes more or less according to your suggestions.

In the case A < A_crossover, a threshold like 
DBL_EPSILON*DBL_EPSILON/128 is required.  I think the one you set is too 
large.  It is important that sqrt(x) + x/2 is sqrt(x).  (Again I don't 
think your tests would pick this up, because you need to do a lot of 
tests where y is close to or equal to 1.)


From owner-freebsd-numerics@FreeBSD.ORG  Tue Sep 18 06:19:26 2012
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id C0E491065673
	for <freebsd-numerics@freebsd.org>;
	Tue, 18 Sep 2012 06:19:26 +0000 (UTC)
	(envelope-from brde@optusnet.com.au)
Received: from mail06.syd.optusnet.com.au (mail06.syd.optusnet.com.au
	[211.29.132.187])
	by mx1.freebsd.org (Postfix) with ESMTP id 451178FC16
	for <freebsd-numerics@freebsd.org>;
	Tue, 18 Sep 2012 06:19:25 +0000 (UTC)
Received: from c122-106-157-84.carlnfd1.nsw.optusnet.com.au
	(c122-106-157-84.carlnfd1.nsw.optusnet.com.au [122.106.157.84])
	by mail06.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id
	q8I6JMOn031723
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Tue, 18 Sep 2012 16:19:23 +1000
Date: Tue, 18 Sep 2012 16:19:22 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Stephen Montgomery-Smith <stephen@missouri.edu>
In-Reply-To: <5057A932.3000603@missouri.edu>
Message-ID: <20120918150551.Y820@besplex.bde.org>
References: <5017111E.6060003@missouri.edu> <5048D00B.8010401@missouri.edu>
	<504D3CCD.2050006@missouri.edu> <504FF726.9060001@missouri.edu>
	<20120912191556.F1078@besplex.bde.org>
	<20120912225847.J1771@besplex.bde.org>
	<50511B40.3070009@missouri.edu> <20120913204808.T1964@besplex.bde.org>
	<5051F59C.6000603@missouri.edu> <20120914014208.I2862@besplex.bde.org>
	<50526050.2070303@missouri.edu> <20120914212403.H1983@besplex.bde.org>
	<50538E28.6050400@missouri.edu> <20120915231032.C2669@besplex.bde.org>
	<50548E15.3010405@missouri.edu> <5054C027.2040008@missouri.edu>
	<5054C200.7090307@missouri.edu> <20120916041132.D6344@besplex.bde.org>
	<50553424.2080902@missouri.edu> <20120916134730.Y957@besplex.bde.org>
	<5055ECA8.2080008@missouri.edu> <20120917022614.R2943@besplex.bde.org>
	<50562213.9020400@missouri.edu> <20120917060116.G3825@besplex.bde.org>
	<50563C57.60806@missouri.edu> <20120918012459.V5094@besplex.bde.org>
	<5057A932.3000603@missouri.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-numerics@freebsd.org
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
	<freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 18 Sep 2012 06:19:26 -0000

On Mon, 17 Sep 2012, Stephen Montgomery-Smith wrote:

> OK, I am struggling a bit with the latest suggestions.
>
> First, I have completely removed all the code related to when |z| is small. 
> I have just lost it all.  So I didn't perform any changes related to that 
> code.  If you want me to put it back with appropriate "#if 0", can you email 
> those code segments back to me?

I have not :-).  It is also quoted in the mail archives.  Will send it in
my next patch.

> On 09/17/2012 12:07 PM, Bruce Evans wrote:
>
>> @ @@ -321,30 +334,28 @@
>> @      }
>> @ @ -    if (isinf(x) || isinf(y))
>> @ -        return (cpackf(copysignf(0, x), copysignf(m_pi_2 + tiny, y)));
>> @ +    /* Raise inexact unless z == 0; return for z == 0 as a side effect. */
>> @ +    if ((x == 0 && y == 0) || (int)(1 + tiny) != 1)
>> @ +        return (z);
>
> I'm not too sure where this code is meant to be.  It looks like it should be 
> part of testing |z| small, but it seems to be placed where |z| is large. 
> When |z| is large, z=0 will never happen.

As its comment says, this raises inexact [up front] unless z == 0, and
[so that the test is not optimized away] returns [z] for z == 0 as a side
effect.

This is for any z that has not been previously classified (mainly ones
with NaNs).  Its operation is:
- z == 0: find x == 0 and y == 0 and return z
- z != 0: find !(x == 0 and y == 0); evaluate (int)(1 + tiny) != 1 and
   find it to be false while raising inexact; don't return z, but continue
   with inexact set.

>> cacos*() and casin*() should benefit even more from an up-front raising
>> of inexact, since do_hard_work() has 7 magic statements to raise inexact
>> where sum_squares has only 1.
>
> Where is the code that raises inexact up-front?

As quoted above.

Later I tried removing all the 7 magic statements in do_hard_work(), without
adding code like the above.  This made very little difference.  OTOH, the
above code costs a cycle or 2, and removing the additions in all magic
expressions (m_pi_2 + tiny) gave a small improvement.  I think I can
explain this, and it shows that we should be using fenv (optimized) and
not "optimizing" using magic context-sensitive expressions.  The point
is that the code that sets inexact can run in parallel so that the main
path can run faster because it doesn't involve an operation like
(m_pi_2 + tiny).

Good ways for raising exceptions:

FE_INEXACT:
Your `if ((int)(1 + tiny) == 1) return (foo);' works well.  This depends
on the branch being predictable.  But returning is inconvenient.  I
hope `if ((int)(1 + tiny) == 1) volatile_variable = 0;' works similarly.
This could be in feraiseexcept(FE_INEXACT) or in a more primitive
raise_inexact() (the latter is less verbose and easier to optimize).
Then if you actually want to return, the code would be something like
{ raise_inexact(); return (m_pi_2); }.  Better, the branch can be
avoided using something like `volatile_variable = (int)(1 + tiny);'.
Better still, write this in asm and just do `(int)0.5;' (use asm only
to avoid the optimizer removing this).  Possibly better still, use a
purer FP operation since conversion to int can be slow.

In the above, we don't really want a special case for z == 0; we need
branches to classify this case but should skip the return since returns
use branch resources too.  The code becomes:

 	if (x != 0 || y != 0)
 		raise_inexact();	/* No comment. */
#if !THIS_CODE_INTENTIONALLY_LEFT_OUT
 	else
 		return (z);		/* No comment. */
#endif

FE_OVERFLOW:
Instead of evaluating huge*huge and returning it, use something like
`volatile_variable = huge*huge; return (INFINITY);'.  This is more
natural than the above, so it takes at most 1 more instruction
(assignment to variable with no dependents) and thus loses little
even without parallelism.  The version written in asm can also avoid
the assignment (just evaluate huge*huge) and lose nothing.

FE_UNDERFLOW:
Instead of evaluating tiny*tiny and returning it, use something like
`volatile_variable = tiny*tiny; return (0);'.  I hope there is a
variation on this that raises underflow at full speed (underflowing
cases are very slow on core2 although not on Athlon64; hopefully they
are not so slow if the result of tiny*tiny is not used).

The last 2 raisings will also fix the i386 bug that huge*huge and
tiny*tiny don't actually raise overflow or underflow or return infinity
or 0, since they are evaluated in extra exponent range.  It takes
conversion to double or float to trigger the exception and to give
the correct value.  When we try to raise exceptions in a parallel
code path, we are hoping for related asynchronicities in the setting
of the exception flags so that the usual case where the exception
flags are not tested soon proceeds at full speed.  It is unclear
how compilers and CPUs produce the ordering of operations required
by the abstract C machines -- I think a strict interpretation of
`volatile' would require synchronizing everything for every access
to a volatile variable, but that would be too slow and I've never
seen compilers doing much synchronization.

>> @      } else
>> @          rx = log1pf(4*ax / sum_squares(ax-1, ay)) / 4;
>> @ @ -    if (ax == 1) {
>> @ -        if (ay==0)
>> @ -            ry = 0;
>> @ -        else
>> @ -            ry = atan2f(2, -ay) / 2;
>> @ -    } else if (ay < FOUR_SQRT_MIN) {
>> @ -        if ((int)ay==0)
>> @ -            ry = atan2f(2*ay, (1-ax)*(1+ax)) / 2;
>> @ -    } else
>> @ +    if (ax == 1)
>> @ +        ry = atan2f(2, ay) / 2;
>> @ +    else if (ay < FOUR_SQRT_MIN)
>> @ +        ry = atan2f(2*ay, (1-ax)*(1+ax)) / 2;
>> @ +    else
>> @          ry = atan2f(2*ay, (1-ax)*(1+ax) - ay*ay) / 2;
>> @
>> 
>> Remove the special case for (ax == 1, ay == 0).  The general case gives
>> the same result.
>
> I don't think your code works.  It should be ry = atan2f(2, -ay) / 2, not ry 
> = atan2f(2, ay) / 2.

Only logically.  As I explained, the negation makes no difference to the
result, but of course takes longer, so I removed it.

> In your tests, you should include cases where x or y is equal or close to 1. 
> These are important special cases that I think your test code is very 
> unlikely to hit.  These are difficult edge cases for all the arc-trig 
> functions.

Hmm, I only did this carefully for clog().  I happen to have been testing
lots of cases ctanh(1 + tiny, tiny') where tiny* is really tiny (denormal)
with either sign, but not so many cases ctanh(1, tiny') and no (?) cases
of ctanh(1 + tiny, +-0).

>> Remove negation of ay for ax == 1.  The sign will be copied into the result
>> later for all cases, so it doesn't matter in the arg.  I didn't check the
>> branch cut details for this, but runtime tests passed.
>
> See above.

I might have missed this.  But if the sign matters, why do you set ry = +0
for catanh on both sides of 1 + I*(+-0)?

Bruce

From owner-freebsd-numerics@FreeBSD.ORG  Tue Sep 18 06:41:57 2012
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 85CA6106566B
	for <freebsd-numerics@freebsd.org>;
	Tue, 18 Sep 2012 06:41:57 +0000 (UTC)
	(envelope-from brde@optusnet.com.au)
Received: from mail13.syd.optusnet.com.au (mail13.syd.optusnet.com.au
	[211.29.132.194])
	by mx1.freebsd.org (Postfix) with ESMTP id 128AC8FC12
	for <freebsd-numerics@freebsd.org>;
	Tue, 18 Sep 2012 06:41:56 +0000 (UTC)
Received: from c122-106-157-84.carlnfd1.nsw.optusnet.com.au
	(c122-106-157-84.carlnfd1.nsw.optusnet.com.au [122.106.157.84])
	by mail13.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id
	q8I6frVm020384
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Tue, 18 Sep 2012 16:41:54 +1000
Date: Tue, 18 Sep 2012 16:41:53 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Stephen Montgomery-Smith <stephen@missouri.edu>
In-Reply-To: <5057F24B.7020605@missouri.edu>
Message-ID: <20120918162105.U991@besplex.bde.org>
References: <5017111E.6060003@missouri.edu> <504D3CCD.2050006@missouri.edu>
	<504FF726.9060001@missouri.edu> <20120912191556.F1078@besplex.bde.org>
	<20120912225847.J1771@besplex.bde.org> <50511B40.3070009@missouri.edu>
	<20120913204808.T1964@besplex.bde.org> <5051F59C.6000603@missouri.edu>
	<20120914014208.I2862@besplex.bde.org> <50526050.2070303@missouri.edu>
	<20120914212403.H1983@besplex.bde.org> <50538E28.6050400@missouri.edu>
	<20120915231032.C2669@besplex.bde.org> <50548E15.3010405@missouri.edu>
	<5054C027.2040008@missouri.edu> <5054C200.7090307@missouri.edu>
	<20120916041132.D6344@besplex.bde.org> <50553424.2080902@missouri.edu>
	<20120916134730.Y957@besplex.bde.org> <5055ECA8.2080008@missouri.edu>
	<20120917022614.R2943@besplex.bde.org> <50562213.9020400@missouri.edu>
	<20120917060116.G3825@besplex.bde.org> <50563C57.60806@missouri.edu>
	<20120918012459.V5094@besplex.bde.org> <5057A932.3000603@missouri.edu>
	<5057F24B.7020605@missouri.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-numerics@freebsd.org
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
	<freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 18 Sep 2012 06:41:57 -0000

On Mon, 17 Sep 2012, Stephen Montgomery-Smith wrote:

> On 09/17/2012 05:50 PM, Stephen Montgomery-Smith wrote:
>
>>> cacos*() and casin*() should benefit even more from an up-front raising
>>> of inexact, since do_hard_work() has 7 magic statements to raise inexact
>>> where sum_squares has only 1.
>> 
>> Where is the code that raises inexact up-front?
>
> I don't see why having code upfront will make it much more efficient. Out of 
> these 7 magic statements, at most two of them will be called.

7 instead of 1 is more complex, and uses more branch prediction resources.

> But I could put something like
>
> if ((x == 0 && y == 0) || (x == 0 && y == 1) || (int)(1+tiny) == 1) {
> ........
> at the beginning of do_hard_work and catanh.

I put this, without (x == 0 && y == 1), in catanh().  Having
(x == 0 && y == 1) in it is a bug, since catanh(I) = I*Pi/4 with inexact.
However, I seem to have missed (x == 1 && y == 0) -> catanh(1) = +Inf
without inexact.

do_hard_work() is too late for this, since the following earlier cases
also need it:
- large x or y (neither infinite)
- small x and y (not both 0, except for acosh(0) = Pi/2 with inexact, etc.)
   (the lost optimization).

> OK, I think I made changes more or less according to your suggestions.
>
> In the case A < A_crossover, a threshold like DBL_EPSILON*DBL_EPSILON/128 is 
> required.  I think the one you set is too large.  It is important that 
> sqrt(x) + x/2 is sqrt(x).  (Again I don't think your tests would pick this 
> up, because you need to do a lot of tests where y is close to or equal to 1.)

Well, there were 2**12 of them with y = 1+denormal, with 7 different
denormals, but none with y = 1.  Will test some more.  (I'm testing
denormals with a few 1's in their lower bits since experience shows
that values with 0's in their lower bits are too special.  For example,
ax*ax is exact if enough lower bits in ax are 0.)

Bruce

From owner-freebsd-numerics@FreeBSD.ORG  Tue Sep 18 14:15:49 2012
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id AD6ED106566B
	for <freebsd-numerics@FreeBSD.org>;
	Tue, 18 Sep 2012 14:15:49 +0000 (UTC)
	(envelope-from brde@optusnet.com.au)
Received: from mail13.syd.optusnet.com.au (mail13.syd.optusnet.com.au
	[211.29.132.194])
	by mx1.freebsd.org (Postfix) with ESMTP id 09B938FC19
	for <freebsd-numerics@FreeBSD.org>;
	Tue, 18 Sep 2012 14:15:48 +0000 (UTC)
Received: from c122-106-157-84.carlnfd1.nsw.optusnet.com.au
	(c122-106-157-84.carlnfd1.nsw.optusnet.com.au [122.106.157.84])
	by mail13.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id
	q8IEFeHX012910
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Wed, 19 Sep 2012 00:15:46 +1000
Date: Wed, 19 Sep 2012 00:15:40 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Bruce Evans <brde@optusnet.com.au>
In-Reply-To: <20120918162105.U991@besplex.bde.org>
Message-ID: <20120918232850.N2144@besplex.bde.org>
References: <5017111E.6060003@missouri.edu> <504FF726.9060001@missouri.edu>
	<20120912191556.F1078@besplex.bde.org>
	<20120912225847.J1771@besplex.bde.org>
	<50511B40.3070009@missouri.edu> <20120913204808.T1964@besplex.bde.org>
	<5051F59C.6000603@missouri.edu> <20120914014208.I2862@besplex.bde.org>
	<50526050.2070303@missouri.edu> <20120914212403.H1983@besplex.bde.org>
	<50538E28.6050400@missouri.edu> <20120915231032.C2669@besplex.bde.org>
	<50548E15.3010405@missouri.edu> <5054C027.2040008@missouri.edu>
	<5054C200.7090307@missouri.edu> <20120916041132.D6344@besplex.bde.org>
	<50553424.2080902@missouri.edu> <20120916134730.Y957@besplex.bde.org>
	<5055ECA8.2080008@missouri.edu> <20120917022614.R2943@besplex.bde.org>
	<50562213.9020400@missouri.edu> <20120917060116.G3825@besplex.bde.org>
	<50563C57.60806@missouri.edu> <20120918012459.V5094@besplex.bde.org>
	<5057A932.3000603@missouri.edu> <5057F24B.7020605@missouri.edu>
	<20120918162105.U991@besplex.bde.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: Stephen Montgomery-Smith <stephen@missouri.edu>,
	freebsd-numerics@FreeBSD.org
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
	<freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 18 Sep 2012 14:15:50 -0000

On Tue, 18 Sep 2012, Bruce Evans wrote:

> On Mon, 17 Sep 2012, Stephen Montgomery-Smith wrote:
> ...
>> But I could put something like
>> 
>> if ((x == 0 && y == 0) || (x == 0 && y == 1) || (int)(1+tiny) == 1) {
>> ........
>> at the beginning of do_hard_work and catanh.
>
> I put this, without (x == 0 && y == 1), in catanh().  Having
> (x == 0 && y == 1) in it is a bug, since catanh(I) = I*Pi/4 with inexact.
> However, I seem to have missed (x == 1 && y == 0) -> catanh(1) = +Inf
> without inexact.

I also broke cases with infinities...

>> In the case A < A_crossover, a threshold like DBL_EPSILON*DBL_EPSILON/128 
>> is required.  I think the one you set is too large.  It is important that 
>> sqrt(x) + x/2 is sqrt(x).  (Again I don't think your tests would pick this 
>> up, because you need to do a lot of tests where y is close to or equal to 
>> 1.)
>
> Well, there were 2**12 of them with y = 1+denormal, with 7 different
> denormals, but none with y = 1.  Will test some more.  (I'm testing

... and many cases with ax or ay precisely 1, due to not testing these.

Fixing these and finding a few more simplifications and optimizations
gives:

@ diff -u2 catrig.c~ catrig.c
@ --- catrig.c~	2012-09-18 03:42:32.000000000 +0000
@ +++ catrig.c	2012-09-18 11:53:28.017331000 +0000
@ @@ -261,4 +261,6 @@
@ 
@  /*
@ + * casinh(z) = z + O(|z|^3)   as z -> 0
@ + *
@   * casinh(z) = sign(x)*clog(sign(x)*z) + O(1/|z|^2)   as z -> infinity
@   * The above formula works for the imaginary part as well, because

Part of restoring your old optimization -- fix the comments.

@ @@ -297,4 +299,5 @@
@ 
@  	if (ax > RECIP_EPSILON || ay > RECIP_EPSILON) {
@ +		/* clog...() will raise inexact unless x or y is infinite. */
@  		if (signbit(x) == 0)
@  			w = clog_for_large_values(z) + m_ln2;

Further minimal changes for the double precision case -- try to document
all magic for setting inexact.

@ @@ -304,4 +307,8 @@
@  	}
@ 
@ +	if (ax < DBL_EPSILON && ay < DBL_EPSILON)
@ +		if ((int)ax==0 && (int)ay==0) /* raise inexact */
@ +			return (z);
@ +
@  	do_hard_work(ax, ay, &rx, &B_is_usable, &B, &sqrt_A2my2, &new_y);
@  	if (B_is_usable)

Your old optimization.  Not done as completely as in float precision.

@ @@ -328,4 +335,6 @@
@   * close to 1.
@   *
@ + * cacos(z) = PI/2 - z + O(|z|^3)   as z -> 0
@ + *
@   * cacos(z) = -sign(y)*I*clog(z) + O(1/|z|^2)   as z -> infinity
@   * The above formula works for the real part as well, because
@ @@ -355,6 +364,6 @@
@  		if (isinf(y))
@  			return (cpack(x+x, -y));
@ -		/* cacos(0 + I*NaN) = PI/2 + I*NaN */
@ -		if (x == 0) return (cpack(m_pi_2 + tiny, y+y)); /* raise inexact */
@ +		/* cacos(0 + I*NaN) = PI/2 + I*NaN with inexact */
@ +		if (x == 0) return (cpack(m_pi_2 + tiny, y+y));
@  		/*
@  		 * All other cases involving NaN return NaN + I*NaN.

Comments about exceptions raised should be together with comments about
values returned, at least if we can't attach them closely to the magic
that raises them.

@ @@ -366,4 +375,5 @@
@ 
@  	if (ax > RECIP_EPSILON || ay > RECIP_EPSILON) {
@ +		/* clog...() will raise inexact unless x or y is infinite. */
@  		w = clog_for_large_values(z);
@  		rx = fabs(cimag(w));
@ @@ -374,4 +384,7 @@
@  	}
@ 
@ +	if (ax < DBL_EPSILON && ay < DBL_EPSILON)
@ +		return (cpack(m_pi_2 + tiny - x, -y));	/* raise inexact */
@ +
@  	do_hard_work(ay, ax, &ry, &B_is_usable, &B, &sqrt_A2mx2, &new_x);
@  	if (B_is_usable) {

Your old optimization, updated to raise inexact by adding tiny.  Not
updated to avoid subtracting x -- see the float precision code for that.
The fixed comment above goes with this subtraction -- without it, the
approximation would be Pi/2 + O(z).

@ @@ -517,4 +530,6 @@
@   *             + I * atan2(2*y, (1-x)*(1+x)-y*y) / 2
@   *
@ + * catanh(z) = z + O(|z|^3)   as z -> 0
@ + *
@   * catanh(z) = 1/z + sign(y)*I*PI/2 + O(1/|z|^3)   as z -> infinity
@   * The above formula works for the real part as well, because
@ @@ -536,7 +551,7 @@
@  		if (isinf(x))
@  			return (cpack(copysign(0, x), y+y));
@ -		/* catanh(NaN + I*+-Inf) = sign(NaN)0 + I*+-PI/2 */
@ +		/* catanh(NaN + I*+-Inf) = sign(NaN)0 + I*+-PI/2 with inexact */
@  		if (isinf(y))
@ -			return (cpack(copysign(0, x), copysign(m_pi_2 + tiny, y))); /* raise inexact */
@ +			return (cpack(copysign(0, x), copysign(m_pi_2 + tiny, y)));
@  		/* catanh(+-0 + I*NaN) = +-0 + I*NaN */
@  		if (x == 0)
@ @@ -550,4 +565,5 @@
@  	}
@ 
@ +	/* XXX should improve following comments. */
@  	/* If x or y is inf, then catanh(x + I*y) = 0 + I*sign(y)*PI/2 */
@  	if (isinf(x) || isinf(y))

Here there was no space for commenting about the exceptions.  The sign of
the 0 is not documented, but there is no space for that either.  It is
in the code as a copysign().  So is the sign for PI/2, but that is in the
comment too.

@ @@ -557,6 +573,10 @@
@  		return (cpack(copysign(real_part_reciprocal(ax, ay), x), copysign(m_pi_2 + tiny, y))); /* raise inexact */
@ 
@ +	if (ax < DBL_EPSILON && ay < DBL_EPSILON)
@ +		if ((int)ax==0 && (int)ay==0) /* raise inexact */
@ +			return (z);
@ +

Your old optimization.  It also improves accuracy significantly -- see the
float precision comment.

@  	if (ax == 1 && ay < DBL_EPSILON) {
@ -		if ((int)ay==0) { /* raise inexact */
@ +		if (1) { /* inexact will be raised by log() */
@  			/*
@  			 * If ay == 0, divide-by-zero will be (correctly)

I didn't re-indent this.

@ diff -u2 catrigf.c~ catrigf.c
@ --- catrigf.c~	2012-09-18 03:42:35.000000000 +0000
@ +++ catrigf.c	2012-09-18 13:23:20.972740000 +0000
@ @@ -165,4 +165,9 @@
@  	}
@ 
@ +	/* XXX the numbers are related to sqrt(6 * FLT_EPSILON). */
@ +	if (ax < 2048 * FLT_EPSILON && ay < 2048 * FLT_EPSILON)
@ +		if ((int)ax==0 && (int)ay==0)
@ +			return (z);
@ +
@  	do_hard_work(ax, ay, &rx, &B_is_usable, &B, &sqrt_A2my2, &new_y);
@  	if (B_is_usable)

Old optimization refined.

@ @@ -213,4 +218,8 @@
@  	}
@ 
@ +	/* XXX the number for ay is related to sqrt(6 * FLT_EPSILON). */
@ +	if (ax < FLT_EPSILON / 8 && ay < 2048 * FLT_EPSILON)
@ +		return (cpackf(m_pi_2 + tiny, -y));
@ +
@  	do_hard_work(ay, ax, &ry, &B_is_usable, &B, &sqrt_A2mx2, &new_x);
@  	if (B_is_usable) {

Old optimization refined.

@ @@ -277,7 +286,5 @@
@  {
@  	if (y < SQRT_MIN)
@ -		if ((int)y==0)
@ -			return (x*x);
@ -
@ +		return (x*x);
@  	return (x*x + y*y);
@  }

Depend on the up-front setting of inexact here in sum_squares().
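
A minimal sketch of the simplified sum_squares() shape, assuming a threshold of 2**-63 (the square root of FLT_MIN = 2**-126). The names sum_squares_sketch and SQRT_MIN_SKETCH are illustrative; the real catrigf.c defines its own SQRT_MIN.

```c
/* Assumed threshold: below this, y*y would underflow below FLT_MIN. */
#define SQRT_MIN_SKETCH	0x1p-63f

/*
 * Return x*x + y*y, skipping the y*y term when it would underflow.
 * No (int)y cast is needed if inexact has already been raised by
 * the caller.
 */
static float
sum_squares_sketch(float x, float y)
{
	if (y < SQRT_MIN_SKETCH)
		return (x * x);
	return (x * x + y * y);
}
```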

@ @@ -288,4 +295,6 @@
@  	int ex, ey;
@ 
@ +	if (isinf(x) || isinf(y))
@ +		return (0);
@  	if (y == 0) return (1/x);
@  	if (x == 0) return (x/y/y);

Handle special case for infinities here in real_part_reciprocal() instead
of the general code.

@ @@ -319,29 +328,60 @@
@  	}
@ 
@ -	if (isinf(x) || isinf(y))
@ -		return (cpackf(copysignf(0, x), copysignf(m_pi_2 + tiny, y)));

Move this into real_part_reciprocal(), so that the classification of
infinities is only done if x or y is large.  This is a minor optimization.
My previous version removed this, but was broken since
real_part_reciprocal() somehow doesn't naturally return 0 for infinities.
It does rather slow scaling steps.

@ +	/*
@ +	 * Handle the annoying special case +-1 + I*+-0, and collaterally
@ +	 * handle the not-so-special case y == 0.  C99 specifies that
@ +	 * catanh(+-1 + I*+-0) = +-Inf + I*+-0 instead of the limiting
@ +	 * value +-Inf + I*+-PI/2 since it wants y == 0 to give the same
@ +	 * result as the real atanh() (at least for y == +0).  The special
@ +	 * behaviour for +-1 + I*+-0 begins with classifying it to avoid
@ +	 * raising inexact for it.  Make the classification as simple and
@ +	 * short as possible (except for this comment about it) and ensure
@ +	 * identical results by calling the real atanh() for all non-NaN x
@ +	 * when y == 0.  This turns out to be significantly more accurate.
@ +	 *
@ +	 * TODO: move this before the NaN classification and let atanh()
@ +	 * handle NaN x too.  Make a similar special case for x == 0 to
@ +	 * improve accuracy; this takes no extra lines of code since it
@ +	 * removes the need to handle x == 0 under the NaN classification.
@ +	 */
@ +	if (y == 0)
@ +		return (cpackf(atanh(x), y));

See the comment.

@ +
@ +	/* Raise inexact unless z == 0; return for z == 0 as a side effect. */
@ +	if ((x == 0 && y == 0) || (int)(1 + tiny) != 1)
@ +		return (z);

z == 0 is the only remaining case that shouldn't raise inexact.

@ 
@  	if (ax > RECIP_EPSILON || ay > RECIP_EPSILON)
@ -		return (cpackf(copysignf(real_part_reciprocal(ax, ay), x), copysignf(m_pi_2 + tiny, y)));
@ +		return (cpackf(copysignf(real_part_reciprocal(ax, ay), x), copysignf(m_pi_2, y)));

Inexact was raised up-front.

@ 
@ -	if (ax == 1 && ay < FLT_EPSILON) {
@ -		if ((int)ay==0) {
@ -			if ( ilogbf(ay) > FLT_MIN_EXP)
@ -				rx = - logf(ay/2) / 2;
@ -			else
@ -				rx = - (logf(ay) - m_ln2) / 2;
@ -		}
@ -	} else

Inexact was raised up front, and will also be raised by logf().

ilogbf() is rather slow, though it is now a builtin.  There is no need
to use it here:
- the condition can be written as (ay > 2 * FLT_MIN_EXP)
- the expression with m_ln2 is accurate enough, so there is no need for
   the condition.  I thought I tested this assertion and found no difference
   at all in accuracy, but now I can't see why it is true.  This case is
   fundamentally quite accurate -- within about 1 ulp for the logf() part --
   and subtracting m_ln2 will only lose about 0.5 ulps of accuracy (since
   |logf(FLT_EPSILON)| dominates m_ln2), so it is not near the worst case
   for accuracy, but the loss of accuracy is not null.

@ +	/* XXX the numbers are related to sqrt(6 * FLT_EPSILON). */
@ +	if (ax < 2048 * FLT_EPSILON && ay < 2048 * FLT_EPSILON)
@ +		return (z);

Old optimization.

Not just an optimization -- see below about accuracy.

@ +
@ +	if (ax == 1 && ay < FLT_EPSILON)
@ +		rx = - (logf(ay) - m_ln2) / 2;

Above with the extra code for accuracy removed.

@ +	else
@ +		/*
@ +		 * If we didn't handle y == 0 earlier, the following for
@ +		 * y == 0 would reduce to log1pf(4*ax/(ax-1)**2)) / 4.
@ +		 * This is significantly less accurate than the expression
@ +		 * log1pf(ax+ax+(ax*ax)*x/(1-ax)) / 2 used by atanhf() for
@ +		 * ax < 0.5, though not much less accurate than the expr
@ +		 * log1pf(ax+ax/(1-ax)) / 2 used by atanhf() for 0.5 <=
@ +		 * ax <= 1.  Can we do better with ay mixed in?
@ +		 *
@ +		 * This is also significantly less accurate than the
@ +		 * expression (z) used above when ax < 2048 * FLT_EPSILON
@ +		 * and y == 0.  Presumably similarly when y is small but
@ +		 * nonzero.  This explains why the above optimization also
@ +		 * improves accuracy.
@ +		 */
@  		rx = log1pf(4*ax / sum_squares(ax-1, ay)) / 4;

See the comment.

@ 
@ -	if (ax == 1) {
@ -		if (ay == 0)
@ -			ry = 0;
@ -		else
@ -			ry = atan2f(2, -ay) / 2;
@ -	} else if (ay < FOUR_SQRT_MIN) {
@ -		if ((int)ay==0)
@ -			ry = atan2f(2*ay, (1-ax)*(1+ax)) / 2;
@ -	} else
@ +	if (ax == 1)
@ +		ry = atan2(2, -ay) / 2;
@ +	else if (ay < FLT_EPSILON * 128)
@ +		ry = atan2f(2*ay, (1-ax)*(1+ax)) / 2;
@ +	else
@  		ry = atan2f(2*ay, (1-ax)*(1+ax) - ay*ay) / 2;
@

This is the part that I completely broke before.  Now it does:
- special case for ay == 0 moved above
- don't remove the minus sign in -ay
- use up-front setting of inexact
- the expanded threshold still works for me.

@ diff -u2 catrigl.c~ catrigl.c
@ --- catrigl.c~	2012-09-18 03:42:37.000000000 +0000
@ +++ catrigl.c	2012-09-18 11:50:35.362160000 +0000
@ @@ -180,4 +180,8 @@
@  	}
@ 
@ +	if (ax < LDBL_EPSILON && ay < LDBL_EPSILON)
@ +		if ((int)ax==0 && (int)ay==0)
@ +			return (z);
@ +
@  	do_hard_work(ax, ay, &rx, &B_is_usable, &B, &sqrt_A2my2, &new_y);
@  	if (B_is_usable)
@ @@ -228,4 +232,7 @@
@  	}
@ 
@ +	if (ax < LDBL_EPSILON && ay < LDBL_EPSILON)
@ +		return (cpackl(m_pi_2 + tiny - x, -y));
@ +
@  	do_hard_work(ay, ax, &ry, &B_is_usable, &B, &sqrt_A2mx2, &new_x);
@  	if (B_is_usable) {
@ @@ -340,4 +347,8 @@
@  		return (cpackl(copysignl(real_part_reciprocal(ax, ay), x), copysignl(m_pi_2 + tiny, y)));
@ 
@ +	if (ax < LDBL_EPSILON && ay < LDBL_EPSILON)
@ +		if ((int)ax==0 && (int)ay==0)
@ +			return (z);
@ +
@  	if (ax == 1 && ay < LDBL_EPSILON) {
@  		if ((int)ay==0) {

catrigl.c only has changes to restore the old optimizations.

Bruce

From owner-freebsd-numerics@FreeBSD.ORG  Tue Sep 18 15:19:16 2012
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 94BAC106566C
	for <freebsd-numerics@FreeBSD.org>;
	Tue, 18 Sep 2012 15:19:16 +0000 (UTC)
	(envelope-from brde@optusnet.com.au)
Received: from mail02.syd.optusnet.com.au (mail02.syd.optusnet.com.au
	[211.29.132.183])
	by mx1.freebsd.org (Postfix) with ESMTP id 2285B8FC12
	for <freebsd-numerics@FreeBSD.org>;
	Tue, 18 Sep 2012 15:19:15 +0000 (UTC)
Received: from c122-106-157-84.carlnfd1.nsw.optusnet.com.au
	(c122-106-157-84.carlnfd1.nsw.optusnet.com.au [122.106.157.84])
	by mail02.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id
	q8IFJC2E008798
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Wed, 19 Sep 2012 01:19:14 +1000
Date: Wed, 19 Sep 2012 01:19:12 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Bruce Evans <brde@optusnet.com.au>
In-Reply-To: <20120918232850.N2144@besplex.bde.org>
Message-ID: <20120919010613.T2493@besplex.bde.org>
References: <5017111E.6060003@missouri.edu>
	<20120912225847.J1771@besplex.bde.org>
	<50511B40.3070009@missouri.edu> <20120913204808.T1964@besplex.bde.org>
	<5051F59C.6000603@missouri.edu> <20120914014208.I2862@besplex.bde.org>
	<50526050.2070303@missouri.edu> <20120914212403.H1983@besplex.bde.org>
	<50538E28.6050400@missouri.edu> <20120915231032.C2669@besplex.bde.org>
	<50548E15.3010405@missouri.edu> <5054C027.2040008@missouri.edu>
	<5054C200.7090307@missouri.edu> <20120916041132.D6344@besplex.bde.org>
	<50553424.2080902@missouri.edu> <20120916134730.Y957@besplex.bde.org>
	<5055ECA8.2080008@missouri.edu> <20120917022614.R2943@besplex.bde.org>
	<50562213.9020400@missouri.edu> <20120917060116.G3825@besplex.bde.org>
	<50563C57.60806@missouri.edu> <20120918012459.V5094@besplex.bde.org>
	<5057A932.3000603@missouri.edu> <5057F24B.7020605@missouri.edu>
	<20120918162105.U991@besplex.bde.org>
	<20120918232850.N2144@besplex.bde.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: Stephen Montgomery-Smith <stephen@missouri.edu>,
	freebsd-numerics@FreeBSD.org
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
	<freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 18 Sep 2012 15:19:16 -0000

On Wed, 19 Sep 2012, I wrote:

> @ +	/*
> @ +	 * Handle the annoying special case +-1 + I*+-0, and collaterally
> @ +	 * handle the not-so-special case y == 0.  C99 specifies that
> @ +	 * catanh(+-1 + I*+-0) = +-Inf + I*+-0 instead of the limiting
> @ +	 * value +-Inf + I*+-PI/2 since it wants y == 0 to give the same
> @ +	 * result as the real atanh() (at least for y == +0).  The special
> @ +	 * behaviour for +-1 + I*+-0 begins with classifying it to avoid
> @ +	 * raising inexact for it.  Make the classification as simple and
> @ +	 * short as possible (except for this comment about it) and ensure
> @ +	 * identical results by calling the real atanh() for all non-NaN x
> @ +	 * when y == 0.  This turns out to be significantly more accurate.
> @ +	 *
> @ +	 * TODO: move this before the NaN classification and let atanh()
> @ +	 * handle NaN x too.  Make a similar special case for x == 0 to
> @ +	 * improve accuracy; this takes no extra lines of code since it
> @ +	 * removes the need to handle x == 0 under the NaN classification.
> @ +	 */
> @ +	if (y == 0)
> @ +		return (cpackf(atanh(x), y));
>
> See the comment.

Duh, this has to be under (y == 0 && ax <= 1) so that the real function
actually applies.

> @ @ -	if (ax == 1 && ay < FLT_EPSILON) {
> @ -		if ((int)ay==0) {
> @ -			if ( ilogbf(ay) > FLT_MIN_EXP)
> @ -				rx = - logf(ay/2) / 2;
> @ -			else
> @ -				rx = - (logf(ay) - m_ln2) / 2;
> @ -		}
> @ -	} else
>
> Inexact was raised up front, and will also be raised by logf().
>
> ilogbf() is rather slow, though it is now a builtin.  There is no need
> to use it here:
> - the condition can be written as (ay > 2 * FLT_MIN_EXP)
> - the expression with m_ln2 is accurate enough, so there is no need for
>  the condition.  I thought I tested this assertion and found no difference
>  at all in accuracy, but now I can't see why it is true.  This case is
>  fundamentally quite accurate -- within about 1 ulp for the logf() part --
>  and subtracting m_ln2 will only lose about 0.5 ulps of accuracy (since
>  |logf(FLT_EPSILON)| dominates m_ln2), so it is not near the worst case
>  for accuracy, but the loss of accuracy is not null.

Now tested.  The increase in inaccuracy is only from ~0.7 ulps to ~0.9 ulps.
This is acceptable.

> @ ...
> @ +
> @ +	if (ax == 1 && ay < FLT_EPSILON)
> @ +		rx = - (logf(ay) - m_ln2) / 2;

However, the outer FLT_EPSILON threshold for the above is too conservative.
It can be increased to FLT_EPSILON**2 without expanding the error to above
0.7 ulps, provided this optimization is not used -- with the expanded
threshold, this optimization expands the error by another 0.2 ulps, to ~1.1
ulps instead of to ~0.9 ulps.  These errors are still in the noise compared
with the worst case error of ~2.6 ulps, but it is good to keep errors
below 1 ulp if this is easy.

Bruce

From owner-freebsd-numerics@FreeBSD.ORG  Wed Sep 19 03:48:42 2012
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id E67F31065670
	for <freebsd-numerics@freebsd.org>;
	Wed, 19 Sep 2012 03:48:42 +0000 (UTC)
	(envelope-from stephen@missouri.edu)
Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu
	[128.206.184.213])
	by mx1.freebsd.org (Postfix) with ESMTP id 9F59D8FC1C
	for <freebsd-numerics@freebsd.org>;
	Wed, 19 Sep 2012 03:48:42 +0000 (UTC)
Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213])
	by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id
	q8J3mYxL036898; Tue, 18 Sep 2012 22:48:35 -0500 (CDT)
	(envelope-from stephen@missouri.edu)
Message-ID: <50594092.6000302@missouri.edu>
Date: Tue, 18 Sep 2012 22:48:34 -0500
From: Stephen Montgomery-Smith <stephen@missouri.edu>
User-Agent: Mozilla/5.0 (X11; Linux i686;
	rv:15.0) Gecko/20120827 Thunderbird/15.0
MIME-Version: 1.0
To: Bruce Evans <brde@optusnet.com.au>
References: <5017111E.6060003@missouri.edu> <5048D00B.8010401@missouri.edu>
	<504D3CCD.2050006@missouri.edu> <504FF726.9060001@missouri.edu>
	<20120912191556.F1078@besplex.bde.org>
	<20120912225847.J1771@besplex.bde.org>
	<50511B40.3070009@missouri.edu>
	<20120913204808.T1964@besplex.bde.org>
	<5051F59C.6000603@missouri.edu>
	<20120914014208.I2862@besplex.bde.org>
	<50526050.2070303@missouri.edu>
	<20120914212403.H1983@besplex.bde.org>
	<50538E28.6050400@missouri.edu>
	<20120915231032.C2669@besplex.bde.org>
	<50548E15.3010405@missouri.edu> <5054C027.2040008@missouri.edu>
	<5054C200.7090307@missouri.edu>
	<20120916041132.D6344@besplex.bde.org>
	<50553424.2080902@missouri.edu>
	<20120916134730.Y957@besplex.bde.org>
	<5055ECA8.2080008@missouri.edu>
	<20120917022614.R2943@besplex.bde.org>
	<50562213.9020400@missouri.edu>
	<20120917060116.G3825@besplex.bde.org>
	<50563C57.60806@missouri.edu>
	<20120918012459.V5094@besplex.bde.org>
	<5057A932.3000603@missouri.edu>
	<20120918150551.Y820@besplex.bde.org>
In-Reply-To: <20120918150551.Y820@besplex.bde.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-numerics@freebsd.org
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
	<freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 19 Sep 2012 03:48:43 -0000

On 09/18/2012 01:19 AM, Bruce Evans wrote:
> On Mon, 17 Sep 2012, Stephen Montgomery-Smith wrote:

>> I don't think your code works.  It should be ry = atan2f(2, -ay) / 2,
>> not ry = atan2f(2, ay) / 2.
>
> Only logically.  As I explained, the negation makes no difference to the
> result, but of course takes longer, so I removed it.

No, they give different results.  atan2(y,x) = Pi - atan2(y,-x) if y is 
positive.


From owner-freebsd-numerics@FreeBSD.ORG  Fri Sep 21 01:41:06 2012
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id B3F611065673
	for <freebsd-numerics@FreeBSD.org>;
	Fri, 21 Sep 2012 01:41:06 +0000 (UTC)
	(envelope-from stephen@missouri.edu)
Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu
	[128.206.184.213])
	by mx1.freebsd.org (Postfix) with ESMTP id 6F3B68FC17
	for <freebsd-numerics@FreeBSD.org>;
	Fri, 21 Sep 2012 01:41:06 +0000 (UTC)
Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213])
	by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id
	q8L1eww2078803; Thu, 20 Sep 2012 20:40:59 -0500 (CDT)
	(envelope-from stephen@missouri.edu)
Message-ID: <505BC5AA.2030604@missouri.edu>
Date: Thu, 20 Sep 2012 20:40:58 -0500
From: Stephen Montgomery-Smith <stephen@missouri.edu>
User-Agent: Mozilla/5.0 (X11; Linux i686;
	rv:15.0) Gecko/20120827 Thunderbird/15.0
MIME-Version: 1.0
To: Bruce Evans <brde@optusnet.com.au>
References: <5017111E.6060003@missouri.edu> <504FF726.9060001@missouri.edu>
	<20120912191556.F1078@besplex.bde.org>
	<20120912225847.J1771@besplex.bde.org>
	<50511B40.3070009@missouri.edu>
	<20120913204808.T1964@besplex.bde.org>
	<5051F59C.6000603@missouri.edu>
	<20120914014208.I2862@besplex.bde.org>
	<50526050.2070303@missouri.edu>
	<20120914212403.H1983@besplex.bde.org>
	<50538E28.6050400@missouri.edu>
	<20120915231032.C2669@besplex.bde.org>
	<50548E15.3010405@missouri.edu> <5054C027.2040008@missouri.edu>
	<5054C200.7090307@missouri.edu>
	<20120916041132.D6344@besplex.bde.org>
	<50553424.2080902@missouri.edu>
	<20120916134730.Y957@besplex.bde.org>
	<5055ECA8.2080008@missouri.edu>
	<20120917022614.R2943@besplex.bde.org>
	<50562213.9020400@missouri.edu>
	<20120917060116.G3825@besplex.bde.org>
	<50563C57.60806@missouri.edu>
	<20120918012459.V5094@besplex.bde.org>
	<5057A932.3000603@missouri.edu> <5057F24B.7020605@missouri.edu>
	<20120918162105.U991@besplex.bde.org>
	<20120918232850.N2144@besplex.bde.org>
In-Reply-To: <20120918232850.N2144@besplex.bde.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-numerics@FreeBSD.org
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
	<freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 21 Sep 2012 01:41:06 -0000

On 09/18/2012 09:15 AM, Bruce Evans wrote:

> @      if (ax == 1 && ay < DBL_EPSILON) {
> @ -        if ((int)ay==0) { /* raise inexact */
> @ +        if (1) { /* inexact will be raised by log() */
> @              /*
> @               * If ay == 0, divide-by-zero will be (correctly)
>
> I didn't re-indent this.

I have put back the old optimizations in catrig.c.  The only change I 
have made so far is that I did re-indent this.

From owner-freebsd-numerics@FreeBSD.ORG  Fri Sep 21 02:50:04 2012
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id A685D106566C
	for <freebsd-numerics@FreeBSD.org>;
	Fri, 21 Sep 2012 02:50:04 +0000 (UTC)
	(envelope-from stephen@missouri.edu)
Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu
	[128.206.184.213])
	by mx1.freebsd.org (Postfix) with ESMTP id 616288FC14
	for <freebsd-numerics@FreeBSD.org>;
	Fri, 21 Sep 2012 02:50:04 +0000 (UTC)
Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213])
	by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id
	q8L2o2gY084873; Thu, 20 Sep 2012 21:50:02 -0500 (CDT)
	(envelope-from stephen@missouri.edu)
Message-ID: <505BD5DA.1070302@missouri.edu>
Date: Thu, 20 Sep 2012 21:50:02 -0500
From: Stephen Montgomery-Smith <stephen@missouri.edu>
User-Agent: Mozilla/5.0 (X11; Linux i686;
	rv:15.0) Gecko/20120827 Thunderbird/15.0
MIME-Version: 1.0
To: Bruce Evans <brde@optusnet.com.au>
References: <5017111E.6060003@missouri.edu> <504FF726.9060001@missouri.edu>
	<20120912191556.F1078@besplex.bde.org>
	<20120912225847.J1771@besplex.bde.org>
	<50511B40.3070009@missouri.edu>
	<20120913204808.T1964@besplex.bde.org>
	<5051F59C.6000603@missouri.edu>
	<20120914014208.I2862@besplex.bde.org>
	<50526050.2070303@missouri.edu>
	<20120914212403.H1983@besplex.bde.org>
	<50538E28.6050400@missouri.edu>
	<20120915231032.C2669@besplex.bde.org>
	<50548E15.3010405@missouri.edu> <5054C027.2040008@missouri.edu>
	<5054C200.7090307@missouri.edu>
	<20120916041132.D6344@besplex.bde.org>
	<50553424.2080902@missouri.edu>
	<20120916134730.Y957@besplex.bde.org>
	<5055ECA8.2080008@missouri.edu>
	<20120917022614.R2943@besplex.bde.org>
	<50562213.9020400@missouri.edu>
	<20120917060116.G3825@besplex.bde.org>
	<50563C57.60806@missouri.edu>
	<20120918012459.V5094@besplex.bde.org>
	<5057A932.3000603@missouri.edu> <5057F24B.7020605@missouri.edu>
	<20120918162105.U991@besplex.bde.org>
	<20120918232850.N2144@besplex.bde.org>
In-Reply-To: <20120918232850.N2144@besplex.bde.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-numerics@FreeBSD.org
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
	<freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 21 Sep 2012 02:50:04 -0000

What I did was to make constants called SQRT_6_EPSILON, etc, and then 
make your suggested optimizations to float also to double and long double.

I also wrote my own atanhl function so that your inexact optimizations 
could be applied to long double as well as double and float.

From owner-freebsd-numerics@FreeBSD.ORG  Fri Sep 21 03:06:31 2012
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 632C0106566C
	for <freebsd-numerics@FreeBSD.org>;
	Fri, 21 Sep 2012 03:06:31 +0000 (UTC)
	(envelope-from stephen@missouri.edu)
Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu
	[128.206.184.213])
	by mx1.freebsd.org (Postfix) with ESMTP id 1D27B8FC16
	for <freebsd-numerics@FreeBSD.org>;
	Fri, 21 Sep 2012 03:06:29 +0000 (UTC)
Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213])
	by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id
	q8L36SZH086228; Thu, 20 Sep 2012 22:06:29 -0500 (CDT)
	(envelope-from stephen@missouri.edu)
Message-ID: <505BD9B4.8020801@missouri.edu>
Date: Thu, 20 Sep 2012 22:06:28 -0500
From: Stephen Montgomery-Smith <stephen@missouri.edu>
User-Agent: Mozilla/5.0 (X11; Linux i686;
	rv:15.0) Gecko/20120827 Thunderbird/15.0
MIME-Version: 1.0
To: Bruce Evans <brde@optusnet.com.au>
References: <5017111E.6060003@missouri.edu>
	<20120912225847.J1771@besplex.bde.org>
	<50511B40.3070009@missouri.edu>
	<20120913204808.T1964@besplex.bde.org>
	<5051F59C.6000603@missouri.edu>
	<20120914014208.I2862@besplex.bde.org>
	<50526050.2070303@missouri.edu>
	<20120914212403.H1983@besplex.bde.org>
	<50538E28.6050400@missouri.edu>
	<20120915231032.C2669@besplex.bde.org>
	<50548E15.3010405@missouri.edu> <5054C027.2040008@missouri.edu>
	<5054C200.7090307@missouri.edu>
	<20120916041132.D6344@besplex.bde.org>
	<50553424.2080902@missouri.edu>
	<20120916134730.Y957@besplex.bde.org>
	<5055ECA8.2080008@missouri.edu>
	<20120917022614.R2943@besplex.bde.org>
	<50562213.9020400@missouri.edu>
	<20120917060116.G3825@besplex.bde.org>
	<50563C57.60806@missouri.edu>
	<20120918012459.V5094@besplex.bde.org>
	<5057A932.3000603@missouri.edu> <5057F24B.7020605@missouri.edu>
	<20120918162105.U991@besplex.bde.org>
	<20120918232850.N2144@besplex.bde.org>
	<20120919010613.T2493@besplex.bde.org>
In-Reply-To: <20120919010613.T2493@besplex.bde.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-numerics@FreeBSD.org
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
	<freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 21 Sep 2012 03:06:31 -0000

I also added inexact optimizations for casinh and cacos.

From owner-freebsd-numerics@FreeBSD.ORG  Fri Sep 21 07:23:39 2012
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 800CE1065728
	for <freebsd-numerics@FreeBSD.org>;
	Fri, 21 Sep 2012 07:23:39 +0000 (UTC)
	(envelope-from brde@optusnet.com.au)
Received: from mail12.syd.optusnet.com.au (mail12.syd.optusnet.com.au
	[211.29.132.193])
	by mx1.freebsd.org (Postfix) with ESMTP id 0D2B78FC1C
	for <freebsd-numerics@FreeBSD.org>;
	Fri, 21 Sep 2012 07:23:38 +0000 (UTC)
Received: from c122-106-157-84.carlnfd1.nsw.optusnet.com.au
	(c122-106-157-84.carlnfd1.nsw.optusnet.com.au [122.106.157.84])
	by mail12.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id
	q8L7NTek006237
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Fri, 21 Sep 2012 17:23:31 +1000
Date: Fri, 21 Sep 2012 17:23:29 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Stephen Montgomery-Smith <stephen@missouri.edu>
In-Reply-To: <505BD5DA.1070302@missouri.edu>
Message-ID: <20120921161532.R945@besplex.bde.org>
References: <5017111E.6060003@missouri.edu>
	<20120912225847.J1771@besplex.bde.org>
	<50511B40.3070009@missouri.edu> <20120913204808.T1964@besplex.bde.org>
	<5051F59C.6000603@missouri.edu> <20120914014208.I2862@besplex.bde.org>
	<50526050.2070303@missouri.edu> <20120914212403.H1983@besplex.bde.org>
	<50538E28.6050400@missouri.edu> <20120915231032.C2669@besplex.bde.org>
	<50548E15.3010405@missouri.edu> <5054C027.2040008@missouri.edu>
	<5054C200.7090307@missouri.edu> <20120916041132.D6344@besplex.bde.org>
	<50553424.2080902@missouri.edu> <20120916134730.Y957@besplex.bde.org>
	<5055ECA8.2080008@missouri.edu> <20120917022614.R2943@besplex.bde.org>
	<50562213.9020400@missouri.edu> <20120917060116.G3825@besplex.bde.org>
	<50563C57.60806@missouri.edu> <20120918012459.V5094@besplex.bde.org>
	<5057A932.3000603@missouri.edu> <5057F24B.7020605@missouri.edu>
	<20120918162105.U991@besplex.bde.org>
	<20120918232850.N2144@besplex.bde.org>
	<505BD5DA.1070302@missouri.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-numerics@FreeBSD.org
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
	<freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 21 Sep 2012 07:23:39 -0000

On Thu, 20 Sep 2012, Stephen Montgomery-Smith wrote:

> What I did was to make constants called SQRT_6_EPSILON, etc, and then make 
> your suggested optimizations to float also to double and long double.

I'm just using SQRT_EPSILON here in all cases.  Partial diffs between
your new version of catrig.c and my unmerged version.

50,51c52
< SQRT_3_EPSILON =	sqrt(3*DBL_EPSILON),
< SQRT_6_EPSILON =	sqrt(6*DBL_EPSILON),
---
> SQRT_EPSILON =		0x1p-27,	/* <= sqrt(DBL_EPSILON) */

Your version depends on sqrt(N*DBL_EPSILON) being a constant enough
expression for it to be evaluated at compile time.  A fairly large
optimization.
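
A minimal sketch of the point about constant expressions, not taken
from catrig.c: C only requires static initializers to accept constant
expressions, and a call to sqrt() is not one, so a hex floating literal
avoids depending on the compiler folding the call.  The helper name
sqrt_epsilon_is_safe() is invented for illustration; it checks the
bound by squaring so that no libm call is needed.

```c
#include <float.h>

/*
 * A hex floating literal is a genuine constant expression, so it is
 * always valid as a static initializer.  sqrt(DBL_EPSILON) is a
 * function call and a compiler is not required to accept it here.
 */
static const double SQRT_EPSILON = 0x1p-27;	/* <= sqrt(DBL_EPSILON) */

/*
 * Hypothetical helper (illustration only): return 1 if the threshold
 * is no larger than sqrt(DBL_EPSILON).  Squaring a power of 2 is
 * exact, so the comparison is exact too.
 */
int
sqrt_epsilon_is_safe(void)
{
	return (SQRT_EPSILON * SQRT_EPSILON <= DBL_EPSILON);
}
```

The same pattern would apply to float and long double with their own
literals; those values are not shown here because the thread does not
give them.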

e_atanh.c uses 2**-28 here.  This is significantly smaller than any
of the above.  It has a large safety margin.  But not after dividing
the above by 8.

e_atanhf.c uses the same 2**-28 here.  This is nonsense.  Properly
translating the 2**-28 to float precision would have given about
2**-14.  Exhaustive testing shows that 2**-13 gives the same results.
sqrtf(FLT_EPSILON) is much larger (2**-11.5).  That has a negative
safety margin -- exhaustive testing shows that 2**-12 loses a little
bit of accuracy compared with 2**-13.
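
The kind of exhaustive test described above could be sketched roughly
as follows.  This is an assumption-laden illustration, not the test
program used in this thread: the reference value is a truncated Taylor
series for atanh() evaluated in double precision (adequate only for
small |x|), and the function names are invented.  It walks every float
bit pattern in a range and counts the inputs for which returning x
itself differs from the reference rounded to float.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a bit pattern as a float without aliasing problems. */
static float
bits_to_float(uint32_t bits)
{
	float f;

	memcpy(&f, &bits, sizeof(f));
	return (f);
}

static uint32_t
float_to_bits(float f)
{
	uint32_t bits;

	memcpy(&bits, &f, sizeof(bits));
	return (bits);
}

/*
 * Count the floats x in [lo, hi) for which the approximation
 * atanh(x) ~= x differs from the reference rounded to float.
 * Assumes 0 < lo <= hi, both finite, so that bit patterns are
 * ordered in the same way as the values.
 */
unsigned long
count_atanhf_mismatches(float lo, float hi)
{
	unsigned long bad = 0;
	uint32_t i, end = float_to_bits(hi);

	for (i = float_to_bits(lo); i < end; i++) {
		double x = bits_to_float(i);
		/* Truncated series; only valid for small |x|. */
		double ref = x + x * x * x / 3 + x * x * x * x * x / 5;

		if ((float)ref != (float)x)
			bad++;
	}
	return (bad);
}
```

Calling count_atanhf_mismatches(0x1p-14f, 0x1p-13f) sweeps one binade
(2**23 values).  Sweeping the neighbouring binades in the same way is
how candidate thresholds such as 2**-13 and 2**-12 could be compared.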

For catanh*(), we have to bound both x and y, and should have a larger
safety margin for both.  Non-exhaustive testing shows that 2**-12 works
OK in float precision.  My previous values had a negative safety margin.
In double precision, sqrt(DBL_EPSILON) is not an integer power of 2,
and the above gives an additional safety margin by rounding down to an
integer power of 2.

304,309c313,315
< ...
< 	if (ax < SQRT_6_EPSILON/8 && ay < SQRT_6_EPSILON/8)
< 		return (z);
---
> 	if (ax < SQRT_EPSILON && ay < SQRT_EPSILON)
> 		if ((int)ax == 0 && (int)ay == 0) /* raise inexact */
> 			return (z);

The divisions by 8 give a larger safety margin than my version.
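
For readers unfamiliar with the (int) cast idiom in the hunk above,
here is a minimal sketch of how it behaves.  The helper name
use_identity_approx() is invented, and the volatile qualifiers are an
assumption added so that a compiler does not discard the conversions.
On common hardware such as x86, converting a nonzero value below 1 to
int raises inexact; C99 itself leaves this unspecified.

```c
/* Threshold from the diff above. */
static const double SQRT_EPSILON = 0x1p-27;

/*
 * Hypothetical helper (illustration only): decide whether the
 * approximation f(z) ~= z may be used for z = x + I*y, raising
 * inexact as a side effect when it may and z != 0.
 */
int
use_identity_approx(double x, double y)
{
	/* volatile keeps the compiler from folding the conversions away */
	volatile double ax = x < 0 ? -x : x;
	volatile double ay = y < 0 ? -y : y;

	if (ax < SQRT_EPSILON && ay < SQRT_EPSILON) {
		/*
		 * Both values are below 1, so both conversions give 0
		 * and the test is always true here.  It is written for
		 * its side effect: a conversion that changes the value
		 * is what sets the inexact flag.
		 */
		if ((int)ax == 0 && (int)ay == 0)
			return (1);
	}
	return (0);
}
```

When a component is exactly 0 its conversion is exact and sets no flag,
which matches the fact that the corresponding part of the result is
exact.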

384,389c390,407
< 	if (ax < DBL_EPSILON/8 && ay < SQRT_6_EPSILON/8)
< 		return (cpack(m_pi_2, -y));
---
> 	if (ax < SQRT_EPSILON && ay < SQRT_EPSILON)
> 		if ((int)ax == 0 && (int)ay == 0)
> 			return (cpack(pio2_hi - (x - pio2_lo), -y));

I restored your z term in the approximation so that I could use the
same threshold for x and y.  This is more accurate and covers more
cases. The approximation is now _better_ than the corresponding one
in acos*() -- they should be using the extra term too.  This has
other subtleties involving rounding of Pi/2 -- see later mail.

580,581c596,598
< 	if (ax < SQRT_3_EPSILON/8 && ay < SQRT_3_EPSILON/8)
< 		return (z);
---
> 	if (ax < SQRT_EPSILON && ay < SQRT_EPSILON)
> 		if ((int)ax == 0 && (int)ay == 0) /* raise inexact */
> 			return (z);
>
> I also wrote my own atanhl function so that your inexact optimizations could 
> be applied to long double as well as double and float.

Hmm, I didn't notice that atanhl() was missing.

I found that atanh[f] uses an inaccurate approximation for small |x|,
so returning atanh*() early for y == 0 and |x| <= 1 breaks not only
optimality of the above approximation for small |z|, but also its
accuracy.

I made a similar real function call to atan() for x == 0 (only
implemented in float precision, and the equivalent for cacos and
casinh() not tried).  Now atanl() is not missing, and atan*(x) is not
inaccurate for small x, so calling this early only breaks the
optimality of the above.

To preserve the optimality, I had to put most of the new special cases
later in the function instead of earlier as planned.  This makes them
less good for avoiding special settings of inexact.  Setting inexact
early is also bad for optimality, so I no longer try to do it.  See the
next mail.

Bruce

From owner-freebsd-numerics@FreeBSD.ORG  Fri Sep 21 08:25:13 2012
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 32125106566B
	for <freebsd-numerics@FreeBSD.org>;
	Fri, 21 Sep 2012 08:25:13 +0000 (UTC)
	(envelope-from brde@optusnet.com.au)
Received: from fallbackmx07.syd.optusnet.com.au
	(fallbackmx07.syd.optusnet.com.au [211.29.132.9])
	by mx1.freebsd.org (Postfix) with ESMTP id A56878FC0C
	for <freebsd-numerics@FreeBSD.org>;
	Fri, 21 Sep 2012 08:25:11 +0000 (UTC)
Received: from mail05.syd.optusnet.com.au (mail05.syd.optusnet.com.au
	[211.29.132.186])
	by fallbackmx07.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id
	q8L8P4KH031489
	for <freebsd-numerics@FreeBSD.org>; Fri, 21 Sep 2012 18:25:04 +1000
Received: from c122-106-157-84.carlnfd1.nsw.optusnet.com.au
	(c122-106-157-84.carlnfd1.nsw.optusnet.com.au [122.106.157.84])
	by mail05.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id
	q8L8OtMR014157
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Fri, 21 Sep 2012 18:24:56 +1000
Date: Fri, 21 Sep 2012 18:24:55 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Stephen Montgomery-Smith <stephen@missouri.edu>
In-Reply-To: <505BD9B4.8020801@missouri.edu>
Message-ID: <20120921172402.W945@besplex.bde.org>
References: <5017111E.6060003@missouri.edu> <50511B40.3070009@missouri.edu>
	<20120913204808.T1964@besplex.bde.org> <5051F59C.6000603@missouri.edu>
	<20120914014208.I2862@besplex.bde.org> <50526050.2070303@missouri.edu>
	<20120914212403.H1983@besplex.bde.org> <50538E28.6050400@missouri.edu>
	<20120915231032.C2669@besplex.bde.org> <50548E15.3010405@missouri.edu>
	<5054C027.2040008@missouri.edu> <5054C200.7090307@missouri.edu>
	<20120916041132.D6344@besplex.bde.org> <50553424.2080902@missouri.edu>
	<20120916134730.Y957@besplex.bde.org> <5055ECA8.2080008@missouri.edu>
	<20120917022614.R2943@besplex.bde.org> <50562213.9020400@missouri.edu>
	<20120917060116.G3825@besplex.bde.org> <50563C57.60806@missouri.edu>
	<20120918012459.V5094@besplex.bde.org> <5057A932.3000603@missouri.edu>
	<5057F24B.7020605@missouri.edu> <20120918162105.U991@besplex.bde.org>
	<20120918232850.N2144@besplex.bde.org>
	<20120919010613.T2493@besplex.bde.org>
	<505BD9B4.8020801@missouri.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-numerics@FreeBSD.org
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
	<freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 21 Sep 2012 08:25:13 -0000

On Thu, 20 Sep 2012, Stephen Montgomery-Smith wrote:

> I also added inexact optimizations for casinh and cacos.

I couldn't get this to give the hoped-for optimization and dropped
it even for catanh.  You may still prefer it because it is simpler.

But later I found intricacies for returning the correct value of
Pi/2 which make the inexact optimizations even less useful:

The real functions are careful to a fault to return Pi/2 correctly
rounded in all rounding modes.  They don't return a constant Pi/2,
but evaluate Pi/2 at runtime using pio2_hi + pio2_lo, where pio2_hi
is (or should be) Pi/2 rounded _down_ and pio2_lo is an approximation
to the residual and is volatile enough for the addition to be done at
runtime.  The following shortcuts lose this care:

- similarly, but with pio2_hi = Pi/2 rounded up.  Now pio2_hi + pio2_lo
   is 1 ulp too high when the rounding mode is either up or towards
   plus infinity.  Rounding of Pi/2 to nearest may go either way.
   fdlibm code seems to be careful to round it down in all cases.  In
   FreeBSD libm, at least e_acosf.c is careful to round down when
   the natural rounding is up, but invtrig.c is not careful -- it
   apparently uses natural rounding, which happens to be up for ld80
   and down for ld128, or vice versa.
- similarly, but with pio2_hi rounded to nearest and 'tiny' used instead
   of pio2_lo.  Using 'tiny' requires pio2_hi to be nearest and only
   works in some rounding modes.
- similarly, but with just m_pi_2 = Pi/2 rounded to nearest.  Now there
   is no runtime evaluation, so the result cannot depend on the rounding
   mode and inexact must be set in some other way.

A quick test of most functions in all rounding modes shows that non-default
modes work quite well except for the most complicated and/or heavily
optimized functions when they are written in C (the totally failing ones
are sin/cos/tan/exp*/pow/hypot but not log* (except for log*(1)) or most
inverse functions.  Optimizations in sin/cos/tan/exp2 require rounding
to nearest).  My tests weren't non-quick enough to detect any 1-ulp
errors for Pi/2, and only showed that the errors mostly don't blow up for
inverse functions.

In view of this, I'd like to keep doing the Pi/2 intricacies.
Partial diffs for catrig.c:

% 48a49,50
% > #define	pio2_hi		m_pi_2		/* works because m_pi_2 rounded down */
% > pio2_lo =		6.1232339957367660e-17,	/* 0x3C91A626, 0x33145C07 */
% 384,389c390,407
% 335c341
% <  * cacos(z) = PI/2 - z + O(|z|^3)   as z -> 0
% ---
% >  * cacos(z) = PI/2 - z + O(z^3)   as z -> 0

This should be PI/2 + O(z) when only the constant term is used, but I
restored use of the z term.

Start changing O(|n|) to O(n).  The absolute value should be implicit.

% 384,389c390,407
% < ...
% < 	if (ax < DBL_EPSILON/8 && ay < SQRT_6_EPSILON/8)
% < 		return (cpack(m_pi_2, -y));
% ---
% > 	if (ax < SQRT_EPSILON && ay < SQRT_EPSILON)
% > 		/*
% > 		 * This is quite subtle.  The expression for PI/2 - x
% > 		 * is cloned from e_acos.c, where it is apparently over-
% > 		 * designed to work in all rounding modes.  It requires
% > 		 * pio2_hi to be rounded down even when rounding to
% > 		 * nearest would be more accurate.  We can't add `tiny'
% > 		 * to pio2_hi as usual to raise inexact, since this would
% > 		 * break the fussy rounding in some non-default modes.
% > 		 * So we use the same method to raise inexact as for the
% > 		 * approximation 'z'.  e_acos.c uses the even subtler
% > 		 * method of depending on inexactness in a higher-degree
% > 		 * approximation.  That is not practical here, since if
% > 		 * we used the x**3 term then we would need an extra
% > 		 * case to avoid spurious underflow.
% > 		 */
% > 		if ((int)ax == 0 && (int)ay == 0)
% > 			return (cpack(pio2_hi - (x - pio2_lo), -y));

Despite being too verbose (BTW, don't commit my essays :-), the comment
neglects to point out that with the expression written in this form,
inexact must be set separately since (x - pio2_lo) might be a value
(e.g., 0) that doesn't give inexactness when subtracted.
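
A standalone sketch of the real part of that return expression, in
double precision, may make the rounding argument easier to follow.
The function name cacos_small_re() is invented; the constants are the
e_acos.c values quoted in the diff, and making pio2_lo volatile is an
assumption of this sketch to force the arithmetic to happen at runtime.

```c
/* Pi/2 rounded down to double, and an approximation to the residual. */
static const double pio2_hi = 1.57079632679489655800e+00; /* 0x3FF921FB, 0x54442D18 */
static const volatile double pio2_lo = 6.1232339957367660e-17; /* 0x3C91A626, 0x33145C07 */

/*
 * Real part of the small-|z| approximation cacos(z) ~= PI/2 - z.
 * The grouping matters: x - pio2_lo is formed first, then subtracted
 * from pio2_hi, so the final rounding sees (approximately) the true
 * value PI/2 - x and follows the current rounding mode.
 *
 * This does not always raise inexact by itself: for x == pio2_lo the
 * inner difference is 0 and the outer subtraction is exact.
 */
double
cacos_small_re(double x)
{
	return (pio2_hi - (x - pio2_lo));
}
```

In the default round-to-nearest mode, cacos_small_re(0.0) is just
pio2_hi, since pio2_lo is less than half an ulp of pio2_hi.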

All other returns of Pi/2 are simpler than this, and should return
pio2_hi + pio2_lo.  The constants should be spelled like this, and
not using M_PI or m_pi_2; this is especially important in long double
precision since then the constants are declared/defined with this
spelling in extern constant tables in invtrig.c to centralize the
complications for defining them for all combinations of ld80/
ld128/i386.  So my patch for this is simplest for long double
precision -- there it uses invtrig.h and doesn't worry about the
known bug that pio2_hi is incorrectly rounded in some cases.

With these intricacies, there is less to be gained by setting inexact
up front.  Adding pio2_lo sequentially is slightly slower than an
up-front setting in parallel, but when both are done the up-front
setting just adds overhead on average.  Some of the optimizations
could be done more globally:
- an option to not support nonstandard rounding modes for Pi/2.  This
   seems to require pio2_lo to be a static const, unlike in invtrig.*.
   Make this non-volatile.  The compiler will then evaluate
   pio2_hi + pio2_lo at compile time.
- an option to not support careful setting of inexact.  The above
   gives it for Pi/2.  Settings of it using (1 + tiny) == 1 would
   work similarly -- make `tiny' a static nonvolatile const.
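
One way the first option could look, as a rough sketch only.  The
macro names PIO2_ALL_ROUNDING_MODES and PIO2_QUAL are invented for
this illustration and do not exist in libm; the constants are the
double precision values from e_acos.c.

```c
/*
 * Hypothetical build option (name invented for this sketch): define
 * PIO2_ALL_ROUNDING_MODES to keep the runtime addition.
 */
#ifdef PIO2_ALL_ROUNDING_MODES
#define	PIO2_QUAL	const volatile
#else
#define	PIO2_QUAL	const
#endif

static const double pio2_hi = 1.57079632679489655800e+00;
static PIO2_QUAL double pio2_lo = 6.1232339957367660e-17;

double
pio2(void)
{
	/*
	 * With a plain const pio2_lo the compiler is free to fold this
	 * sum into a single constant.  With volatile it must load
	 * pio2_lo and add at runtime, so the current rounding mode
	 * applies and inexact is raised by the addition.
	 */
	return (pio2_hi + pio2_lo);
}
```

Either way the result in round-to-nearest is the same double; the
option only changes whether other rounding modes and the inexact flag
are honoured.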

Bruce

From owner-freebsd-numerics@FreeBSD.ORG  Fri Sep 21 11:34:30 2012
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 320ED106564A
	for <freebsd-numerics@FreeBSD.org>;
	Fri, 21 Sep 2012 11:34:30 +0000 (UTC)
	(envelope-from brde@optusnet.com.au)
Received: from mail05.syd.optusnet.com.au (mail05.syd.optusnet.com.au
	[211.29.132.186])
	by mx1.freebsd.org (Postfix) with ESMTP id A67BF8FC0C
	for <freebsd-numerics@FreeBSD.org>;
	Fri, 21 Sep 2012 11:34:29 +0000 (UTC)
Received: from c122-106-157-84.carlnfd1.nsw.optusnet.com.au
	(c122-106-157-84.carlnfd1.nsw.optusnet.com.au [122.106.157.84])
	by mail05.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id
	q8LBYNp4011985
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Fri, 21 Sep 2012 21:34:26 +1000
Date: Fri, 21 Sep 2012 21:34:18 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Bruce Evans <brde@optusnet.com.au>
In-Reply-To: <20120921172402.W945@besplex.bde.org>
Message-ID: <20120921212525.W1732@besplex.bde.org>
References: <5017111E.6060003@missouri.edu> <5051F59C.6000603@missouri.edu>
	<20120914014208.I2862@besplex.bde.org> <50526050.2070303@missouri.edu>
	<20120914212403.H1983@besplex.bde.org> <50538E28.6050400@missouri.edu>
	<20120915231032.C2669@besplex.bde.org> <50548E15.3010405@missouri.edu>
	<5054C027.2040008@missouri.edu> <5054C200.7090307@missouri.edu>
	<20120916041132.D6344@besplex.bde.org> <50553424.2080902@missouri.edu>
	<20120916134730.Y957@besplex.bde.org> <5055ECA8.2080008@missouri.edu>
	<20120917022614.R2943@besplex.bde.org> <50562213.9020400@missouri.edu>
	<20120917060116.G3825@besplex.bde.org> <50563C57.60806@missouri.edu>
	<20120918012459.V5094@besplex.bde.org> <5057A932.3000603@missouri.edu>
	<5057F24B.7020605@missouri.edu> <20120918162105.U991@besplex.bde.org>
	<20120918232850.N2144@besplex.bde.org>
	<20120919010613.T2493@besplex.bde.org>
	<505BD9B4.8020801@missouri.edu> <20120921172402.W945@besplex.bde.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: Stephen Montgomery-Smith <stephen@missouri.edu>,
	freebsd-numerics@FreeBSD.org
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
	<freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 21 Sep 2012 11:34:30 -0000

On Fri, 21 Sep 2012, I wrote:

> On Thu, 20 Sep 2012, Stephen Montgomery-Smith wrote:
>
>> I also added inexact optimizations for casinh and cacos.
>
> I couldn't get this to give the hoped-for optimization and dropped
> it even for catanh.  You may still prefer it because it is simpler.

It is giving the hoped-for optimizations now...

> But later I found intricacies for returning the correct value of
> Pi/2 which make the inexact optimizations even less useful:

... for the real parts of cacosf(), casin*f(), but not for the real
part of cacoshf().  I tested mainly the latter and catanhf() before,
and the change is still giving a small pessimization for cacoshf().
(I haven't tested the new version for catanhf() yet, and won't test
in so much detail in other precisions).  I think this is because for
cacosh*() alone, inexact is set in more cases while calculating Pi/2.

Bruce

From owner-freebsd-numerics@FreeBSD.ORG  Fri Sep 21 14:07:14 2012
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 05923106564A
	for <freebsd-numerics@FreeBSD.org>;
	Fri, 21 Sep 2012 14:07:14 +0000 (UTC)
	(envelope-from stephen@missouri.edu)
Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu
	[128.206.184.213])
	by mx1.freebsd.org (Postfix) with ESMTP id B4A028FC0A
	for <freebsd-numerics@FreeBSD.org>;
	Fri, 21 Sep 2012 14:07:13 +0000 (UTC)
Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213])
	by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id
	q8LE7CbE036295; Fri, 21 Sep 2012 09:07:12 -0500 (CDT)
	(envelope-from stephen@missouri.edu)
Message-ID: <505C7490.90600@missouri.edu>
Date: Fri, 21 Sep 2012 09:07:12 -0500
From: Stephen Montgomery-Smith <stephen@missouri.edu>
User-Agent: Mozilla/5.0 (X11; Linux i686;
	rv:15.0) Gecko/20120827 Thunderbird/15.0
MIME-Version: 1.0
To: Bruce Evans <brde@optusnet.com.au>
References: <5017111E.6060003@missouri.edu> <5051F59C.6000603@missouri.edu>
	<20120914014208.I2862@besplex.bde.org>
	<50526050.2070303@missouri.edu>
	<20120914212403.H1983@besplex.bde.org>
	<50538E28.6050400@missouri.edu>
	<20120915231032.C2669@besplex.bde.org>
	<50548E15.3010405@missouri.edu> <5054C027.2040008@missouri.edu>
	<5054C200.7090307@missouri.edu>
	<20120916041132.D6344@besplex.bde.org>
	<50553424.2080902@missouri.edu>
	<20120916134730.Y957@besplex.bde.org>
	<5055ECA8.2080008@missouri.edu>
	<20120917022614.R2943@besplex.bde.org>
	<50562213.9020400@missouri.edu>
	<20120917060116.G3825@besplex.bde.org>
	<50563C57.60806@missouri.edu>
	<20120918012459.V5094@besplex.bde.org>
	<5057A932.3000603@missouri.edu> <5057F24B.7020605@missouri.edu>
	<20120918162105.U991@besplex.bde.org>
	<20120918232850.N2144@besplex.bde.org>
	<20120919010613.T2493@besplex.bde.org>
	<505BD9B4.8020801@missouri.edu>
	<20120921172402.W945@besplex.bde.org>
	<20120921212525.W1732@besplex.bde.org>
In-Reply-To: <20120921212525.W1732@besplex.bde.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-numerics@FreeBSD.org
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
	<freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 21 Sep 2012 14:07:14 -0000

I will keep the inexact raising up-front (mostly because I forgot how I 
did it earlier).

I will still use atanh(fl), and rely on someone else to fix it.  (If it 
is inexact near 0, it is only a few ULP, and that is good enough for me.)

I'll go ahead and see about the pio2h and pio2l.

From owner-freebsd-numerics@FreeBSD.ORG  Fri Sep 21 19:05:14 2012
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id C4E40106566B
	for <freebsd-numerics@FreeBSD.org>;
	Fri, 21 Sep 2012 19:05:14 +0000 (UTC)
	(envelope-from brde@optusnet.com.au)
Received: from mail03.syd.optusnet.com.au (mail03.syd.optusnet.com.au
	[211.29.132.184])
	by mx1.freebsd.org (Postfix) with ESMTP id 4B3068FC0C
	for <freebsd-numerics@FreeBSD.org>;
	Fri, 21 Sep 2012 19:05:14 +0000 (UTC)
Received: from c122-106-157-84.carlnfd1.nsw.optusnet.com.au
	(c122-106-157-84.carlnfd1.nsw.optusnet.com.au [122.106.157.84])
	by mail03.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id
	q8LJ5APD011591
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Sat, 22 Sep 2012 05:05:12 +1000
Date: Sat, 22 Sep 2012 05:05:10 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Stephen Montgomery-Smith <stephen@missouri.edu>
In-Reply-To: <505C7490.90600@missouri.edu>
Message-ID: <20120922042112.E3044@besplex.bde.org>
References: <5017111E.6060003@missouri.edu> <50526050.2070303@missouri.edu>
	<20120914212403.H1983@besplex.bde.org> <50538E28.6050400@missouri.edu>
	<20120915231032.C2669@besplex.bde.org> <50548E15.3010405@missouri.edu>
	<5054C027.2040008@missouri.edu> <5054C200.7090307@missouri.edu>
	<20120916041132.D6344@besplex.bde.org> <50553424.2080902@missouri.edu>
	<20120916134730.Y957@besplex.bde.org> <5055ECA8.2080008@missouri.edu>
	<20120917022614.R2943@besplex.bde.org> <50562213.9020400@missouri.edu>
	<20120917060116.G3825@besplex.bde.org> <50563C57.60806@missouri.edu>
	<20120918012459.V5094@besplex.bde.org> <5057A932.3000603@missouri.edu>
	<5057F24B.7020605@missouri.edu> <20120918162105.U991@besplex.bde.org>
	<20120918232850.N2144@besplex.bde.org>
	<20120919010613.T2493@besplex.bde.org>
	<505BD9B4.8020801@missouri.edu> <20120921172402.W945@besplex.bde.org>
	<20120921212525.W1732@besplex.bde.org> <505C7490.90600@missouri.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-numerics@FreeBSD.org
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
	<freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 21 Sep 2012 19:05:15 -0000

On Fri, 21 Sep 2012, Stephen Montgomery-Smith wrote:

> I will keep the inexact raising up-front (mostly because I forgot how I did 
> it earlier).

It is working very well.  I have minor cleanups to it using a 1-line
raise_inexact() macro.
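
The macro itself is not shown in this mail.  One plausible 1-line
form, based on the (1 + tiny) idiom mentioned earlier in the thread,
is sketched here; the caller small_x_path() is invented for
illustration.

```c
static const volatile float tiny = 0x1p-100f;

/*
 * 1 + tiny is not representable, so the addition rounds and raises
 * inexact.  volatile forces the load of tiny and the store of the
 * sum to survive optimization.
 */
#define	raise_inexact()	do { volatile float junk = 1 + tiny; (void)junk; } while (0)

/* Hypothetical caller: small-|x| path of a function with f(x) ~= x. */
float
small_x_path(float x)
{
	if (x != 0)
		raise_inexact();	/* returning x is inexact unless x == 0 */
	return (x);
}
```

Unlike the (int) cast, this form raises inexact unconditionally, so
the caller has to skip it in the cases where the result is exact.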

> I will still use atanh(fl), and rely on someone else to fix it.  (If it is 
> inexact near 0, it is only a few ULP, and that is good enough for me.)

I see you simplified the SQRT_[36]_EPSILON thresholds and fixed their
initialization by removing them (the function calls in the static
initializers didn't compile).  I spent more time on them and found the
best values in practice (take the sqrt's accurately and divide them by
2 or 4 instead of your 8), but they are painful to initialize, especially
for long doubles.  A few points of more general interest turned up while
debugging this,
- atan()'s series is alternating, while the others are not.  Alternation
   causes more cancelation errors.
- FOO_EPSILON only applies to 1 side of an addition.  E.g., it applies
   to 1.0 + x where x > 0, for 1.0 + x where x < 0 the size of the
   corresponding epsilon is half as much.  Non-alternation means that
   the FOO_EPSILON side applies.
- the general approximation in cacos(z) and casin(z) is quite good for
   small z, so larger thresholds for using the special approximations
   don't affect accuracy much.  However, the general approximation in
   catanh(z) is not so good for small z, so using a larger threshold for
   the special approximation affects accuracy significantly.  I found
   the approximate best point to switch the approximations.
- some combination of the previous 3 points means that the switching
   point is about twice as large relative to the SQRT_N_EPSILON threshold
   for catan() as for the others (divide by 2 instead of 4).
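
The second point in the list above (FOO_EPSILON only applying to one
side of an addition) can be checked directly.  A small sketch in
double precision, assuming round-to-nearest; the function name
perturbs_one() is invented:

```c
#include <float.h>

/*
 * Return 1 if adding x to 1.0 gives a result different from 1.0 in
 * double precision.  volatile forces the sum to be rounded to double
 * before the comparison.
 */
int
perturbs_one(double x)
{
	volatile double sum = 1.0 + x;

	return (sum != 1.0);
}
```

Here perturbs_one(DBL_EPSILON / 2) is 0 but perturbs_one(-DBL_EPSILON / 2)
is 1, because doubles just below 1.0 are spaced half as far apart as
doubles just above it.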

> I'll go ahead and see about the pio2h and pio2l.

Please wait for my patch for this.  It has all the details for all
precisions including pretty-printing the declarations.  Or you
can do a nearly-global substitution of m_pi_2 by pio2_hi + pio2_lo.

My other mostly-complete changes:
- avoid all the scalb() and related calls.  This makes do_hard_work()
   a bit faster and simpler and real_value_reciprocal() much faster and
   a bit more complex.
- make real_value_reciprocal() handle signs (everything is automatic
   except for x = inf), and avoid a copysign() after it
- a few improvements in comments

My other unfinished changes:
- figure out if the up-front things in catanh() are best placed there.
- decide whether to handle pure real and pure imaginary args specially
   like I do for both in catanhf().  This interacts with the previous
   point.
- decide whether my old change to remove unnecessary accuracy for the
   case where ax == 1, ay < FLT_EPSILON in catanh() is correct (you
   didn't accept it, and maybe other accuracy changes make its extra
   accuracy more interesting).

Bruce

From owner-freebsd-numerics@FreeBSD.ORG  Fri Sep 21 19:25:10 2012
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 24E111065672
	for <freebsd-numerics@FreeBSD.org>;
	Fri, 21 Sep 2012 19:25:10 +0000 (UTC)
	(envelope-from stephen@missouri.edu)
Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu
	[128.206.184.213])
	by mx1.freebsd.org (Postfix) with ESMTP id D29378FC0C
	for <freebsd-numerics@FreeBSD.org>;
	Fri, 21 Sep 2012 19:25:09 +0000 (UTC)
Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213])
	by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id
	q8LJP89D058321; Fri, 21 Sep 2012 14:25:08 -0500 (CDT)
	(envelope-from stephen@missouri.edu)
Message-ID: <505CBF14.70908@missouri.edu>
Date: Fri, 21 Sep 2012 14:25:08 -0500
From: Stephen Montgomery-Smith <stephen@missouri.edu>
User-Agent: Mozilla/5.0 (X11; Linux i686;
	rv:15.0) Gecko/20120827 Thunderbird/15.0
MIME-Version: 1.0
To: Bruce Evans <brde@optusnet.com.au>
References: <5017111E.6060003@missouri.edu> <50526050.2070303@missouri.edu>
	<20120914212403.H1983@besplex.bde.org>
	<50538E28.6050400@missouri.edu>
	<20120915231032.C2669@besplex.bde.org>
	<50548E15.3010405@missouri.edu> <5054C027.2040008@missouri.edu>
	<5054C200.7090307@missouri.edu>
	<20120916041132.D6344@besplex.bde.org>
	<50553424.2080902@missouri.edu>
	<20120916134730.Y957@besplex.bde.org>
	<5055ECA8.2080008@missouri.edu>
	<20120917022614.R2943@besplex.bde.org>
	<50562213.9020400@missouri.edu>
	<20120917060116.G3825@besplex.bde.org>
	<50563C57.60806@missouri.edu>
	<20120918012459.V5094@besplex.bde.org>
	<5057A932.3000603@missouri.edu> <5057F24B.7020605@missouri.edu>
	<20120918162105.U991@besplex.bde.org>
	<20120918232850.N2144@besplex.bde.org>
	<20120919010613.T2493@besplex.bde.org>
	<505BD9B4.8020801@missouri.edu>
	<20120921172402.W945@besplex.bde.org>
	<20120921212525.W1732@besplex.bde.org>
	<505C7490.90600@missouri.edu>
	<20120922042112.E3044@besplex.bde.org>
In-Reply-To: <20120922042112.E3044@besplex.bde.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-numerics@FreeBSD.org
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
	<freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 21 Sep 2012 19:25:10 -0000

On 09/21/2012 02:05 PM, Bruce Evans wrote:
> On Fri, 21 Sep 2012, Stephen Montgomery-Smith wrote:

> - decide whether my old change to remove unnecessary accuracy for the
>    case where ax == 1, ay < FLT_EPSILON in catanh() is correct (you
>    didn't accept it, and maybe other accuracy changes make its extra
>    accuracy more interesting).

Or maybe I missed it.


I did put the pio2_hi etc stuff in before this email telling me to 
hold off.

I assume you still want pio2_hi etc stuff in catanh.  There it is still 
m_pi_2.  I was thinking
pio2_hi + (tiny + pio2_lo)
or maybe declaring pio2_lo as volatile and using
pio2_hi + pio2_lo

This last week I was very busy and I had to put this project off a 
while.  But now I think things are slowing down again.


From owner-freebsd-numerics@FreeBSD.ORG  Fri Sep 21 19:33:48 2012
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 4FD78106564A
	for <freebsd-numerics@freebsd.org>;
	Fri, 21 Sep 2012 19:33:48 +0000 (UTC)
	(envelope-from stephen@missouri.edu)
Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu
	[128.206.184.213])
	by mx1.freebsd.org (Postfix) with ESMTP id 0977A8FC14
	for <freebsd-numerics@freebsd.org>;
	Fri, 21 Sep 2012 19:33:47 +0000 (UTC)
Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213])
	by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id
	q8LJXkMG058878 for <freebsd-numerics@freebsd.org>;
	Fri, 21 Sep 2012 14:33:47 -0500 (CDT)
	(envelope-from stephen@missouri.edu)
Message-ID: <505CC11A.5030502@missouri.edu>
Date: Fri, 21 Sep 2012 14:33:46 -0500
From: Stephen Montgomery-Smith <stephen@missouri.edu>
User-Agent: Mozilla/5.0 (X11; Linux i686;
	rv:15.0) Gecko/20120827 Thunderbird/15.0
MIME-Version: 1.0
To: freebsd-numerics@freebsd.org
References: <5017111E.6060003@missouri.edu> <50526050.2070303@missouri.edu>
	<20120914212403.H1983@besplex.bde.org>
	<50538E28.6050400@missouri.edu>
	<20120915231032.C2669@besplex.bde.org>
	<50548E15.3010405@missouri.edu> <5054C027.2040008@missouri.edu>
	<5054C200.7090307@missouri.edu>
	<20120916041132.D6344@besplex.bde.org>
	<50553424.2080902@missouri.edu>
	<20120916134730.Y957@besplex.bde.org>
	<5055ECA8.2080008@missouri.edu>
	<20120917022614.R2943@besplex.bde.org>
	<50562213.9020400@missouri.edu>
	<20120917060116.G3825@besplex.bde.org>
	<50563C57.60806@missouri.edu>
	<20120918012459.V5094@besplex.bde.org>
	<5057A932.3000603@missouri.edu> <5057F24B.7020605@missouri.edu>
	<20120918162105.U991@besplex.bde.org>
	<20120918232850.N2144@besplex.bde.org>
	<20120919010613.T2493@besplex.bde.org>
	<505BD9B4.8020801@missouri.edu>
	<20120921172402.W945@besplex.bde.org>
	<20120921212525.W1732@besplex.bde.org>
	<505C7490.90600@missouri.edu>
	<20120922042112.E3044@besplex.bde.org>
	<505CBF14.70908@missouri.edu>
In-Reply-To: <505CBF14.70908@missouri.edu>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
	<freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 21 Sep 2012 19:33:48 -0000

On 09/21/2012 02:25 PM, Stephen Montgomery-Smith wrote:
> On 09/21/2012 02:05 PM, Bruce Evans wrote:
>> On Fri, 21 Sep 2012, Stephen Montgomery-Smith wrote:
>
>> - decide whether my old change to remove unnecessary accuracy for the
>>    case where ax == 1, ay < FLT_EPSILON in catanh() is correct (you
>>    didn't accept it, and maybe other accuracy changes make its extra
>>    accuracy more interesting).
>
> Or maybe I missed it.

When you send me changes to catrigf.c, I translate it to catrig.c (the 
double version), and then convert it back to catrigf.c.  So sometimes I 
miss things.


From owner-freebsd-numerics@FreeBSD.ORG  Fri Sep 21 22:16:03 2012
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id E55CF106566C
	for <freebsd-numerics@FreeBSD.org>;
	Fri, 21 Sep 2012 22:16:03 +0000 (UTC)
	(envelope-from brde@optusnet.com.au)
Received: from mail09.syd.optusnet.com.au (mail09.syd.optusnet.com.au
	[211.29.132.190])
	by mx1.freebsd.org (Postfix) with ESMTP id 5AD2A8FC0C
	for <freebsd-numerics@FreeBSD.org>;
	Fri, 21 Sep 2012 22:16:02 +0000 (UTC)
Received: from c122-106-157-84.carlnfd1.nsw.optusnet.com.au
	(c122-106-157-84.carlnfd1.nsw.optusnet.com.au [122.106.157.84])
	by mail09.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id
	q8LMFrJ7010417
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Sat, 22 Sep 2012 08:15:55 +1000
Date: Sat, 22 Sep 2012 08:15:53 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Stephen Montgomery-Smith <stephen@missouri.edu>
In-Reply-To: <505CBF14.70908@missouri.edu>
Message-ID: <20120922080942.U3613@besplex.bde.org>
References: <5017111E.6060003@missouri.edu>
	<20120914212403.H1983@besplex.bde.org>
	<50538E28.6050400@missouri.edu> <20120915231032.C2669@besplex.bde.org>
	<50548E15.3010405@missouri.edu> <5054C027.2040008@missouri.edu>
	<5054C200.7090307@missouri.edu> <20120916041132.D6344@besplex.bde.org>
	<50553424.2080902@missouri.edu> <20120916134730.Y957@besplex.bde.org>
	<5055ECA8.2080008@missouri.edu> <20120917022614.R2943@besplex.bde.org>
	<50562213.9020400@missouri.edu> <20120917060116.G3825@besplex.bde.org>
	<50563C57.60806@missouri.edu> <20120918012459.V5094@besplex.bde.org>
	<5057A932.3000603@missouri.edu> <5057F24B.7020605@missouri.edu>
	<20120918162105.U991@besplex.bde.org>
	<20120918232850.N2144@besplex.bde.org>
	<20120919010613.T2493@besplex.bde.org> <505BD9B4.8020801@missouri.edu>
	<20120921172402.W945@besplex.bde.org>
	<20120921212525.W1732@besplex.bde.org>
	<505C7490.90600@missouri.edu> <20120922042112.E3044@besplex.bde.org>
	<505CBF14.70908@missouri.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-numerics@FreeBSD.org
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
	<freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 21 Sep 2012 22:16:04 -0000

On Fri, 21 Sep 2012, Stephen Montgomery-Smith wrote:

> I did put in the pio2_hi etc stuff in before this email telling me to hold 
> off.
>
> I assume you still want pio2_hi etc stuff in catanh.  There it is still 
> m_pi_2.  I was thinking
> pio2_hi + (tiny + pio2_lo)
> or maybe declaring pio2_lo as volatile and using
> pio2_hi + pio2_lo

I have the latter.

m_pi_2 (better pio2) could be #defined as (pio2_hi + pio2_lo), but I
want to avoid this obfuscation.

Bruce

From owner-freebsd-numerics@FreeBSD.ORG  Fri Sep 21 23:14:08 2012
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 396BF106566B
	for <freebsd-numerics@FreeBSD.org>;
	Fri, 21 Sep 2012 23:14:08 +0000 (UTC)
	(envelope-from brde@optusnet.com.au)
Received: from mail28.syd.optusnet.com.au (mail28.syd.optusnet.com.au
	[211.29.133.169])
	by mx1.freebsd.org (Postfix) with ESMTP id 278C38FC08
	for <freebsd-numerics@FreeBSD.org>;
	Fri, 21 Sep 2012 23:14:06 +0000 (UTC)
Received: from c122-106-157-84.carlnfd1.nsw.optusnet.com.au
	(c122-106-157-84.carlnfd1.nsw.optusnet.com.au [122.106.157.84])
	by mail28.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id
	q8LNDuYQ025170
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Sat, 22 Sep 2012 09:13:57 +1000
Date: Sat, 22 Sep 2012 09:13:56 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Stephen Montgomery-Smith <stephen@missouri.edu>
In-Reply-To: <505CC11A.5030502@missouri.edu>
Message-ID: <20120922081607.F3613@besplex.bde.org>
References: <5017111E.6060003@missouri.edu> <50538E28.6050400@missouri.edu>
	<20120915231032.C2669@besplex.bde.org> <50548E15.3010405@missouri.edu>
	<5054C027.2040008@missouri.edu> <5054C200.7090307@missouri.edu>
	<20120916041132.D6344@besplex.bde.org> <50553424.2080902@missouri.edu>
	<20120916134730.Y957@besplex.bde.org> <5055ECA8.2080008@missouri.edu>
	<20120917022614.R2943@besplex.bde.org> <50562213.9020400@missouri.edu>
	<20120917060116.G3825@besplex.bde.org> <50563C57.60806@missouri.edu>
	<20120918012459.V5094@besplex.bde.org> <5057A932.3000603@missouri.edu>
	<5057F24B.7020605@missouri.edu> <20120918162105.U991@besplex.bde.org>
	<20120918232850.N2144@besplex.bde.org>
	<20120919010613.T2493@besplex.bde.org>
	<505BD9B4.8020801@missouri.edu> <20120921172402.W945@besplex.bde.org>
	<20120921212525.W1732@besplex.bde.org> <505C7490.90600@missouri.edu>
	<20120922042112.E3044@besplex.bde.org> <505CBF14.70908@missouri.edu>
	<505CC11A.5030502@missouri.edu>
MIME-Version: 1.0
Content-Type: MULTIPART/MIXED; BOUNDARY="0-46617504-1348269236=:3613"
Cc: freebsd-numerics@FreeBSD.org
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
	<freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 21 Sep 2012 23:14:08 -0000

  This message is in MIME format.  The first part should be readable text,
  while the remaining parts are likely unreadable without MIME-aware tools.

--0-46617504-1348269236=:3613
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed

On Fri, 21 Sep 2012, Stephen Montgomery-Smith wrote:

> On 09/21/2012 02:25 PM, Stephen Montgomery-Smith wrote:
>> On 09/21/2012 02:05 PM, Bruce Evans wrote:
>>> On Fri, 21 Sep 2012, Stephen Montgomery-Smith wrote:
>> 
>>> - decide whether my old change to remove unnecessary accuracy for the
>>>    case where ax == 1, ay < FLT_EPSILON in catanh() is correct (you
>>>    didn't accept it, and maybe other accuracy changes make its extra
>>>    accuracy more interesting).
>> 
>> Or maybe I missed it.
>
> When you send me changes to catrigf.c, I translate it to catrig.c (the double 
> version), and then convert it back to catrigf.c.  So sometimes I miss things.

This time I merged everything into catrig.c and even ran the conversion
scripts to check this.  I keep forgetting to add or remove f suffixes
when manually converting.

Patches tomorrow.  Well, the main new one now, for all 3 files since
part of it has lots of magic numbers which are not handled by the
conversion scripts.

% diff -u2 catrig.c~ catrig.c
% --- catrig.c~	2012-09-21 15:51:00.000000000 +0000
% +++ catrig.c	2012-09-21 21:40:34.521926000 +0000
% @@ -206,6 +217,6 @@
%  		 */
%  		*B_is_usable = 0;
% -		*sqrt_A2my2 = scalbn(A, DBL_MANT_DIG);
% -		*new_y = scalbn(y, DBL_MANT_DIG);
% +		*sqrt_A2my2 = A * (2 / DBL_EPSILON);
% +		*new_y= y * (2 / DBL_EPSILON);
%  		return;
%  	}
% @@ -244,6 +255,7 @@
%  			 * scaling should avoid any underflow problems.
%  			 */
% -			*sqrt_A2my2 = scalbn(x, 2*DBL_MANT_DIG) * y / sqrt((y+1)*(y-1));
% -			*new_y = scalbn(y, 2*DBL_MANT_DIG);
% +			*sqrt_A2my2 = x * (4/DBL_EPSILON/DBL_EPSILON) * y /
% +			    sqrt((y+1)*(y-1));
% +			*new_y = y * (4/DBL_EPSILON/DBL_EPSILON);
%  		} else /* if (y < 1) */ {
%  			/*

It's easy to eliminate these scalbn()s, since the values are constant.

scalbn() is a builtin in gcc-4.2 but not in gcc-3.3, and in 4.2 the
builtin just calls the extern function.  Here the constant values
could be calculated at compile time, but gcc doesn't do this.  I
think clang does.

The conversion script handles this fine.
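
The replacement works because, for a binary format, DBL_EPSILON is
2**(1 - DBL_MANT_DIG), so 2/DBL_EPSILON is exactly 2**DBL_MANT_DIG and
4/DBL_EPSILON/DBL_EPSILON is exactly 2**(2*DBL_MANT_DIG).  A minimal
sketch of that identity follows.  The helper names two_to_the(),
scale_up_once() and scale_up_twice() are made up for illustration and
are not in catrig.c:

```c
#include <assert.h>
#include <float.h>

/* Hypothetical helper: 2**n for small n >= 0 by exact repeated doubling. */
static double
two_to_the(int n)
{
	double r = 1;

	assert(n >= 0);
	while (n-- > 0)
		r *= 2;
	return (r);
}

/* Same scaling as scalbn(v, DBL_MANT_DIG), using a constant multiplier. */
static double
scale_up_once(double v)
{
	return (v * (2 / DBL_EPSILON));
}

/* Same scaling as scalbn(v, 2*DBL_MANT_DIG), using a constant multiplier. */
static double
scale_up_twice(double v)
{
	return (v * (4 / DBL_EPSILON / DBL_EPSILON));
}
```

Both multipliers are constant expressions, so a compiler is free to
fold them, which is not the case for a call to an extern scalbn().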

% @@ -501,29 +519,40 @@
%  /*
%   * real_part_reciprocal(x, y) = Re(1/(x+I*y)) = x/(x*x + y*y).
% - * Assumes x and y are positive or zero, and one of x and y is larger than
% + * Assumes x and y are not NaN, and one of x and y is larger than
%   * RECIP_EPSILON.  We avoid unwarranted underflow.  It is important to not use

The old version was passed positive x and y, but didn't depend on this.
The caller then had to fix up the sign.

This version is passed x and y with their original signs.  The sign is
handled automatically by expressions in the function, and the caller
doesn't fix it up.

%   * the code creal(1/z), because the imaginary part may produce an unwanted
%   * underflow.
% + * This is only called in a context where inexact is always raised before
% + * the call, so no effort is made to avoid or force inexact.
%   */
%  inline static double
%  real_part_reciprocal(double x, double y)
%  {
% +	double scale;
% +	uint32_t hx, hy;
% +	int32_t ix, iy;
% +
%  	/*
%  	 * This code is inspired by the C99 document n1124.pdf, Section G.5.1,
%  	 * example 2.
%  	 */
% -	int ex, ey;
% -
% -	if (isinf(x) || isinf(y))
% -		return (0);
% -	if (y == 0) return (1/x);
% -	if (x == 0) return (x/y/y);
% -	ex = ilogb(x);
% -	ey = ilogb(y);
% -	if (ex - ey >= DBL_MANT_DIG) return (1/x);
% -	if (ey - ex >= DBL_MANT_DIG) return (x/y/y);
% -	x = scalbn(x, -ex);
% -	y = scalbn(y, -ex);
% -	return scalbn(x/(x*x + y*y), -ex);

The conversion to not use scalbn() is fairly direct and routine, but also
fairly magic.

% +	GET_HIGH_WORD(hx, x);
% +	ix = hx & 0x7ff00000;
% +	GET_HIGH_WORD(hy, y);
% +	iy = hy & 0x7ff00000;

ilogb() is a builtin to much the same extent as scalbn() IIRC -- mostly
it isn't.

By working with the raw exponent, we avoid complications from the following
design bugs in ilogb():
- ilogb(0) returns FP_ILOGB0, so the above needs special cases for x == 0
   and y == 0
- ilogb(+-Inf) returns INT_MAX, so the above needs to handle infs earlier
   than is optimal.
With the raw exponents, you can just subtract them and most things work.
Denormals cause problems with this subtraction in some contexts, and
ilogb() has to do a lot of work to find their exponent, and scalbn() has to
do a lot of work to shift their mantissa (compared with just adding to
the exponent for a normal).  Here we handle them fairly subtly without
any extra code:
     when one arg is denormal, the other arg's absolute value exceeds
     RECIP_EPSILON, so there is a large exponent difference, and the
     special cases for large exponent differences handle this case
     automatically.
The case of y infinite but x finite is handled similarly.
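
As a rough illustration of the raw-exponent approach, here is a
portable sketch that uses memcpy() instead of libm's GET_HIGH_WORD().
The names raw_exponent_bits() and exponent_gap() are invented for this
example, and it assumes IEEE 754 binary64 doubles:

```c
#include <assert.h>
#include <float.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: the biased exponent field of d, left in place in
 * the high word (bits 20..30), with the sign and mantissa masked off.
 */
static int32_t
raw_exponent_bits(double d)
{
	uint64_t bits;

	assert(sizeof(bits) == sizeof(d));
	memcpy(&bits, &d, sizeof(bits));
	return ((int32_t)((bits >> 32) & 0x7ff00000));
}

/*
 * Hypothetical helper: difference of the biased exponents of x and y.
 * The low 20 bits of each field are zero, so the division is exact.
 */
static int
exponent_gap(double x, double y)
{
	return ((raw_exponent_bits(x) - raw_exponent_bits(y)) / (1 << 20));
}
```

Zeros and denormals both give a field of 0 and infinities give
0x7ff00000, so a zero, denormal or infinite arg paired with an ordinary
one just shows up as a very large gap, with no special values like
FP_ILOGB0 or INT_MAX to filter out first.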

% +#define	BIAS	(DBL_MAX_EXP - 1)
% +/* XXX more guard digits are useful iff there is extra precision. */

Without extra precision, a cutoff with fewer guard digits somehow
gives better accuracy than one with more.  (The old cutoffs in
terms of exponent bits give ~DBL_MANT_DIG/2 active bits and
~DBL_MANT_DIG/2 guard bits.)

% +#define	CUTOFF	(DBL_MANT_DIG / 2 + 1)	/* just half or 1 guard digit */
% +	if (ix - iy >= CUTOFF << 20 || isinf(x))
% +		return (1/x);		/* +-Inf -> +-0 is special */

Constants are shifted to avoid shifting the exponent bits in ix and iy
back and forth.

The special cases for infinities have been reduced to this one here.
The sign used to be handled by copysign(0, x) when x is +-Inf.  Now
the common 1/x return is used.

% +	if (iy - ix >= CUTOFF << 20)
% +		return (x/y/y);		/* should avoid double div, but hard */
% +	if (ix <= (BIAS + DBL_MAX_EXP / 2 - CUTOFF) << 20)
% +		return (x/(x*x + y*y));
% +	scale = 0;
% +	SET_HIGH_WORD(scale, 0x7ff00000 - ix);	/* 2**(1-ilogb(x)) */
% +	x *= scale;
% +	y *= scale;
% +	return (x/(x*x + y*y) * scale);
%  }
% 
% @@ -577,13 +606,22 @@
% 
%  	if (ax > RECIP_EPSILON || ay > RECIP_EPSILON)
% -		return (cpack(copysign(real_part_reciprocal(ax, ay), x), copysign(m_pi_2, y)));
% +		return (cpack(real_part_reciprocal(x, y), copysign(pio2_hi + pio2_lo, y)));

Handle the sign in the function.
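
For readers without the libm private headers, the new function can be
approximated by the standalone sketch below.  It is a transcription
under assumptions, not the committed code: get_high_word() and
from_high_word() are hypothetical memcpy()-based stand-ins for
GET_HIGH_WORD() and SET_HIGH_WORD(), the function is renamed
real_part_reciprocal_sketch(), and IEEE 754 binary64 doubles are
assumed.  As in the patch, NaN args are not handled.

```c
#include <assert.h>
#include <float.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for GET_HIGH_WORD(): top 32 bits of d. */
static uint32_t
get_high_word(double d)
{
	uint64_t bits;

	assert(sizeof(bits) == sizeof(d));
	memcpy(&bits, &d, sizeof(bits));
	return ((uint32_t)(bits >> 32));
}

/* Hypothetical stand-in for SET_HIGH_WORD() on a zeroed double. */
static double
from_high_word(uint32_t hi)
{
	uint64_t bits = (uint64_t)hi << 32;
	double d;

	memcpy(&d, &bits, sizeof(d));
	return (d);
}

#define	BIAS	(DBL_MAX_EXP - 1)
#define	CUTOFF	(DBL_MANT_DIG / 2 + 1)

/* Re(1/(x+I*y)) = x/(x*x + y*y), with x and y keeping their signs. */
static double
real_part_reciprocal_sketch(double x, double y)
{
	double scale;
	int32_t ix, iy;

	ix = get_high_word(x) & 0x7ff00000;
	iy = get_high_word(y) & 0x7ff00000;
	if (ix - iy >= (CUTOFF << 20) || isinf(x))
		return (1 / x);		/* y is negligible; sign from x */
	if (iy - ix >= (CUTOFF << 20))
		return (x / y / y);	/* x*x is negligible; sign from x */
	if (ix <= ((BIAS + DBL_MAX_EXP / 2 - CUTOFF) << 20))
		return (x / (x * x + y * y));	/* x*x + y*y can't overflow */
	scale = from_high_word(0x7ff00000 - ix);	/* 2**(1-ilogb(x)) */
	x *= scale;
	y *= scale;
	return (x / (x * x + y * y) * scale);
}
```

Every return path divides x (or 1) by something that is either positive
or x itself, so the result carries the sign of x and the caller needs no
copysign() on the real part.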

Unrelated details that I couldn't edit out without breaking the patch
hunk:

% 
% -	if (ax < SQRT_EPSILON && ay < SQRT_EPSILON)
% +	if (ax < SQRT_3_EPSILON/2 && ay < SQRT_3_EPSILON/2) {
% +		/*
% +		 * z = 0 was filtered out above.  All other cases must raise
% +		 * inexact, but this is the only one that needs to do it
% +		 * explicitly.
% +		 */
% +		raise_inexact();
%  		return (z);
% +	}

It is an optimization to only raise inexact here.  The early return for
y == 0 && ax <= 1 worked well, but not the early raising of inexact.
I also reduce the x == 0 case to atanf() early.  That leaves no z == 0
case here, so we raise inexact unconditionally.

% 
%  	if (ax == 1 && ay < DBL_EPSILON) {
% +#if 0 /* this only improves accuracy in an already relatively accurate case */
%  		if (ay > 2*DBL_MIN)
%  			rx = - log(ay/2) / 2;
%  		else
% +#endif

This was the change that you might have missed.

%  			rx = - (log(ay) - m_ln2) / 2;
%  	} else

Everything for the other files is routine except for the magic numbers
in real_part_reciprocal related to the packing of the bits.  I prefer
to leave those as magic.  A full macroization of them would have to
macroize the accesses GET_HIGH_WORD() etc.

% diff -u2 catrigf.c~ catrigf.c
% --- catrigf.c~	2012-09-21 15:51:16.000000000 +0000
% +++ catrigf.c	2012-09-21 21:34:41.140231000 +0000
% @@ -108,6 +109,6 @@
%  	if (y < FOUR_SQRT_MIN) {
%  		*B_is_usable = 0;
% -		*sqrt_A2my2 = scalbnf(A, FLT_MANT_DIG);
% -		*new_y = scalbnf(y, FLT_MANT_DIG);
% +		*sqrt_A2my2 = A * (2 / FLT_EPSILON);
% +		*new_y= y * (2 / FLT_EPSILON);
%  		return;
%  	}
% @@ -124,6 +125,7 @@
%  			*sqrt_A2my2 = sqrtf(Amy*(A+y));
%  		} else if (y > 1) {
% -			*sqrt_A2my2 = scalbnf(x, 2*FLT_MANT_DIG) * y / sqrtf((y+1)*(y-1));
% -			*new_y = scalbnf(y, 2*FLT_MANT_DIG);
% +			*sqrt_A2my2 = x * (4/FLT_EPSILON/FLT_EPSILON) * y /
% +			    sqrtf((y+1)*(y-1));
% +			*new_y = y * (4/FLT_EPSILON/FLT_EPSILON);
%  		} else {
%  			*sqrt_A2my2 = sqrtf((1-y)*(1+y));
% @@ -293,17 +299,24 @@
%  real_part_reciprocal(float x, float y)
%  {
% -	int ex, ey;
% -
% -	if (isinf(x) || isinf(y))
% -		return (0);
% -	if (y == 0) return (1/x);
% -	if (x == 0) return (x/y/y);
% -	ex = ilogbf(x);
% -	ey = ilogbf(y);
% -	if (ex - ey >= FLT_MANT_DIG) return (1/x);
% -	if (ey - ex >= FLT_MANT_DIG) return (x/y/y);
% -	x = scalbnf(x, -ex);
% -	y = scalbnf(y, -ex);
% -	return scalbnf(x/(x*x + y*y), -ex);
% +	float scale;
% +	uint32_t hx, hy;
% +	int32_t ix, iy;
% +
% +	GET_FLOAT_WORD(hx, x);
% +	ix = hx & 0x7f800000;
% +	GET_FLOAT_WORD(hy, y);
% +	iy = hy & 0x7f800000;
% +#define	BIAS	(FLT_MAX_EXP - 1)
% +#define	CUTOFF	(FLT_MANT_DIG / 2 + 1)
% +	if (ix - iy >= CUTOFF << 23 || isinf(x))
% +		return (1/x);
% +	if (iy - ix >= CUTOFF << 23)
% +		return (x/y/y);
% +	if (ix <= (BIAS + FLT_MAX_EXP / 2 - CUTOFF) << 23)
% +		return (x/(x*x + y*y));
% +	SET_FLOAT_WORD(scale, 0x7f800000 - ix);
% +	x *= scale;
% +	y *= scale;
% +	return (x/(x*x + y*y) * scale);
%  }
% 
% @@ -335,13 +348,17 @@
% 
%  	if (ax > RECIP_EPSILON || ay > RECIP_EPSILON)
% -		return (cpackf(copysignf(real_part_reciprocal(ax, ay), x), copysignf(m_pi_2, y)));
% +		return (cpackf(real_part_reciprocal(x, y), copysignf(pio2_hi + pio2_lo, y)));
% 
% -	if (ax < SQRT_EPSILON && ay < SQRT_EPSILON)
% +	if (ax < SQRT_3_EPSILON/2 && ay < SQRT_3_EPSILON/2) {
% +		raise_inexact();
%  		return (z);
% +	}
% 
%  	if (ax == 1 && ay < FLT_EPSILON) {
% +#if 0
%  		if (ay > 2*FLT_MIN)
%  			rx = - logf(ay/2) / 2;
%  		else
% +#endif
%  			rx = - (logf(ay) - m_ln2) / 2;
%  	} else
% diff -u2 catrigl.c~ catrigl.c
% --- catrigl.c~	2012-09-21 16:22:40.000000000 +0000
% +++ catrigl.c	2012-09-21 21:17:46.962698000 +0000
% @@ -122,6 +124,6 @@
%  	if (y < FOUR_SQRT_MIN) {
%  		*B_is_usable = 0;
% -		*sqrt_A2my2 = scalbnl(A, LDBL_MANT_DIG);
% -		*new_y = scalbnl(y, LDBL_MANT_DIG);
% +		*sqrt_A2my2 = A * (2 / LDBL_EPSILON);
% +		*new_y= y * (2 / LDBL_EPSILON);
%  		return;
%  	}
% @@ -138,6 +140,7 @@
%  			*sqrt_A2my2 = sqrtl(Amy*(A+y));
%  		} else if (y > 1) {
% -			*sqrt_A2my2 = scalbnl(x, 2*LDBL_MANT_DIG) * y / sqrtl((y+1)*(y-1));
% -			*new_y = scalbnl(y, 2*LDBL_MANT_DIG);
% +			*sqrt_A2my2 = x * (4/LDBL_EPSILON/LDBL_EPSILON) * y /
% +			    sqrtl((y+1)*(y-1));
% +			*new_y = y * (4/LDBL_EPSILON/LDBL_EPSILON);
%  		} else {
%  			*sqrt_A2my2 = sqrtl((1-y)*(1+y));
% @@ -307,17 +314,24 @@
%  real_part_reciprocal(long double x, long double y)
%  {
% -	int ex, ey;
% -
% -	if (isinf(x) || isinf(y))
% -		return (0);
% -	if (y == 0) return (1/x);
% -	if (x == 0) return (x/y/y);
% -	ex = ilogbl(x);
% -	ey = ilogbl(y);
% -	if (ex - ey >= LDBL_MANT_DIG) return (1/x);
% -	if (ey - ex >= LDBL_MANT_DIG) return (x/y/y);
% -	x = scalbnl(x, -ex);
% -	y = scalbnl(y, -ex);
% -	return scalbnl(x/(x*x + y*y), -ex);
% +	long double scale;
% +	uint16_t hx, hy;
% +	int16_t ix, iy;
% +
% +	GET_LDBL_EXPSIGN(hx, x);
% +	ix = hx & 0x7fff;
% +	GET_LDBL_EXPSIGN(hy, y);
% +	iy = hy & 0x7fff;
% +#define	BIAS	(LDBL_MAX_EXP - 1)
% +#define	CUTOFF	(LDBL_MANT_DIG / 2 + 1)
% +	if (ix - iy >= CUTOFF || isinf(x))
% +		return (1/x);
% +	if (iy - ix >= CUTOFF)
% +		return (x/y/y);
% +	if (ix <= BIAS + LDBL_MAX_EXP / 2 - CUTOFF)
% +		return (x/(x*x + y*y));
% +	SET_LDBL_EXPSIGN(scale, 0x7fff - ix);
% +	x *= scale;
% +	y *= scale;
% +	return (x/(x*x + y*y) * scale);
%  }
% 
% @@ -349,13 +363,17 @@
% 
%  	if (ax > RECIP_EPSILON || ay > RECIP_EPSILON)
% -		return (cpackl(copysignl(real_part_reciprocal(ax, ay), x), copysignl(m_pi_2, y)));
% +		return (cpackl(real_part_reciprocal(x, y), copysignl(pio2_hi + pio2_lo, y)));
% 
% -	if (ax < SQRT_EPSILON && ay < SQRT_EPSILON)
% +	if (ax < SQRT_3_EPSILON/2 && ay < SQRT_3_EPSILON/2) {
% +		raise_inexact();
%  		return (z);
% +	}
% 
%  	if (ax == 1 && ay < LDBL_EPSILON) {
% +#if 0
%  		if (ay > 2*LDBL_MIN)
%  			rx = - logl(ay/2) / 2;
%  		else
% +#endif
%  			rx = - (logl(ay) - m_ln2) / 2;
%  	} else

The patch is also attached.

Bruce
--0-46617504-1348269236=:3613
Content-Type: TEXT/PLAIN; charset=US-ASCII; name="catrig.diff"
Content-Transfer-Encoding: BASE64
Content-ID: <20120922091356.U3613@besplex.bde.org>
Content-Description: 
Content-Disposition: attachment; filename="catrig.diff"

ZGlmZiAtdTIgY2F0cmlnLmN+IGNhdHJpZy5jDQotLS0gY2F0cmlnLmN+CTIw
MTItMDktMjEgMTU6NTE6MDAuMDAwMDAwMDAwICswMDAwDQorKysgY2F0cmln
LmMJMjAxMi0wOS0yMSAyMTo0MDozNC41MjE5MjYwMDAgKzAwMDANCkBAIC0z
NSw0ICszNSw1IEBADQogI3VuZGVmIGlzbmFuDQogI2RlZmluZSBpc25hbih4
KQkoKHgpICE9ICh4KSkNCisjZGVmaW5lCXJhaXNlX2luZXhhY3QoKQlkbyB7
IHZvbGF0aWxlIGludCBqdW5rID0gMSArIHRpbnk7IH0gd2hpbGUoMCkNCiAj
dW5kZWYgc2lnbmJpdA0KICNkZWZpbmUgc2lnbmJpdCh4KQkoX19idWlsdGlu
X3NpZ25iaXQoeCkpDQpAQCAtNDYsMTIgKzQ3LDIyIEBADQogbV9lID0JCQky
LjcxODI4MTgyODQ1OTA0NTJlMCwJLyogIDB4MTViZjBhOGIxNDU3NjkuMHAt
NTEgKi8NCiBtX2xuMiA9CQkJNi45MzE0NzE4MDU1OTk0NTMxZS0xLAkvKiAg
MHgxNjJlNDJmZWZhMzllZi4wcC01MyAqLw0KLW1fcGlfMiA9CQkxLjU3MDc5
NjMyNjc5NDg5NjZlMCwJLyogIDB4MTkyMWZiNTQ0NDJkMTguMHAtNTIgKi8N
Ci1waW8yX2hpID0JCTEuNTcwNzk2MzI2Nzk0ODk2NTU4MDBlKzAwLAkvKiAw
eDNGRjkyMUZCLCAweDU0NDQyRDE4ICovDQotcGlvMl9sbyA9CQk2LjEyMzIz
Mzk5NTczNjc2NjAzNTg3ZS0xNywJLyogMHgzQzkxQTYyNiwgMHgzMzE0NUMw
NyAqLw0KKy8qDQorICogV2Ugbm8gbG9uZ2VyIHVzZSBNX1BJXzIgb3IgbV9w
aV8yLiAgSW4gZmxvYXQgcHJlY2lzaW9uLCByb3VuZGluZyB0bw0KKyAqIG5l
YXJlc3Qgb2YgUEkvMiBoYXBwZW5zIHRvIHJvdW5kIHVwLCBidXQgd2Ugd2Fu
dCByb3VuZGluZyBkb3duIHNvDQorICogdGhhdCB0aGUgZXhwcmVzc2lvbnMg
Zm9yIGFwcHJveGltYXRpbmcgUEkvMiBhbmQgKFBJLzIgLSB6KSB3b3JrIGlu
IGFsbA0KKyAqIHJvdW5kaW5nIG1vZGVzLiAgVGhpcyBpcyBub3QgdmVyeSBp
bXBvcnRhbnQsIGJ1dCBpdCBpcyBuZWNlc3NhcnkgZm9yDQorICogdGhlIHNh
bWUgcXVhbGl0eSBvZiBpbXBsZW1lbnRhdGlvbiB0aGF0IGZkbGlibSBoYWQg
aW4gMTk5MiBhbmQgdGhhdA0KKyAqIHJlYWwgZnVuY3Rpb25zIG1vc3RseSBz
dGlsbCBoYXZlLiAgVGhpcyBpcyBrbm93biB0byBiZSBicm9rZW4gb25seSBp
bg0KKyAqIGxkODAgYWNvc2woKSB2aWEgaW52dHJpZy5jIGFuZCBpbiBzb21l
IGludmFsaWQgb3B0aW1pemF0aW9ucyBpbiBjb2RlDQorICogdW5kZXIgZGV2
ZWxvcG1lbnQsIGFuZCBub3cgaW4gYWxsIGZ1bmN0aW9ucyBpbiBjYXRyaWds
LmMgdmlhIGludnRyaWcuYy4NCisgKi8NCitwaW8yX2hpID0JCTEuNTcwNzk2
MzI2Nzk0ODk2NmUwLAkvKiAgMHgxOTIxZmI1NDQ0MmQxOC4wcC01MiAqLw0K
IFJFQ0lQX0VQU0lMT04gPQkJMS9EQkxfRVBTSUxPTiwNCi1TUVJUX0VQU0lM
T04gPQkJMHgxcC0yNywJCS8qIDw9IHNxcnQoREJMX0VQU0lMT04pICovIA0K
K1NRUlRfM19FUFNJTE9OID0JMi41ODA5NTY4Mjc5NTE3ODQ5ZS04LAkvKiAg
MHgxYmI2N2FlODU4NGNhYS4wcC03OCAqLw0KK1NRUlRfNl9FUFNJTE9OID0J
My42NTAwMjQxNDk5ODg4NTcxZS04LAkvKiAgMHgxMzk4OGUxNDA5MjEyZS4w
cC03NyAqLw0KIFNRUlRfTUlOID0JCTB4MXAtNTExOwkvKiA+PSBzcXJ0KERC
TF9NSU4pICovDQogDQogc3RhdGljIGNvbnN0IHZvbGF0aWxlIGRvdWJsZQ0K
K3BpbzJfbG8gPQkJNi4xMjMyMzM5OTU3MzY3NjU5ZS0xNywJLyogIDB4MTFh
NjI2MzMxNDVjMDcuMHAtMTA2ICovDQogdGlueSA9CQkJMHgxcC0xMDAwOw0K
IA0KQEAgLTIwNiw2ICsyMTcsNiBAQA0KIAkJICovDQogCQkqQl9pc191c2Fi
bGUgPSAwOw0KLQkJKnNxcnRfQTJteTIgPSBzY2FsYm4oQSwgREJMX01BTlRf
RElHKTsNCi0JCSpuZXdfeSA9IHNjYWxibih5LCBEQkxfTUFOVF9ESUcpOw0K
KwkJKnNxcnRfQTJteTIgPSBBICogKDIgLyBEQkxfRVBTSUxPTik7DQorCQkq
bmV3X3k9IHkgKiAoMiAvIERCTF9FUFNJTE9OKTsNCiAJCXJldHVybjsNCiAJ
fQ0KQEAgLTI0NCw2ICsyNTUsNyBAQA0KIAkJCSAqIHNjYWxpbmcgc2hvdWxk
IGF2b2lkIGFueSB1bmRlcmZsb3cgcHJvYmxlbXMuDQogCQkJICovDQotCQkJ
KnNxcnRfQTJteTIgPSBzY2FsYm4oeCwgMipEQkxfTUFOVF9ESUcpICogeSAv
IHNxcnQoKHkrMSkqKHktMSkpOw0KLQkJCSpuZXdfeSA9IHNjYWxibih5LCAy
KkRCTF9NQU5UX0RJRyk7DQorCQkJKnNxcnRfQTJteTIgPSB4ICogKDQvREJM
X0VQU0lMT04vREJMX0VQU0lMT04pICogeSAvDQorCQkJICAgIHNxcnQoKHkr
MSkqKHktMSkpOw0KKwkJCSpuZXdfeSA9IHkgKiAoNC9EQkxfRVBTSUxPTi9E
QkxfRVBTSUxPTik7DQogCQl9IGVsc2UgLyogaWYgKHkgPCAxKSAqLyB7DQog
CQkJLyoNCkBAIC0zMDMsOSArMzE1LDEyIEBADQogCX0NCiANCi0JLyogcmFp
c2UgaW5leGFjdCBpZiB6ICE9IDAuICovDQotCWlmICgoeCA9PSAwICYmIHkg
PT0gMCkgfHwgKGludCkoMSArIHRpbnkpICE9IDEpDQorCS8qIEF2b2lkIHNw
dXJpb3VzbHkgcmFpc2luZyBpbmV4YWN0IGZvciB6ID0gMC4gKi8NCisJaWYg
KHggPT0gMCAmJiB5ID09IDApDQogCQlyZXR1cm4gKHopOw0KIA0KLQlpZiAo
YXggPCBTUVJUX0VQU0lMT04gJiYgYXkgPCBTUVJUX0VQU0lMT04pDQorCS8q
IEFsbCByZW1haW5pbmcgY2FzZXMgYXJlIGluZXhhY3QuICovDQorCXJhaXNl
X2luZXhhY3QoKTsNCisNCisJaWYgKGF4IDwgU1FSVF82X0VQU0lMT04vNCAm
JiBheSA8IFNRUlRfNl9FUFNJTE9OLzQpDQogCQlyZXR1cm4gKHopOw0KIA0K
QEAgLTM2NCw1ICszNzksNSBAQA0KIAkJCXJldHVybiAoY3BhY2soeCt4LCAt
eSkpOw0KIAkJLyogY2Fjb3MoMCArIEkqTmFOKSA9IFBJLzIgKyBJKk5hTiB3
aXRoIGluZXhhY3QgKi8NCi0JCWlmICh4ID09IDApIHJldHVybiAoY3BhY2so
bV9waV8yICsgdGlueSwgeSt5KSk7DQorCQlpZiAoeCA9PSAwKSByZXR1cm4g
KGNwYWNrKHBpbzJfaGkgKyBwaW8yX2xvLCB5K3kpKTsNCiAJCS8qDQogCQkg
KiBBbGwgb3RoZXIgY2FzZXMgaW52b2x2aW5nIE5hTiByZXR1cm4gTmFOICsg
SSpOYU4uDQpAQCAtMzgzLDkgKzM5OCwxMiBAQA0KIAl9DQogDQotCS8qIHJh
aXNlIGluZXhhY3QgaWYgeiAhPSAxLiAqLw0KLQlpZiAoKHggPT0gMSAmJiB5
ID09IDApIHx8IChpbnQpKDEgKyB0aW55KSAhPSAxKQ0KKwkvKiBBdm9pZCBz
cHVyaW91c2x5IHJhaXNpbmcgaW5leGFjdCBmb3IgeiA9IDEuICovDQorCWlm
ICh4ID09IDEgJiYgeSA9PSAwKQ0KIAkJcmV0dXJuIChjcGFjaygwLCAteSkp
Ow0KIA0KLQlpZiAoYXggPCBTUVJUX0VQU0lMT04gJiYgYXkgPCBTUVJUX0VQ
U0lMT04pDQorCS8qIEFsbCByZW1haW5pbmcgY2FzZXMgYXJlIGluZXhhY3Qu
ICovDQorCXJhaXNlX2luZXhhY3QoKTsNCisNCisJaWYgKGF4IDwgU1FSVF82
X0VQU0lMT04vNCAmJiBheSA8IFNRUlRfNl9FUFNJTE9OLzQpDQogCQlyZXR1
cm4gKGNwYWNrKHBpbzJfaGkgLSAoeCAtIHBpbzJfbG8pLCAteSkpOw0KIA0K
QEAgLTUwMSwyOSArNTE5LDQwIEBADQogLyoNCiAgKiByZWFsX3BhcnRfcmVj
aXByb2NhbCh4LCB5KSA9IFJlKDEvKHgrSSp5KSkgPSB4Lyh4KnggKyB5Knkp
Lg0KLSAqIEFzc3VtZXMgeCBhbmQgeSBhcmUgcG9zaXRpdmUgb3IgemVybywg
YW5kIG9uZSBvZiB4IGFuZCB5IGlzIGxhcmdlciB0aGFuDQorICogQXNzdW1l
cyB4IGFuZCB5IGFyZSBub3QgTmFOLCBhbmQgb25lIG9mIHggYW5kIHkgaXMg
bGFyZ2VyIHRoYW4NCiAgKiBSRUNJUF9FUFNJTE9OLiAgV2UgYXZvaWQgdW53
YXJyYW50ZWQgdW5kZXJmbG93LiAgSXQgaXMgaW1wb3J0YW50IHRvIG5vdCB1
c2UNCiAgKiB0aGUgY29kZSBjcmVhbCgxL3opLCBiZWNhdXNlIHRoZSBpbWFn
aW5hcnkgcGFydCBtYXkgcHJvZHVjZSBhbiB1bndhbnRlZA0KICAqIHVuZGVy
Zmxvdy4NCisgKiBUaGlzIGlzIG9ubHkgY2FsbGVkIGluIGEgY29udGV4dCB3
aGVyZSBpbmV4YWN0IGlzIGFsd2F5cyByYWlzZWQgYmVmb3JlDQorICogdGhl
IGNhbGwsIHNvIG5vIGVmZm9ydCBpcyBtYWRlIHRvIGF2b2lkIG9yIGZvcmNl
IGluZXhhY3QuDQogICovDQogaW5saW5lIHN0YXRpYyBkb3VibGUNCiByZWFs
X3BhcnRfcmVjaXByb2NhbChkb3VibGUgeCwgZG91YmxlIHkpDQogew0KKwlk
b3VibGUgc2NhbGU7DQorCXVpbnQzMl90IGh4LCBoeTsNCisJaW50MzJfdCBp
eCwgaXk7DQorDQogCS8qDQogCSAqIFRoaXMgY29kZSBpcyBpbnNwaXJlZCBi
eSB0aGUgQzk5IGRvY3VtZW50IG4xMTI0LnBkZiwgU2VjdGlvbiBHLjUuMSwN
CiAJICogZXhhbXBsZSAyLg0KIAkgKi8NCi0JaW50IGV4LCBleTsNCi0NCi0J
aWYgKGlzaW5mKHgpIHx8IGlzaW5mKHkpKQ0KLQkJcmV0dXJuICgwKTsNCi0J
aWYgKHkgPT0gMCkgcmV0dXJuICgxL3gpOw0KLQlpZiAoeCA9PSAwKSByZXR1
cm4gKHgveS95KTsNCi0JZXggPSBpbG9nYih4KTsNCi0JZXkgPSBpbG9nYih5
KTsNCi0JaWYgKGV4IC0gZXkgPj0gREJMX01BTlRfRElHKSByZXR1cm4gKDEv
eCk7DQotCWlmIChleSAtIGV4ID49IERCTF9NQU5UX0RJRykgcmV0dXJuICh4
L3kveSk7DQotCXggPSBzY2FsYm4oeCwgLWV4KTsNCi0JeSA9IHNjYWxibih5
LCAtZXgpOw0KLQlyZXR1cm4gc2NhbGJuKHgvKHgqeCArIHkqeSksIC1leCk7
DQorCUdFVF9ISUdIX1dPUkQoaHgsIHgpOw0KKwlpeCA9IGh4ICYgMHg3ZmYw
MDAwMDsNCisJR0VUX0hJR0hfV09SRChoeSwgeSk7DQorCWl5ID0gaHkgJiAw
eDdmZjAwMDAwOw0KKyNkZWZpbmUJQklBUwkoREJMX01BWF9FWFAgLSAxKQ0K
Ky8qIFhYWCBtb3JlIGd1YXJkIGRpZ2l0cyBhcmUgdXNlZnVsIGlmZiB0aGVy
ZSBpcyBleHRyYSBwcmVjaXNpb24uICovDQorI2RlZmluZQlDVVRPRkYJKERC
TF9NQU5UX0RJRyAvIDIgKyAxKQkvKiBqdXN0IGhhbGYgb3IgMSBndWFyZCBk
aWdpdCAqLw0KKwlpZiAoaXggLSBpeSA+PSBDVVRPRkYgPDwgMjAgfHwgaXNp
bmYoeCkpDQorCQlyZXR1cm4gKDEveCk7CQkvKiArLUluZiAtPiArLTAgaXMg
c3BlY2lhbCAqLw0KKwlpZiAoaXkgLSBpeCA+PSBDVVRPRkYgPDwgMjApDQor
CQlyZXR1cm4gKHgveS95KTsJCS8qIHNob3VsZCBhdm9pZCBkb3VibGUgZGl2
LCBidXQgaGFyZCAqLw0KKwlpZiAoaXggPD0gKEJJQVMgKyBEQkxfTUFYX0VY
UCAvIDIgLSBDVVRPRkYpIDw8IDIwKQ0KKwkJcmV0dXJuICh4Lyh4KnggKyB5
KnkpKTsNCisJc2NhbGUgPSAwOw0KKwlTRVRfSElHSF9XT1JEKHNjYWxlLCAw
eDdmZjAwMDAwIC0gaXgpOwkvKiAyKiooMS1pbG9nYih4KSkgKi8NCisJeCAq
PSBzY2FsZTsNCisJeSAqPSBzY2FsZTsNCisJcmV0dXJuICh4Lyh4KnggKyB5
KnkpICogc2NhbGUpOw0KIH0NCiANCkBAIC01NTQsNyArNTgzLDcgQEANCiAJ
CXJldHVybiAoY3BhY2soYXRhbmgoeCksIHkpKTsgDQogDQotCS8qIHJhaXNl
IGluZXhhY3QgaWYgeiAhPSAwLiAqLw0KLQlpZiAoKHggPT0gMCAmJiB5ID09
IDApIHx8IChpbnQpKDEgKyB0aW55KSAhPSAxKQ0KLQkJcmV0dXJuICh6KTsN
CisJLyogVG8gZW5zdXJlIHRoZSBzYW1lIGFjY3VyYWN5IGFzIGF0YW4oKSwg
YW5kIHRvIGZpbHRlciBvdXQgeiA9IDAuICovDQorCWlmICh4ID09IDApDQor
CQlyZXR1cm4gKGNwYWNrKHgsIGF0YW4oeSkpKTsNCiANCiAJaWYgKGlzbmFu
KHgpIHx8IGlzbmFuKHkpKSB7DQpAQCAtNTY0LDUgKzU5Myw1IEBADQogCQkv
KiBjYXRhbmgoTmFOICsgSSorLUluZikgPSBzaWduKE5hTikwICsgSSorLVBJ
LzIgKi8NCiAJCWlmIChpc2luZih5KSkNCi0JCQlyZXR1cm4gKGNwYWNrKGNv
cHlzaWduKDAsIHgpLCBjb3B5c2lnbihtX3BpXzIsIHkpKSk7DQorCQkJcmV0
dXJuIChjcGFjayhjb3B5c2lnbigwLCB4KSwgY29weXNpZ24ocGlvMl9oaSAr
IHBpbzJfbG8sIHkpKSk7DQogCQkvKiBjYXRhbmgoKy0wICsgSSpOYU4pID0g
Ky0wICsgSSpOYU4gKi8NCiAJCWlmICh4ID09IDApDQpAQCAtNTc3LDEzICs2
MDYsMjIgQEANCiANCiAJaWYgKGF4ID4gUkVDSVBfRVBTSUxPTiB8fCBheSA+
IFJFQ0lQX0VQU0lMT04pDQotCQlyZXR1cm4gKGNwYWNrKGNvcHlzaWduKHJl
YWxfcGFydF9yZWNpcHJvY2FsKGF4LCBheSksIHgpLCBjb3B5c2lnbihtX3Bp
XzIsIHkpKSk7DQorCQlyZXR1cm4gKGNwYWNrKHJlYWxfcGFydF9yZWNpcHJv
Y2FsKHgsIHkpLCBjb3B5c2lnbihwaW8yX2hpICsgcGlvMl9sbywgeSkpKTsN
CiANCi0JaWYgKGF4IDwgU1FSVF9FUFNJTE9OICYmIGF5IDwgU1FSVF9FUFNJ
TE9OKQ0KKwlpZiAoYXggPCBTUVJUXzNfRVBTSUxPTi8yICYmIGF5IDwgU1FS
VF8zX0VQU0lMT04vMikgew0KKwkJLyoNCisJCSAqIHogPSAwIHdhcyBmaWx0
ZXJlZCBvdXQgYWJvdmUuICBBbGwgb3RoZXIgY2FzZXMgbXVzdCByYWlzZQ0K
KwkJICogaW5leGFjdCwgYnV0IHRoaXMgaXMgdGhlIG9ubHkgb25seSB0aGF0
IG5lZWRzIHRvIGRvIGl0DQorCQkgKiBleHBsaWNpdGx5Lg0KKwkJICovDQor
CQlyYWlzZV9pbmV4YWN0KCk7DQogCQlyZXR1cm4gKHopOw0KKwl9DQogDQog
CWlmIChheCA9PSAxICYmIGF5IDwgREJMX0VQU0lMT04pIHsNCisjaWYgMCAv
KiB0aGlzIG9ubHkgaW1wcm92ZXMgYWNjdXJhY3kgaW4gYW4gYWxyZWFkeSBy
ZWxhdGl2ZSBhY2N1cmF0ZSBjYXNlICovDQogCQlpZiAoYXkgPiAyKkRCTF9N
SU4pDQogCQkJcnggPSAtIGxvZyhheS8yKSAvIDI7DQogCQllbHNlDQorI2Vu
ZGlmDQogCQkJcnggPSAtIChsb2coYXkpIC0gbV9sbjIpIC8gMjsNCiAJfSBl
bHNlDQpAQCAtNTkyLDUgKzYzMCw1IEBADQogCWlmIChheCA9PSAxKQ0KIAkJ
cnkgPSBhdGFuMigyLCAtYXkpIC8gMjsNCi0JZWxzZSBpZiAoYXkgPCBGT1VS
X1NRUlRfTUlOKQ0KKwllbHNlIGlmIChheSA8IERCTF9FUFNJTE9OKQ0KIAkJ
cnkgPSBhdGFuMigyKmF5LCAoMS1heCkqKDErYXgpKSAvIDI7DQogCWVsc2UN
CmRpZmYgLXUyIGNhdHJpZ2YuY34gY2F0cmlnZi5jDQotLS0gY2F0cmlnZi5j
fgkyMDEyLTA5LTIxIDE1OjUxOjE2LjAwMDAwMDAwMCArMDAwMA0KKysrIGNh
dHJpZ2YuYwkyMDEyLTA5LTIxIDIxOjM0OjQxLjE0MDIzMTAwMCArMDAwMA0K
QEAgLTQ1LDQgKzQ1LDUgQEANCiAjdW5kZWYgaXNuYW4NCiAjZGVmaW5lIGlz
bmFuKHgpCSgoeCkgIT0gKHgpKQ0KKyNkZWZpbmUJcmFpc2VfaW5leGFjdCgp
CWRvIHsgdm9sYXRpbGUgaW50IGp1bmsgPSAxICsgdGlueTsgfSB3aGlsZSgw
KQ0KICN1bmRlZiBzaWduYml0DQogI2RlZmluZSBzaWduYml0KHgpCShfX2J1
aWx0aW5fc2lnbmJpdGYoeCkpDQpAQCAtNTUsMTIgKzU2LDEyIEBADQogbV9l
ID0JCQkyLjcxODI4MTgyODVlMCwJCS8qICAweGFkZjg1NC4wcC0yMiAqLw0K
IG1fbG4yID0JCQk2LjkzMTQ3MTgwNTZlLTEsCS8qICAweGIxNzIxOC4wcC0y
NCAqLw0KLW1fcGlfMiA9CQkxLjU3MDc5NjMyNjhlMCwJCS8qICAweGM5MGZk
Yi4wcC0yMyAqLw0KLXBpbzJfaGkgPQkJMS41NzA3OTYyNTEzZSswMCwJLyog
MHgzZmM5MGZkYSAqLw0KLXBpbzJfbG8gPQkJNy41NDk3ODk0MTU5ZS0wOCwJ
LyogMHgzM2EyMjE2OCAqLw0KK3BpbzJfaGkgPQkJMS41NzA3OTYyNTEzZTAs
CQkvKiAgMHhjOTBmZGEuMHAtMjMgKi8NCiBSRUNJUF9FUFNJTE9OID0JCTEv
RkxUX0VQU0lMT04sDQotU1FSVF9FUFNJTE9OID0JCTIwNDggKiBGTFRfRVBT
SUxPTiwNCitTUVJUXzNfRVBTSUxPTiA9CTUuOTgwMTk5NTY3M2UtNCwJLyog
IDB4OWNjNDcxLjBwLTM0ICovDQorU1FSVF82X0VQU0lMT04gPQk4LjQ1NzI3
OTMzMzhlLTQsCS8qICAweGRkYjNkNy4wcC0zNCAqLw0KIFNRUlRfTUlOID0J
CTB4MXAtNjM7DQogDQogc3RhdGljIGNvbnN0IHZvbGF0aWxlIGZsb2F0DQor
cGlvMl9sbyA9CQk3LjU0OTc4OTk1NDllLTgsCS8qICAweGEyMjE2OS4wcC00
NyAqLw0KIHRpbnkgPQkJCTB4MXAtMTAwOw0KIA0KQEAgLTEwOCw2ICsxMDks
NiBAQA0KIAlpZiAoeSA8IEZPVVJfU1FSVF9NSU4pIHsNCiAJCSpCX2lzX3Vz
YWJsZSA9IDA7DQotCQkqc3FydF9BMm15MiA9IHNjYWxibmYoQSwgRkxUX01B
TlRfRElHKTsNCi0JCSpuZXdfeSA9IHNjYWxibmYoeSwgRkxUX01BTlRfRElH
KTsNCisJCSpzcXJ0X0EybXkyID0gQSAqICgyIC8gRkxUX0VQU0lMT04pOw0K
KwkJKm5ld195PSB5ICogKDIgLyBGTFRfRVBTSUxPTik7DQogCQlyZXR1cm47
DQogCX0NCkBAIC0xMjQsNiArMTI1LDcgQEANCiAJCQkqc3FydF9BMm15MiA9
IHNxcnRmKEFteSooQSt5KSk7DQogCQl9IGVsc2UgaWYgKHkgPiAxKSB7DQot
CQkJKnNxcnRfQTJteTIgPSBzY2FsYm5mKHgsIDIqRkxUX01BTlRfRElHKSAq
IHkgLyBzcXJ0ZigoeSsxKSooeS0xKSk7DQotCQkJKm5ld195ID0gc2NhbGJu
Zih5LCAyKkZMVF9NQU5UX0RJRyk7DQorCQkJKnNxcnRfQTJteTIgPSB4ICog
KDQvRkxUX0VQU0lMT04vRkxUX0VQU0lMT04pICogeSAvDQorCQkJICAgIHNx
cnRmKCh5KzEpKih5LTEpKTsNCisJCQkqbmV3X3kgPSB5ICogKDQvRkxUX0VQ
U0lMT04vRkxUX0VQU0lMT04pOw0KIAkJfSBlbHNlIHsNCiAJCQkqc3FydF9B
Mm15MiA9IHNxcnRmKCgxLXkpKigxK3kpKTsNCkBAIC0xNjEsOCArMTYzLDEw
IEBADQogCX0NCiANCi0JaWYgKCh4ID09IDAgJiYgeSA9PSAwKSB8fCAoaW50
KSgxICsgdGlueSkgIT0gMSkNCisJaWYgKHggPT0gMCAmJiB5ID09IDApDQog
CQlyZXR1cm4gKHopOw0KIA0KLQlpZiAoYXggPCBTUVJUX0VQU0lMT04gJiYg
YXkgPCBTUVJUX0VQU0lMT04pDQorCXJhaXNlX2luZXhhY3QoKTsNCisNCisJ
aWYgKGF4IDwgU1FSVF82X0VQU0lMT04vNCAmJiBheSA8IFNRUlRfNl9FUFNJ
TE9OLzQpDQogCQlyZXR1cm4gKHopOw0KIA0KQEAgLTIwMiw1ICsyMDYsNSBA
QA0KIAkJaWYgKGlzaW5mKHkpKQ0KIAkJCXJldHVybiAoY3BhY2tmKHgreCwg
LXkpKTsNCi0JCWlmICh4ID09IDApIHJldHVybiAoY3BhY2tmKG1fcGlfMiAr
IHRpbnksIHkreSkpOw0KKwkJaWYgKHggPT0gMCkgcmV0dXJuIChjcGFja2Yo
cGlvMl9oaSArIHBpbzJfbG8sIHkreSkpOw0KIAkJcmV0dXJuIChjcGFja2Yo
eCswLjBMKyh5KzApLCB4KzAuMEwrKHkrMCkpKTsNCiAJfQ0KQEAgLTIxNSw4
ICsyMTksMTAgQEANCiAJfQ0KIA0KLQlpZiAoKHggPT0gMSAmJiB5ID09IDAp
IHx8IChpbnQpKDEgKyB0aW55KSAhPSAxKQ0KKwlpZiAoeCA9PSAxICYmIHkg
PT0gMCkNCiAJCXJldHVybiAoY3BhY2tmKDAsIC15KSk7DQogDQotCWlmIChh
eCA8IFNRUlRfRVBTSUxPTiAmJiBheSA8IFNRUlRfRVBTSUxPTikNCisJcmFp
c2VfaW5leGFjdCgpOw0KKw0KKwlpZiAoYXggPCBTUVJUXzZfRVBTSUxPTi80
ICYmIGF5IDwgU1FSVF82X0VQU0lMT04vNCkNCiAJCXJldHVybiAoY3BhY2tm
KHBpbzJfaGkgLSAoeCAtIHBpbzJfbG8pLCAteSkpOw0KIA0KQEAgLTI5Mywx
NyArMjk5LDI0IEBADQogcmVhbF9wYXJ0X3JlY2lwcm9jYWwoZmxvYXQgeCwg
ZmxvYXQgeSkNCiB7DQotCWludCBleCwgZXk7DQotDQotCWlmIChpc2luZih4
KSB8fCBpc2luZih5KSkNCi0JCXJldHVybiAoMCk7DQotCWlmICh5ID09IDAp
IHJldHVybiAoMS94KTsNCi0JaWYgKHggPT0gMCkgcmV0dXJuICh4L3kveSk7
DQotCWV4ID0gaWxvZ2JmKHgpOw0KLQlleSA9IGlsb2diZih5KTsNCi0JaWYg
KGV4IC0gZXkgPj0gRkxUX01BTlRfRElHKSByZXR1cm4gKDEveCk7DQotCWlm
IChleSAtIGV4ID49IEZMVF9NQU5UX0RJRykgcmV0dXJuICh4L3kveSk7DQot
CXggPSBzY2FsYm5mKHgsIC1leCk7DQotCXkgPSBzY2FsYm5mKHksIC1leCk7
DQotCXJldHVybiBzY2FsYm5mKHgvKHgqeCArIHkqeSksIC1leCk7DQorCWZs
b2F0IHNjYWxlOw0KKwl1aW50MzJfdCBoeCwgaHk7DQorCWludDMyX3QgaXgs
IGl5Ow0KKw0KKwlHRVRfRkxPQVRfV09SRChoeCwgeCk7DQorCWl4ID0gaHgg
JiAweDdmODAwMDAwOw0KKwlHRVRfRkxPQVRfV09SRChoeSwgeSk7DQorCWl5
ID0gaHkgJiAweDdmODAwMDAwOw0KKyNkZWZpbmUJQklBUwkoRkxUX01BWF9F
WFAgLSAxKQ0KKyNkZWZpbmUJQ1VUT0ZGCShGTFRfTUFOVF9ESUcgLyAyICsg
MSkNCisJaWYgKGl4IC0gaXkgPj0gQ1VUT0ZGIDw8IDIzIHx8IGlzaW5mKHgp
KQ0KKwkJcmV0dXJuICgxL3gpOw0KKwlpZiAoaXkgLSBpeCA+PSBDVVRPRkYg
PDwgMjMpDQorCQlyZXR1cm4gKHgveS95KTsNCisJaWYgKGl4IDw9IChCSUFT
ICsgRkxUX01BWF9FWFAgLyAyIC0gQ1VUT0ZGKSA8PCAyMykNCisJCXJldHVy
biAoeC8oeCp4ICsgeSp5KSk7DQorCVNFVF9GTE9BVF9XT1JEKHNjYWxlLCAw
eDdmODAwMDAwIC0gaXgpOw0KKwl4ICo9IHNjYWxlOw0KKwl5ICo9IHNjYWxl
Ow0KKwlyZXR1cm4gKHgvKHgqeCArIHkqeSkgKiBzY2FsZSk7DQogfQ0KIA0K
QEAgLTMyMSw2ICszMzQsNiBAQA0KIAkJcmV0dXJuIChjcGFja2YoYXRhbmhm
KHgpLCB5KSk7IA0KIA0KLQlpZiAoKHggPT0gMCAmJiB5ID09IDApIHx8IChp
bnQpKDEgKyB0aW55KSAhPSAxKQ0KLQkJcmV0dXJuICh6KTsNCisJaWYgKHgg
PT0gMCkNCisJCXJldHVybiAoY3BhY2tmKHgsIGF0YW5mKHkpKSk7DQogDQog
CWlmIChpc25hbih4KSB8fCBpc25hbih5KSkgew0KQEAgLTMyOCw1ICszNDEs
NSBAQA0KIAkJCXJldHVybiAoY3BhY2tmKGNvcHlzaWduZigwLCB4KSwgeSt5
KSk7DQogCQlpZiAoaXNpbmYoeSkpDQotCQkJcmV0dXJuIChjcGFja2YoY29w
eXNpZ25mKDAsIHgpLCBjb3B5c2lnbmYobV9waV8yLCB5KSkpOw0KKwkJCXJl
dHVybiAoY3BhY2tmKGNvcHlzaWduZigwLCB4KSwgY29weXNpZ25mKHBpbzJf
aGkgKyBwaW8yX2xvLCB5KSkpOw0KIAkJaWYgKHggPT0gMCkNCiAJCQlyZXR1
cm4gKGNwYWNrZih4LCB5K3kpKTsNCkBAIC0zMzUsMTMgKzM0OCwxNyBAQA0K
IA0KIAlpZiAoYXggPiBSRUNJUF9FUFNJTE9OIHx8IGF5ID4gUkVDSVBfRVBT
SUxPTikNCi0JCXJldHVybiAoY3BhY2tmKGNvcHlzaWduZihyZWFsX3BhcnRf
cmVjaXByb2NhbChheCwgYXkpLCB4KSwgY29weXNpZ25mKG1fcGlfMiwgeSkp
KTsNCisJCXJldHVybiAoY3BhY2tmKHJlYWxfcGFydF9yZWNpcHJvY2FsKHgs
IHkpLCBjb3B5c2lnbmYocGlvMl9oaSArIHBpbzJfbG8sIHkpKSk7DQogDQot
CWlmIChheCA8IFNRUlRfRVBTSUxPTiAmJiBheSA8IFNRUlRfRVBTSUxPTikN
CisJaWYgKGF4IDwgU1FSVF8zX0VQU0lMT04vMiAmJiBheSA8IFNRUlRfM19F
UFNJTE9OLzIpIHsNCisJCXJhaXNlX2luZXhhY3QoKTsNCiAJCXJldHVybiAo
eik7DQorCX0NCiANCiAJaWYgKGF4ID09IDEgJiYgYXkgPCBGTFRfRVBTSUxP
Tikgew0KKyNpZiAwDQogCQlpZiAoYXkgPiAyKkZMVF9NSU4pDQogCQkJcngg
PSAtIGxvZ2YoYXkvMikgLyAyOw0KIAkJZWxzZQ0KKyNlbmRpZg0KIAkJCXJ4
ID0gLSAobG9nZihheSkgLSBtX2xuMikgLyAyOw0KIAl9IGVsc2UNCkBAIC0z
NTAsNSArMzY3LDUgQEANCiAJaWYgKGF4ID09IDEpDQogCQlyeSA9IGF0YW4y
ZigyLCAtYXkpIC8gMjsNCi0JZWxzZSBpZiAoYXkgPCBGT1VSX1NRUlRfTUlO
KQ0KKwllbHNlIGlmIChheSA8IEZMVF9FUFNJTE9OKQ0KIAkJcnkgPSBhdGFu
MmYoMipheSwgKDEtYXgpKigxK2F4KSkgLyAyOw0KIAllbHNlDQpkaWZmIC11
MiBjYXRyaWdsLmN+IGNhdHJpZ2wuYw0KLS0tIGNhdHJpZ2wuY34JMjAxMi0w
OS0yMSAxNjoyMjo0MC4wMDAwMDAwMDAgKzAwMDANCisrKyBjYXRyaWdsLmMJ
MjAxMi0wOS0yMSAyMToxNzo0Ni45NjI2OTgwMDAgKzAwMDANCkBAIC0zOCw1
ICszOCw0IEBADQogI2luY2x1ZGUgPGZsb2F0Lmg+DQogDQotI2luY2x1ZGUg
ImZwbWF0aC5oIg0KICNpbmNsdWRlICJpbnZ0cmlnLmgiDQogI2luY2x1ZGUg
Im1hdGguaCINCkBAIC00Nyw0ICs0Niw1IEBADQogI3VuZGVmIGlzbmFuDQog
I2RlZmluZSBpc25hbih4KQkoKHgpICE9ICh4KSkNCisjZGVmaW5lCXJhaXNl
X2luZXhhY3QoKQlkbyB7IHZvbGF0aWxlIGludCBqdW5rID0gMSArIHRpbnk7
IH0gd2hpbGUoMCkNCiAjdW5kZWYgc2lnbmJpdA0KICNkZWZpbmUgc2lnbmJp
dCh4KQkoX19idWlsdGluX3NpZ25iaXRsKHgpKSANCkBAIC01Niw1ICs1Niw0
IEBADQogUVVBUlRFUl9TUVJUX01BWCA9CTB4MXA4MTg5TCwNCiBSRUNJUF9F
UFNJTE9OID0JCTEvTERCTF9FUFNJTE9OLA0KLVNRUlRfRVBTSUxPTiA9CQkx
RS0xMEwsDQogU1FSVF9NSU4gPQkJMHgxcC04MTkxTDsNCiANCkBAIC02Miwx
NCArNjEsMTcgQEANCiBzdGF0aWMgY29uc3QgdW5pb24gSUVFRWwyYml0cw0K
IHVtX2UgPQkJTEQ4MEMoMHhhZGY4NTQ1OGEyYmI0YTliLCAgMSwgMCwgMi43
MTgyODE4Mjg0NTkwNDUyMzUzNmUwTCksDQotdW1fbG4yID0JTEQ4MEMoMHhi
MTcyMTdmN2QxY2Y3OWFjLCAtMSwgMCwgNi45MzE0NzE4MDU1OTk0NTMwOTQx
N2UtMUwpLA0KLXVtX3BpXzIgPQlMRDgwQygweGM5MGZkYWEyMjE2OGMyMzUs
ICAwLCAwLCAxLjU3MDc5NjMyNjc5NDg5NjYxOTIzZTBMKTsNCit1bV9sbjIg
PQlMRDgwQygweGIxNzIxN2Y3ZDFjZjc5YWMsIC0xLCAwLCA2LjkzMTQ3MTgw
NTU5OTQ1MzA5NDE3ZS0xTCk7DQogI2RlZmluZQkJbV9lCXVtX2UuZQ0KICNk
ZWZpbmUJCW1fbG4yCXVtX2xuMi5lDQotI2RlZmluZQkJbV9waV8yCXVtX3Bp
XzIuZQ0KK3N0YXRpYyBjb25zdCBsb25nIGRvdWJsZQ0KKy8qIFRoZSBuZXh0
IDIgbGl0ZXJhbHMgZm9yIG5vbi1pMzg2LiAgTWlzcm91bmRpbmcgdGhlbSBv
biBpMzg2IGlzIGhhcm1sZXNzLiAqLw0KK1NRUlRfM19FUFNJTE9OID0gNS43
MDMxNjI3MzQzNTc1ODkxNTMxMGUtMTAsCS8qICAweDljYzQ3MGEwNDkwOTcz
ZTguMHAtOTQgKi8NCitTUVJUXzZfRVBTSUxPTiA9IDguMDY1NDkwMDg3MzQ5
MzI3NzE2NjRlLTEwOwkvKiAgMHhkZGIzZDc0MmMyNjU1MzllLjBwLTk0ICov
DQogI2VsaWYgTERCTF9NQU5UX0RJRyA9PSAxMTMNCiBzdGF0aWMgY29uc3Qg
bG9uZyBkb3VibGUNCiBtX2UgPQkJMi43MTgyODE4Mjg0NTkwNDUyMzUzNjAy
ODc0NzEzNTI2NjI1MGUwTCwJLyogMHgxNWJmMGE4YjE0NTc2OTUzNTVmYjhh
YzQwNGU3YS4wcC0xMTEgKi8NCiBtX2xuMiA9CQk2LjkzMTQ3MTgwNTU5OTQ1
MzA5NDE3MjMyMTIxNDU4MTc2NTY4ZS0xTCwJLyogMHgxNjJlNDJmZWZhMzll
ZjM1NzkzYzc2NzMwMDdlNi4wcC0xMTMgKi8NCi1tX3BpXzIgPQkxLjU3MDc5
NjMyNjc5NDg5NjYxOTIzMTMyMTY5MTYzOTc1MTQ0ZTBMOwkvKiAweDE5MjFm
YjU0NDQyZDE4NDY5ODk4Y2M1MTcwMWI4LjBwLTExMiAqLw0KK1NRUlRfM19F
UFNJTE9OID0gMi40MDM3MDMzNTc5Nzk0NTQ5MDk3NTMzNjcyNzE5OTg3ODEy
NGUtMTcsCS8qICAweDFiYjY3YWU4NTg0Y2FhNzNiMjU3NDJkNzA3OGI4LjBw
LTE2OCAqLw0KK1NRUlRfNl9FUFNJTE9OID0gMy4zOTkzNDk4ODg3NzYyOTU4
NzIzOTA4MjU4NjIyMzMwMDM5MWUtMTc7CS8qICAweDEzOTg4ZTE0MDkyMTJl
N2QwMzIxOTE0MzIxYTU1LjBwLTE2NyAqLw0KICNlbHNlDQogI2Vycm9yICJV
bnN1cHBvcnRlZCBsb25nIGRvdWJsZSBmb3JtYXQiDQpAQCAtMTIyLDYgKzEy
NCw2IEBADQogCWlmICh5IDwgRk9VUl9TUVJUX01JTikgew0KIAkJKkJfaXNf
dXNhYmxlID0gMDsNCi0JCSpzcXJ0X0EybXkyID0gc2NhbGJubChBLCBMREJM
X01BTlRfRElHKTsNCi0JCSpuZXdfeSA9IHNjYWxibmwoeSwgTERCTF9NQU5U
X0RJRyk7DQorCQkqc3FydF9BMm15MiA9IEEgKiAoMiAvIExEQkxfRVBTSUxP
Tik7DQorCQkqbmV3X3k9IHkgKiAoMiAvIExEQkxfRVBTSUxPTik7DQogCQly
ZXR1cm47DQogCX0NCkBAIC0xMzgsNiArMTQwLDcgQEANCiAJCQkqc3FydF9B
Mm15MiA9IHNxcnRsKEFteSooQSt5KSk7DQogCQl9IGVsc2UgaWYgKHkgPiAx
KSB7DQotCQkJKnNxcnRfQTJteTIgPSBzY2FsYm5sKHgsIDIqTERCTF9NQU5U
X0RJRykgKiB5IC8gc3FydGwoKHkrMSkqKHktMSkpOw0KLQkJCSpuZXdfeSA9
IHNjYWxibmwoeSwgMipMREJMX01BTlRfRElHKTsNCisJCQkqc3FydF9BMm15
MiA9IHggKiAoNC9MREJMX0VQU0lMT04vTERCTF9FUFNJTE9OKSAqIHkgLw0K
KwkJCSAgICBzcXJ0bCgoeSsxKSooeS0xKSk7DQorCQkJKm5ld195ID0geSAq
ICg0L0xEQkxfRVBTSUxPTi9MREJMX0VQU0lMT04pOw0KIAkJfSBlbHNlIHsN
CiAJCQkqc3FydF9BMm15MiA9IHNxcnRsKCgxLXkpKigxK3kpKTsNCkBAIC0x
NzUsOCArMTc4LDEwIEBADQogCX0NCiANCi0JaWYgKCh4ID09IDAgJiYgeSA9
PSAwKSB8fCAoaW50KSgxICsgdGlueSkgIT0gMSkNCisJaWYgKHggPT0gMCAm
JiB5ID09IDApDQogCQlyZXR1cm4gKHopOw0KIA0KLQlpZiAoYXggPCBTUVJU
X0VQU0lMT04gJiYgYXkgPCBTUVJUX0VQU0lMT04pDQorCXJhaXNlX2luZXhh
Y3QoKTsNCisNCisJaWYgKGF4IDwgU1FSVF82X0VQU0lMT04vNCAmJiBheSA8
IFNRUlRfNl9FUFNJTE9OLzQpDQogCQlyZXR1cm4gKHopOw0KIA0KQEAgLTIx
Niw1ICsyMjEsNSBAQA0KIAkJaWYgKGlzaW5mKHkpKQ0KIAkJCXJldHVybiAo
Y3BhY2tsKHgreCwgLXkpKTsNCi0JCWlmICh4ID09IDApIHJldHVybiAoY3Bh
Y2tsKG1fcGlfMiArIHRpbnksIHkreSkpOw0KKwkJaWYgKHggPT0gMCkgcmV0
dXJuIChjcGFja2wocGlvMl9oaSArIHBpbzJfbG8sIHkreSkpOw0KIAkJcmV0
dXJuIChjcGFja2woeCswLjBMKyh5KzApLCB4KzAuMEwrKHkrMCkpKTsNCiAJ
fQ0KQEAgLTIyOSw4ICsyMzQsMTAgQEANCiAJfQ0KIA0KLQlpZiAoKHggPT0g
MSAmJiB5ID09IDApIHx8IChpbnQpKDEgKyB0aW55KSAhPSAxKQ0KKwlpZiAo
eCA9PSAxICYmIHkgPT0gMCkNCiAJCXJldHVybiAoY3BhY2tsKDAsIC15KSk7
DQogDQotCWlmIChheCA8IFNRUlRfRVBTSUxPTiAmJiBheSA8IFNRUlRfRVBT
SUxPTikNCisJcmFpc2VfaW5leGFjdCgpOw0KKw0KKwlpZiAoYXggPCBTUVJU
XzZfRVBTSUxPTi80ICYmIGF5IDwgU1FSVF82X0VQU0lMT04vNCkNCiAJCXJl
dHVybiAoY3BhY2tsKHBpbzJfaGkgLSAoeCAtIHBpbzJfbG8pLCAteSkpOw0K
IA0KQEAgLTMwNywxNyArMzE0LDI0IEBADQogcmVhbF9wYXJ0X3JlY2lwcm9j
YWwobG9uZyBkb3VibGUgeCwgbG9uZyBkb3VibGUgeSkNCiB7DQotCWludCBl
eCwgZXk7DQotDQotCWlmIChpc2luZih4KSB8fCBpc2luZih5KSkNCi0JCXJl
dHVybiAoMCk7DQotCWlmICh5ID09IDApIHJldHVybiAoMS94KTsNCi0JaWYg
KHggPT0gMCkgcmV0dXJuICh4L3kveSk7DQotCWV4ID0gaWxvZ2JsKHgpOw0K
LQlleSA9IGlsb2dibCh5KTsNCi0JaWYgKGV4IC0gZXkgPj0gTERCTF9NQU5U
X0RJRykgcmV0dXJuICgxL3gpOw0KLQlpZiAoZXkgLSBleCA+PSBMREJMX01B
TlRfRElHKSByZXR1cm4gKHgveS95KTsNCi0JeCA9IHNjYWxibmwoeCwgLWV4
KTsNCi0JeSA9IHNjYWxibmwoeSwgLWV4KTsNCi0JcmV0dXJuIHNjYWxibmwo
eC8oeCp4ICsgeSp5KSwgLWV4KTsNCisJbG9uZyBkb3VibGUgc2NhbGU7DQor
CXVpbnQxNl90IGh4LCBoeTsNCisJaW50MTZfdCBpeCwgaXk7DQorDQorCUdF
VF9MREJMX0VYUFNJR04oaHgsIHgpOw0KKwlpeCA9IGh4ICYgMHg3ZmZmOw0K
KwlHRVRfTERCTF9FWFBTSUdOKGh5LCB5KTsNCisJaXkgPSBoeSAmIDB4N2Zm
ZjsNCisjZGVmaW5lCUJJQVMJKExEQkxfTUFYX0VYUCAtIDEpDQorI2RlZmlu
ZQlDVVRPRkYJKExEQkxfTUFOVF9ESUcgLyAyICsgMSkNCisJaWYgKGl4IC0g
aXkgPj0gQ1VUT0ZGIHx8IGlzaW5mKHgpKQ0KKwkJcmV0dXJuICgxL3gpOw0K
KwlpZiAoaXkgLSBpeCA+PSBDVVRPRkYpDQorCQlyZXR1cm4gKHgveS95KTsN
CisJaWYgKGl4IDw9IEJJQVMgKyBMREJMX01BWF9FWFAgLyAyIC0gQ1VUT0ZG
KQ0KKwkJcmV0dXJuICh4Lyh4KnggKyB5KnkpKTsNCisJU0VUX0xEQkxfRVhQ
U0lHTihzY2FsZSwgMHg3ZmZmIC0gaXgpOw0KKwl4ICo9IHNjYWxlOw0KKwl5
ICo9IHNjYWxlOw0KKwlyZXR1cm4gKHgvKHgqeCArIHkqeSkgKiBzY2FsZSk7
DQogfQ0KIA0KQEAgLTMzMyw4ICszNDcsOCBAQA0KIA0KIAlpZiAoeSA9PSAw
ICYmIGF4IDw9IDEpDQotCQlyZXR1cm4gKGNwYWNrbChhdGFuaGwoeCksIHkp
KTsgDQorCQlyZXR1cm4gKGNwYWNrbChhdGFuaCh4KSwgeSkpOyAJLyogWFhY
IG5lZWQgYXRhbmhsKCkgKi8NCiANCi0JaWYgKCh4ID09IDAgJiYgeSA9PSAw
KSB8fCAoaW50KSgxICsgdGlueSkgIT0gMSkNCi0JCXJldHVybiAoeik7DQor
CWlmICh4ID09IDApDQorCQlyZXR1cm4gKGNwYWNrbCh4LCBhdGFubCh5KSkp
Ow0KIA0KIAlpZiAoaXNuYW4oeCkgfHwgaXNuYW4oeSkpIHsNCkBAIC0zNDIs
NSArMzU2LDUgQEANCiAJCQlyZXR1cm4gKGNwYWNrbChjb3B5c2lnbmwoMCwg
eCksIHkreSkpOw0KIAkJaWYgKGlzaW5mKHkpKQ0KLQkJCXJldHVybiAoY3Bh
Y2tsKGNvcHlzaWdubCgwLCB4KSwgY29weXNpZ25sKG1fcGlfMiwgeSkpKTsN
CisJCQlyZXR1cm4gKGNwYWNrbChjb3B5c2lnbmwoMCwgeCksIGNvcHlzaWdu
bChwaW8yX2hpICsgcGlvMl9sbywgeSkpKTsNCiAJCWlmICh4ID09IDApDQog
CQkJcmV0dXJuIChjcGFja2woeCwgeSt5KSk7DQpAQCAtMzQ5LDEzICszNjMs
MTcgQEANCiANCiAJaWYgKGF4ID4gUkVDSVBfRVBTSUxPTiB8fCBheSA+IFJF
Q0lQX0VQU0lMT04pDQotCQlyZXR1cm4gKGNwYWNrbChjb3B5c2lnbmwocmVh
bF9wYXJ0X3JlY2lwcm9jYWwoYXgsIGF5KSwgeCksIGNvcHlzaWdubChtX3Bp
XzIsIHkpKSk7DQorCQlyZXR1cm4gKGNwYWNrbChyZWFsX3BhcnRfcmVjaXBy
b2NhbCh4LCB5KSwgY29weXNpZ25sKHBpbzJfaGkgKyBwaW8yX2xvLCB5KSkp
Ow0KIA0KLQlpZiAoYXggPCBTUVJUX0VQU0lMT04gJiYgYXkgPCBTUVJUX0VQ
U0lMT04pDQorCWlmIChheCA8IFNRUlRfM19FUFNJTE9OLzIgJiYgYXkgPCBT
UVJUXzNfRVBTSUxPTi8yKSB7DQorCQlyYWlzZV9pbmV4YWN0KCk7DQogCQly
ZXR1cm4gKHopOw0KKwl9DQogDQogCWlmIChheCA9PSAxICYmIGF5IDwgTERC
TF9FUFNJTE9OKSB7DQorI2lmIDANCiAJCWlmIChheSA+IDIqTERCTF9NSU4p
DQogCQkJcnggPSAtIGxvZ2woYXkvMikgLyAyOw0KIAkJZWxzZQ0KKyNlbmRp
Zg0KIAkJCXJ4ID0gLSAobG9nbChheSkgLSBtX2xuMikgLyAyOw0KIAl9IGVs
c2UNCkBAIC0zNjQsNSArMzgyLDUgQEANCiAJaWYgKGF4ID09IDEpDQogCQly
eSA9IGF0YW4ybCgyLCAtYXkpIC8gMjsNCi0JZWxzZSBpZiAoYXkgPCBGT1VS
X1NRUlRfTUlOKQ0KKwllbHNlIGlmIChheSA8IExEQkxfRVBTSUxPTikNCiAJ
CXJ5ID0gYXRhbjJsKDIqYXksICgxLWF4KSooMStheCkpIC8gMjsNCiAJZWxz
ZQ0K

--0-46617504-1348269236=:3613--

From owner-freebsd-numerics@FreeBSD.ORG  Fri Sep 21 23:18:58 2012
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id BD1DA106566B
	for <freebsd-numerics@FreeBSD.org>;
	Fri, 21 Sep 2012 23:18:58 +0000 (UTC)
	(envelope-from brde@optusnet.com.au)
Received: from mail12.syd.optusnet.com.au (mail12.syd.optusnet.com.au
	[211.29.132.193])
	by mx1.freebsd.org (Postfix) with ESMTP id 30EA18FC0A
	for <freebsd-numerics@FreeBSD.org>;
	Fri, 21 Sep 2012 23:18:57 +0000 (UTC)
Received: from c122-106-157-84.carlnfd1.nsw.optusnet.com.au
	(c122-106-157-84.carlnfd1.nsw.optusnet.com.au [122.106.157.84])
	by mail12.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id
	q8LNIs3a029618
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Sat, 22 Sep 2012 09:18:56 +1000
Date: Sat, 22 Sep 2012 09:18:54 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Bruce Evans <brde@optusnet.com.au>
In-Reply-To: <20120922081607.F3613@besplex.bde.org>
Message-ID: <20120922091625.Y3828@besplex.bde.org>
References: <5017111E.6060003@missouri.edu>
	<20120915231032.C2669@besplex.bde.org>
	<50548E15.3010405@missouri.edu> <5054C027.2040008@missouri.edu>
	<5054C200.7090307@missouri.edu> <20120916041132.D6344@besplex.bde.org>
	<50553424.2080902@missouri.edu> <20120916134730.Y957@besplex.bde.org>
	<5055ECA8.2080008@missouri.edu> <20120917022614.R2943@besplex.bde.org>
	<50562213.9020400@missouri.edu> <20120917060116.G3825@besplex.bde.org>
	<50563C57.60806@missouri.edu> <20120918012459.V5094@besplex.bde.org>
	<5057A932.3000603@missouri.edu> <5057F24B.7020605@missouri.edu>
	<20120918162105.U991@besplex.bde.org>
	<20120918232850.N2144@besplex.bde.org>
	<20120919010613.T2493@besplex.bde.org> <505BD9B4.8020801@missouri.edu>
	<20120921172402.W945@besplex.bde.org>
	<20120921212525.W1732@besplex.bde.org>
	<505C7490.90600@missouri.edu> <20120922042112.E3044@besplex.bde.org>
	<505CBF14.70908@missouri.edu> <505CC11A.5030502@missouri.edu>
	<20120922081607.F3613@besplex.bde.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: Stephen Montgomery-Smith <stephen@missouri.edu>,
	freebsd-numerics@FreeBSD.org
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
	<freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 21 Sep 2012 23:18:58 -0000

On Sat, 22 Sep 2012, Bruce Evans wrote:

> ...
> Patches tomorrow.  Well, the main new one now, for all 3 files since
> part of it has lots of magic numbers which are not handled by the
> conversion scripts.
> ...
> The patch is also attached.

The attachment was larger than intended.  It had my complete patch set
for catrig*.c.

Bruce

From owner-freebsd-numerics@FreeBSD.ORG  Sat Sep 22 01:11:21 2012
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 90DDD106564A
	for <freebsd-numerics@FreeBSD.org>;
	Sat, 22 Sep 2012 01:11:21 +0000 (UTC)
	(envelope-from stephen@missouri.edu)
Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu
	[128.206.184.213])
	by mx1.freebsd.org (Postfix) with ESMTP id 492C98FC0C
	for <freebsd-numerics@FreeBSD.org>;
	Sat, 22 Sep 2012 01:11:20 +0000 (UTC)
Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213])
	by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id
	q8M1BJoF092260; Fri, 21 Sep 2012 20:11:19 -0500 (CDT)
	(envelope-from stephen@missouri.edu)
Message-ID: <505D1037.8010202@missouri.edu>
Date: Fri, 21 Sep 2012 20:11:19 -0500
From: Stephen Montgomery-Smith <stephen@missouri.edu>
User-Agent: Mozilla/5.0 (X11; Linux i686;
	rv:15.0) Gecko/20120827 Thunderbird/15.0
MIME-Version: 1.0
To: Bruce Evans <brde@optusnet.com.au>
References: <5017111E.6060003@missouri.edu>
	<20120915231032.C2669@besplex.bde.org>
	<50548E15.3010405@missouri.edu> <5054C027.2040008@missouri.edu>
	<5054C200.7090307@missouri.edu>
	<20120916041132.D6344@besplex.bde.org>
	<50553424.2080902@missouri.edu>
	<20120916134730.Y957@besplex.bde.org>
	<5055ECA8.2080008@missouri.edu>
	<20120917022614.R2943@besplex.bde.org>
	<50562213.9020400@missouri.edu>
	<20120917060116.G3825@besplex.bde.org>
	<50563C57.60806@missouri.edu>
	<20120918012459.V5094@besplex.bde.org>
	<5057A932.3000603@missouri.edu> <5057F24B.7020605@missouri.edu>
	<20120918162105.U991@besplex.bde.org>
	<20120918232850.N2144@besplex.bde.org>
	<20120919010613.T2493@besplex.bde.org>
	<505BD9B4.8020801@missouri.edu>
	<20120921172402.W945@besplex.bde.org>
	<20120921212525.W1732@besplex.bde.org>
	<505C7490.90600@missouri.edu>
	<20120922042112.E3044@besplex.bde.org>
	<505CBF14.70908@missouri.edu> <505CC11A.5030502@missouri.edu>
	<20120922081607.F3613@besplex.bde.org>
	<20120922091625.Y3828@besplex.bde.org>
In-Reply-To: <20120922091625.Y3828@besplex.bde.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-numerics@FreeBSD.org
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
	<freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 22 Sep 2012 01:11:21 -0000

On 09/21/2012 06:18 PM, Bruce Evans wrote:
> On Sat, 22 Sep 2012, Bruce Evans wrote:
>
>> ...
>> Patches tomorrow.  Well, the main new one now, for all 3 files since
>> part of it has lots of magic numbers which are not handled by the
>> conversion scripts.
>> ...
>> The patch is also attached.
>
> The attachment was larger than intended.  It had my complete patch set
> for catrig*.c.
>
> Bruce
>
>

Will there be another complete patch set tomorrow, or did you just send 
it today?

From owner-freebsd-numerics@FreeBSD.ORG  Sat Sep 22 04:28:51 2012
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id D81D1106564A
	for <freebsd-numerics@FreeBSD.org>;
	Sat, 22 Sep 2012 04:28:51 +0000 (UTC)
	(envelope-from brde@optusnet.com.au)
Received: from mail05.syd.optusnet.com.au (mail05.syd.optusnet.com.au
	[211.29.132.186])
	by mx1.freebsd.org (Postfix) with ESMTP id 673338FC08
	for <freebsd-numerics@FreeBSD.org>;
	Sat, 22 Sep 2012 04:28:51 +0000 (UTC)
Received: from c122-106-157-84.carlnfd1.nsw.optusnet.com.au
	(c122-106-157-84.carlnfd1.nsw.optusnet.com.au [122.106.157.84])
	by mail05.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id
	q8M4SmGk030895
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Sat, 22 Sep 2012 14:28:49 +1000
Date: Sat, 22 Sep 2012 14:28:48 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Stephen Montgomery-Smith <stephen@missouri.edu>
In-Reply-To: <505D1037.8010202@missouri.edu>
Message-ID: <20120922142349.X4599@besplex.bde.org>
References: <5017111E.6060003@missouri.edu> <5054C200.7090307@missouri.edu>
	<20120916041132.D6344@besplex.bde.org> <50553424.2080902@missouri.edu>
	<20120916134730.Y957@besplex.bde.org> <5055ECA8.2080008@missouri.edu>
	<20120917022614.R2943@besplex.bde.org> <50562213.9020400@missouri.edu>
	<20120917060116.G3825@besplex.bde.org> <50563C57.60806@missouri.edu>
	<20120918012459.V5094@besplex.bde.org> <5057A932.3000603@missouri.edu>
	<5057F24B.7020605@missouri.edu> <20120918162105.U991@besplex.bde.org>
	<20120918232850.N2144@besplex.bde.org>
	<20120919010613.T2493@besplex.bde.org>
	<505BD9B4.8020801@missouri.edu> <20120921172402.W945@besplex.bde.org>
	<20120921212525.W1732@besplex.bde.org> <505C7490.90600@missouri.edu>
	<20120922042112.E3044@besplex.bde.org> <505CBF14.70908@missouri.edu>
	<505CC11A.5030502@missouri.edu> <20120922081607.F3613@besplex.bde.org>
	<20120922091625.Y3828@besplex.bde.org>
	<505D1037.8010202@missouri.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-numerics@FreeBSD.org
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
	<freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 22 Sep 2012 04:28:52 -0000

On Fri, 21 Sep 2012, Stephen Montgomery-Smith wrote:

> On 09/21/2012 06:18 PM, Bruce Evans wrote:
>> On Sat, 22 Sep 2012, Bruce Evans wrote:
>> 
>>> ...
>>> Patches tomorrow.  Well, the main new one now, for all 3 files since
>>> part of it has lots of magic numbers which are not handled by the
>>> conversion scripts.
>>> ...
>>> The patch is also attached.
>> 
>> The attachment was larger than intended.  It had my complete patch set
>> for catrig*.c.
>
> Will there be another complete patch set tomorrow, or did you just send it 
> today?

I sent it all and won't change much more for a while.  I might describe it
more tomorrow.  Already made a small change: always use float for `tiny'
(it is now only used in raise_inexact), and in raise_inexact assign
(1 + tiny) to volatile float instead of volatile int.

Bruce

From owner-freebsd-numerics@FreeBSD.ORG  Sat Sep 22 05:26:26 2012
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 3FBE2106566B
	for <freebsd-numerics@FreeBSD.org>;
	Sat, 22 Sep 2012 05:26:26 +0000 (UTC)
	(envelope-from stephen@missouri.edu)
Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu
	[128.206.184.213])
	by mx1.freebsd.org (Postfix) with ESMTP id DE17D8FC08
	for <freebsd-numerics@FreeBSD.org>;
	Sat, 22 Sep 2012 05:26:25 +0000 (UTC)
Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213])
	by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id
	q8M5QIgb058997; Sat, 22 Sep 2012 00:26:18 -0500 (CDT)
	(envelope-from stephen@missouri.edu)
Message-ID: <505D4BFA.5050401@missouri.edu>
Date: Sat, 22 Sep 2012 00:26:18 -0500
From: Stephen Montgomery-Smith <stephen@missouri.edu>
User-Agent: Mozilla/5.0 (X11; Linux i686;
	rv:15.0) Gecko/20120827 Thunderbird/15.0
MIME-Version: 1.0
To: Bruce Evans <brde@optusnet.com.au>
References: <5017111E.6060003@missouri.edu> <5054C200.7090307@missouri.edu>
	<20120916041132.D6344@besplex.bde.org>
	<50553424.2080902@missouri.edu>
	<20120916134730.Y957@besplex.bde.org>
	<5055ECA8.2080008@missouri.edu>
	<20120917022614.R2943@besplex.bde.org>
	<50562213.9020400@missouri.edu>
	<20120917060116.G3825@besplex.bde.org>
	<50563C57.60806@missouri.edu>
	<20120918012459.V5094@besplex.bde.org>
	<5057A932.3000603@missouri.edu> <5057F24B.7020605@missouri.edu>
	<20120918162105.U991@besplex.bde.org>
	<20120918232850.N2144@besplex.bde.org>
	<20120919010613.T2493@besplex.bde.org>
	<505BD9B4.8020801@missouri.edu>
	<20120921172402.W945@besplex.bde.org>
	<20120921212525.W1732@besplex.bde.org>
	<505C7490.90600@missouri.edu>
	<20120922042112.E3044@besplex.bde.org>
	<505CBF14.70908@missouri.edu> <505CC11A.5030502@missouri.edu>
	<20120922081607.F3613@besplex.bde.org>
	<20120922091625.Y3828@besplex.bde.org>
	<505D1037.8010202@missouri.edu>
	<20120922142349.X4599@besplex.bde.org>
In-Reply-To: <20120922142349.X4599@besplex.bde.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-numerics@FreeBSD.org
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
	<freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 22 Sep 2012 05:26:26 -0000

On 09/21/2012 11:28 PM, Bruce Evans wrote:
> On Fri, 21 Sep 2012, Stephen Montgomery-Smith wrote:
>
>> On 09/21/2012 06:18 PM, Bruce Evans wrote:
>>> On Sat, 22 Sep 2012, Bruce Evans wrote:
>>>
>>>> ...
>>>> Patches tomorrow.  Well, the main new one now, for all 3 files since
>>>> part of it has lots of magic numbers which are not handled by the
>>>> conversion scripts.
>>>> ...
>>>> The patch is also attached.
>>>
>>> The attachment was larger than intended.  It had my complete patch set
>>> for catrig*.c.
>>
>> Will there be another complete patch set tomorrow, or did you just
>> send it today?
>
> I sent it all and won't change much more for a while.  I might describe it
> more tomorrow.

The only change I made was to change atanh to atanhl in catrigl.c, 
seeing that I had written one for myself.


From owner-freebsd-numerics@FreeBSD.ORG  Sat Sep 22 05:41:25 2012
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 76156106566C
	for <freebsd-numerics@freebsd.org>;
	Sat, 22 Sep 2012 05:41:25 +0000 (UTC)
	(envelope-from stephen@missouri.edu)
Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu
	[128.206.184.213])
	by mx1.freebsd.org (Postfix) with ESMTP id 2FF768FC08
	for <freebsd-numerics@freebsd.org>;
	Sat, 22 Sep 2012 05:41:24 +0000 (UTC)
Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213])
	by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id
	q8M5fOnK060485 for <freebsd-numerics@freebsd.org>;
	Sat, 22 Sep 2012 00:41:24 -0500 (CDT)
	(envelope-from stephen@missouri.edu)
Message-ID: <505D4F84.90005@missouri.edu>
Date: Sat, 22 Sep 2012 00:41:24 -0500
From: Stephen Montgomery-Smith <stephen@missouri.edu>
User-Agent: Mozilla/5.0 (X11; Linux i686;
	rv:15.0) Gecko/20120827 Thunderbird/15.0
MIME-Version: 1.0
To: freebsd-numerics@freebsd.org
References: <5017111E.6060003@missouri.edu>
	<20120916041132.D6344@besplex.bde.org>
	<50553424.2080902@missouri.edu>
	<20120916134730.Y957@besplex.bde.org>
	<5055ECA8.2080008@missouri.edu>
	<20120917022614.R2943@besplex.bde.org>
	<50562213.9020400@missouri.edu>
	<20120917060116.G3825@besplex.bde.org>
	<50563C57.60806@missouri.edu>
	<20120918012459.V5094@besplex.bde.org>
	<5057A932.3000603@missouri.edu> <5057F24B.7020605@missouri.edu>
	<20120918162105.U991@besplex.bde.org>
	<20120918232850.N2144@besplex.bde.org>
	<20120919010613.T2493@besplex.bde.org>
	<505BD9B4.8020801@missouri.edu>
	<20120921172402.W945@besplex.bde.org>
	<20120921212525.W1732@besplex.bde.org>
	<505C7490.90600@missouri.edu>
	<20120922042112.E3044@besplex.bde.org>
	<505CBF14.70908@missouri.edu> <505CC11A.5030502@missouri.edu>
	<20120922081607.F3613@besplex.bde.org>
	<20120922091625.Y3828@besplex.bde.org>
	<505D1037.8010202@missouri.edu>
	<20120922142349.X4599@besplex.bde.org>
	<505D4BFA.5050401@missouri.edu>
In-Reply-To: <505D4BFA.5050401@missouri.edu>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
	<freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 22 Sep 2012 05:41:25 -0000

On 09/22/2012 12:26 AM, Stephen Montgomery-Smith wrote:
> On 09/21/2012 11:28 PM, Bruce Evans wrote:
>> On Fri, 21 Sep 2012, Stephen Montgomery-Smith wrote:
>>
>>> On 09/21/2012 06:18 PM, Bruce Evans wrote:
>>>> On Sat, 22 Sep 2012, Bruce Evans wrote:
>>>>
>>>>> ...
>>>>> Patches tomorrow.  Well, the main new one now, for all 3 files since
>>>>> part of it has lots of magic numbers which are not handled by the
>>>>> conversion scripts.
>>>>> ...
>>>>> The patch is also attached.
>>>>
>>>> The attachment was larger than intended.  It had my complete patch set
>>>> for catrig*.c.
>>>
>>> Will there be another complete patch set tomorrow, or did you just
>>> send it today?
>>
>> I sent it all and won't change much more for a while.  I might
>> describe it
>> more tomorrow.
>
> The only change I made was to change atanh to atanhl in catrigl.c,
> seeing that I had written one for myself.

I am finding some errors with catrigl.c in real_part_reciprocal.  I 
don't know how SET_LDBL_EXPSIGN is meant to work.  But I needed to add 
the extra statement:

+        scale = 1;
         SET_LDBL_EXPSIGN(scale, 0x7fff - ix);
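
A rough sketch of why the extra statement matters, assuming the x86
80-bit long double layout, where the integer bit of the significand is
stored explicitly and is not part of the sign+exponent field (as the
reply to this message explains).  The helper pow2_from_biased_exp() is
hypothetical and uses memcpy in place of SET_LDBL_EXPSIGN; it is not
code from the patch.

```c
#include <float.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: return 2**(biased - 16383) by overwriting the
 * sign+exponent field of a long double.  biased is assumed to be in the
 * normal range 1..0x7ffe.
 */
static long double
pow2_from_biased_exp(uint16_t biased)
{
	/* Start from 1 so the explicit integer bit (bit 63) is already set. */
	long double scale = 1.0L;

#if LDBL_MANT_DIG == 64 && (defined(__i386__) || defined(__x86_64__))
	/*
	 * Assumed x86 ld80 layout: bytes 0-7 hold the significand and
	 * bytes 8-9 hold sign+exponent.  Without the initialization above,
	 * the significand would be whatever was there before (zero gives a
	 * pseudo-zero instead of a power of 2).
	 */
	memcpy((unsigned char *)&scale + 8, &biased, sizeof(biased));
#else
	/* Fallback for other formats: repeated exact scaling by 2. */
	int e = (int)biased - 16383;

	for (; e > 0; e--)
		scale *= 2;
	for (; e < 0; e++)
		scale /= 2;
#endif
	return (scale);
}
```

For example, pow2_from_biased_exp(0x3fff + 3) should give 8, since only
the exponent changes and the significand stays at 1.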


From owner-freebsd-numerics@FreeBSD.ORG  Sat Sep 22 18:05:47 2012
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 18F9F106566B
	for <freebsd-numerics@freebsd.org>;
	Sat, 22 Sep 2012 18:05:47 +0000 (UTC)
	(envelope-from brde@optusnet.com.au)
Received: from mail26.syd.optusnet.com.au (mail26.syd.optusnet.com.au
	[211.29.133.167])
	by mx1.freebsd.org (Postfix) with ESMTP id 9BD178FC08
	for <freebsd-numerics@freebsd.org>;
	Sat, 22 Sep 2012 18:05:45 +0000 (UTC)
Received: from c122-106-157-84.carlnfd1.nsw.optusnet.com.au
	(c122-106-157-84.carlnfd1.nsw.optusnet.com.au [122.106.157.84])
	by mail26.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id
	q8MI5bMd009676
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Sun, 23 Sep 2012 04:05:38 +1000
Date: Sun, 23 Sep 2012 04:05:37 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Stephen Montgomery-Smith <stephen@missouri.edu>
In-Reply-To: <505D4F84.90005@missouri.edu>
Message-ID: <20120923030719.E1209@besplex.bde.org>
References: <5017111E.6060003@missouri.edu> <50553424.2080902@missouri.edu>
	<20120916134730.Y957@besplex.bde.org> <5055ECA8.2080008@missouri.edu>
	<20120917022614.R2943@besplex.bde.org> <50562213.9020400@missouri.edu>
	<20120917060116.G3825@besplex.bde.org> <50563C57.60806@missouri.edu>
	<20120918012459.V5094@besplex.bde.org> <5057A932.3000603@missouri.edu>
	<5057F24B.7020605@missouri.edu> <20120918162105.U991@besplex.bde.org>
	<20120918232850.N2144@besplex.bde.org>
	<20120919010613.T2493@besplex.bde.org>
	<505BD9B4.8020801@missouri.edu> <20120921172402.W945@besplex.bde.org>
	<20120921212525.W1732@besplex.bde.org> <505C7490.90600@missouri.edu>
	<20120922042112.E3044@besplex.bde.org> <505CBF14.70908@missouri.edu>
	<505CC11A.5030502@missouri.edu> <20120922081607.F3613@besplex.bde.org>
	<20120922091625.Y3828@besplex.bde.org>
	<505D1037.8010202@missouri.edu>
	<20120922142349.X4599@besplex.bde.org> <505D4BFA.5050401@missouri.edu>
	<505D4F84.90005@missouri.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-numerics@freebsd.org
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
	<freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 22 Sep 2012 18:05:47 -0000

On Sat, 22 Sep 2012, Stephen Montgomery-Smith wrote:

> I am finding some errors with catrigl.c in real_part_reciprocal.  I don't 
> know how SET_LDBL_EXPSIGN is meant to work.  But I needed to add the extra 
> statement:
>
> +        scale = 1;
>        SET_LDBL_EXPSIGN(scale, 0x7fff - ix);

Good fix.  I forgot that the normalization bit is not part of the exponent
for ld80.  So setting only the exponent bits gives a pseudo-zero (zero
normalized mantissa and nonzero exponent).  I think pseudo-zeros are
treated as zeros on i387.

Your fix works by setting the normalization bit.  On i387, scale = 1
gives some exponent and sign that won't be used, and a mantissa
of 0x8000000000000000ULL.  SET_LDBL_EXPSIGN() keeps this mantissa
and overrides the exponent and sign to (0, whatever).

I don't understand why my tests didn't discover this bug.  They only
cover the exponent range of doubles, but that is plenty to reach the
buggy code.

In logl() I spent a lot of time optimizing settings of long doubles
as bits, and ended up using just SET_LDBL_EXPSIGN() to modify a
normal value that didn't need special setting.  Alternative algorithms
that created a special normal value first or set all the mantissa bits
as bits were slower.  The access macros for setting the mantissa bits
weren't even committed.  Many long double functions use direct bit-field
accesses instead.  This is unportable and tends to be slower.

Here is the method used in ld80/s_expl.c for setting 2**k:

@ 	/* Prepare scale factors. */
@ 	v.xbits.man = 1ULL << 63;

This is the non-implicit normalization bit for ld80.  ld128 has implicit
normalization so it uses 0 here.  The macros in _fpmath.h for handling
the normalization bit are poor, and the normalization is known for ld80,
so this just hard-codes the value.

scale = 0 or scale = 1 here tends to be slower, since it asks to set the
sign and exponent bits too.  I used it in catrig to reduce unportabilities
(there is only the expsign access, and there is a macro for that).
Compilers may be able to optimize away the extra setting of the sign
and exponent bits by noticing that they will be overwritten soon, and
when they don't it turns out that setting things twice is often the best
method for confusing compilers into generating optimal memory accesses,
since optimal often doesn't equal least number.

@ 	if (k >= LDBL_MIN_EXP) {
@ 		v.xbits.expsign = BIAS + k;
@ 		twopk = v.e;
@ 	} else {
@ 		v.xbits.expsign = BIAS + k + 10000;
@ 		twopkp10000 = v.e;
@ 	}

This has complications to avoid setting unrepresentable exponent bits for
infinities and denormals.  In catrig, these complications are not needed
at runtime (the original exponent is large so negating it doesn't ask for
an infinity; negating it might ask for a denormal so 1 is added to
the negation of it to produce the new exponent, and this cannot ask for
an infinity either).
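
The ld80 construction quoted above can be sketched without bit-fields by
copying the two fields into place with memcpy().  This is only an
illustration, not the code in s_expl.c: the helper name ld80_pow2 and the
LD80_BIAS macro are made up here, and the byte layout (64-bit mantissa in
bytes 0..7, sign and exponent in bytes 8..9) is assumed to be the x87 one,
so the helper reports failure on any other long double format.

```c
#include <float.h>
#include <stdint.h>
#include <string.h>

#define LD80_BIAS 16383	/* exponent bias of the x87 extended format */

/*
 * Hypothetical helper: store 2**k in *out by writing the ld80 fields
 * directly.  Returns 1 on success, 0 if long double is not the x87
 * 80-bit format here or if 2**k would not be a normal number.
 */
int
ld80_pow2(int k, long double *out)
{
#if LDBL_MANT_DIG == 64 && (defined(__i386__) || defined(__x86_64__))
	unsigned char buf[sizeof(long double)];
	uint64_t man = 1ULL << 63;	/* explicit normalization bit */
	uint16_t expsign;

	if (k < -(LD80_BIAS - 1) || k > LD80_BIAS)
		return (0);
	expsign = (uint16_t)(LD80_BIAS + k);	/* sign bit is 0 */
	memset(buf, 0, sizeof(buf));		/* clear padding bytes too */
	memcpy(buf, &man, sizeof(man));		/* bytes 0..7 */
	memcpy(buf + 8, &expsign, sizeof(expsign));	/* bytes 8..9 */
	memcpy(out, buf, sizeof(buf));
	return (1);
#else
	(void)k;
	(void)out;
	return (0);
#endif
}
```

Leaving man at 0 instead of 1ULL << 63 would reproduce the pseudo-zero
described earlier, which is why the mantissa is written explicitly.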

I don't like the direct bit-field accesses in the above although I wrote
them.  Efficiency tests show that these particular bit-field accesses
are optimized well enough on amd64 and i386.

Bruce

From owner-freebsd-numerics@FreeBSD.ORG  Sat Sep 22 20:09:13 2012
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 2538E106564A
	for <freebsd-numerics@FreeBSD.org>;
	Sat, 22 Sep 2012 20:09:13 +0000 (UTC)
	(envelope-from brde@optusnet.com.au)
Received: from mail08.syd.optusnet.com.au (mail08.syd.optusnet.com.au
	[211.29.132.189])
	by mx1.freebsd.org (Postfix) with ESMTP id 802C08FC0A
	for <freebsd-numerics@FreeBSD.org>;
	Sat, 22 Sep 2012 20:09:11 +0000 (UTC)
Received: from c122-106-157-84.carlnfd1.nsw.optusnet.com.au
	(c122-106-157-84.carlnfd1.nsw.optusnet.com.au [122.106.157.84])
	by mail08.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id
	q8MK92hu011974
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Sun, 23 Sep 2012 06:09:04 +1000
Date: Sun, 23 Sep 2012 06:09:02 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Bruce Evans <brde@optusnet.com.au>
In-Reply-To: <20120922142349.X4599@besplex.bde.org>
Message-ID: <20120923044814.S1465@besplex.bde.org>
References: <5017111E.6060003@missouri.edu>
	<20120916041132.D6344@besplex.bde.org>
	<50553424.2080902@missouri.edu> <20120916134730.Y957@besplex.bde.org>
	<5055ECA8.2080008@missouri.edu> <20120917022614.R2943@besplex.bde.org>
	<50562213.9020400@missouri.edu> <20120917060116.G3825@besplex.bde.org>
	<50563C57.60806@missouri.edu> <20120918012459.V5094@besplex.bde.org>
	<5057A932.3000603@missouri.edu> <5057F24B.7020605@missouri.edu>
	<20120918162105.U991@besplex.bde.org>
	<20120918232850.N2144@besplex.bde.org>
	<20120919010613.T2493@besplex.bde.org> <505BD9B4.8020801@missouri.edu>
	<20120921172402.W945@besplex.bde.org>
	<20120921212525.W1732@besplex.bde.org>
	<505C7490.90600@missouri.edu> <20120922042112.E3044@besplex.bde.org>
	<505CBF14.70908@missouri.edu> <505CC11A.5030502@missouri.edu>
	<20120922081607.F3613@besplex.bde.org>
	<20120922091625.Y3828@besplex.bde.org> <505D1037.8010202@missouri.edu>
	<20120922142349.X4599@besplex.bde.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: Stephen Montgomery-Smith <stephen@missouri.edu>,
	freebsd-numerics@FreeBSD.org
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
	<freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 22 Sep 2012 20:09:13 -0000

On Sat, 22 Sep 2012, Bruce Evans wrote:

> On Fri, 21 Sep 2012, Stephen Montgomery-Smith wrote:
>
>> On 09/21/2012 06:18 PM, Bruce Evans wrote:

>>> ...
>>> The attachment was larger than intended.  It had my complete patch set
>>> for catrig*.c.
>> 
>> Will there be another complete patch set tomorrow, or did you just send it 
>> today?
>
> I sent it all and won't change much more for a while.  I might describe it
> more tomorrow.  Already made a small change: always use float for `tiny'
> (it is now only used in raise_inexact), and in raise_inexact assign
> (1 + tiny) to volatile float instead of volatile int.

Just 1 detail in the old patch needs more description.  First a new
patch to finish merging recent changes:

% diff -u2 catrig.c~ catrig.c
% --- catrig.c~	2012-09-22 04:49:51.000000000 +0000
% +++ catrig.c	2012-09-22 18:41:34.779454000 +0000
% @@ -35,5 +35,5 @@
%  #undef isnan
%  #define isnan(x)	((x) != (x))
% -#define	raise_inexact()	do { volatile int junk = 1 + tiny; } while(0)
% +#define	raise_inexact()	do { volatile float junk = 1 + tiny; } while(0)
%  #undef signbit
%  #define signbit(x)	(__builtin_signbit(x))

No reason to convert it to int.  (Not quite similarly for the (int)(1 + tiny).
(float)(1 + tiny) == 1 would have failed due to compiler bugfeatures unless
tiny is double_t or larger, since the bugfeatures elide the cast.  There
would have to have been an assignment to a volatile FP variable, directly
as here or via STRICT_ASSIGN (whose purpose is to avoid going through the
volatile variable when this is unnecessary).  The cast to int is really
needed when we have a small x and want to set inexact iff x != 0.  Perhaps
'if (x != 0) raise_inexact();' is a more efficient way to do that too, as
well as being unobfuscated.)

% @@ -48,12 +48,12 @@
%  m_ln2 =			6.9314718055994531e-1,	/*  0x162e42fefa39ef.0p-53 */
%  /*
% - * We no longer use M_PI_2 or m_pi_2.  In float precision, rounding to
% + * We no longer use M_PI_2 or m_pi_2.  In some precisions (although not
% + * in double precision where this comment is attached), rounding to
%   * nearest of PI/2 happens to round up, but we want rounding down so
%   * that the expressions for approximating PI/2 and (PI/2 - z) work in all
% - * rounding modes.  This is not very important, but it is necessary for
% - * the same quality of implementation that fdlibm had in 1992 and that
% - * real functions mostly still have.  This is known to be broken only in
% - * ld80 acosl() via invtrig.c and in some invalid optimizations in code
% - * under development, and now in all functions in catrigl.c via invtrig.c.
% + * rounding modes.  This is not very important, but the real inverse trig
% + * functions always took great care to do it, and all inverse trig
% + * functions are close working right in all rounding modes for their
% + * other approximations (unlike the non-inverse ones).
%   */
%  pio2_hi =		1.5707963267948966e0,	/*  0x1921fb54442d18.0p-52 */

Tone down this comment a bit.  You might want to remove it.

% @@ -64,6 +64,7 @@
% 
%  static const volatile double
% -pio2_lo =		6.1232339957367659e-17,	/*  0x11a62633145c07.0p-106 */
% -tiny =			0x1p-1000;
% +pio2_lo =		6.1232339957367659e-17;	/*  0x11a62633145c07.0p-106 */
% +static const volatile float
% +tiny =			0x1p-100;
% 
%  static double complex clog_for_large_values(double complex z);

`tiny' is now always float.  It was just wasteful for it to be larger.

% @@ -550,5 +551,5 @@
%  	if (ix <= (BIAS + DBL_MAX_EXP / 2 - CUTOFF) << 20)
%  		return (x/(x*x + y*y));
% -	scale = 0;
% +	scale = 1;
%  	SET_HIGH_WORD(scale, 0x7ff00000 - ix);	/* 2**(1-ilogb(x)) */
%  	x *= scale;

scale = 0 makes no sense for doubles either.  For floats, the mantissa
is part of the high word, so no separate initialization is needed, and
none is used.  I broke the long double case by copying the float code
and not initializing the mantissa bits at all (scale = 0 would have
given pseudo-zero, but uninitialized scale gives almost anything).
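
The double-precision scale factor can be sketched portably as follows.
The helper name pow2_scale is hypothetical, memcpy() stands in for the
GET_HIGH_WORD/SET_HIGH_WORD macros, and the sketch assumes IEEE 754
binary64 doubles and a finite normal argument.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: return 2**(1 - ilogb(x)) for finite normal x by
 * rewriting the high word of a fully initialized double.
 */
double
pow2_scale(double x)
{
	uint64_t bits;
	uint32_t ix;
	double scale;

	memcpy(&bits, &x, sizeof(bits));
	ix = (uint32_t)(bits >> 32) & 0x7ff00000;	/* exponent field only */

	scale = 1;			/* initialize every bit first */
	memcpy(&bits, &scale, sizeof(bits));
	bits = (bits & 0xffffffffULL) |	/* keep the low word */
	    ((uint64_t)(0x7ff00000 - ix) << 32);	/* replace the high word */
	memcpy(&scale, &bits, sizeof(scale));
	return (scale);
}
```

For example, pow2_scale(0x1p600) is 0x1p-599, so multiplying x and y by
it brings a huge x back near 2 before x*x is formed.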

% @@ -618,12 +619,7 @@
%  	}
% 
% -	if (ax == 1 && ay < DBL_EPSILON) {
% -#if 0 /* this only improves accuracy in an already relative accurate case */
% -		if (ay > 2*DBL_MIN)
% -			rx = - log(ay/2) / 2;
% -		else
% -#endif
% -			rx = - (log(ay) - m_ln2) / 2;
% -	} else
% +	if (ax == 1 && ay < DBL_EPSILON)
% +		rx = - (log(ay) - m_ln2) / 2;
% +	else
%  		rx = log1p(4*ax / sum_squares(ax-1, ay)) / 4;
%

I think this can be removed.

I explained the details of this a week or 2 ago.  Here log(ay) is large
compared with m_ln2, so there is an extra error of less than half an
ulp for adding m_ln2.  The error for log(ay) is < 1 ulp, so the total
error is < 1.5 ulps (in practice, < 1.2 ulps).  Since other parts of
catanh() have errors of 2-3 ulps, we shouldn't care about going above
1.2 ulps here.
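
For reference, the shortcut for ax == 1 follows from the general formula
in the last line of the hunk.  A sketch of the algebra, assuming that
sum_squares(a, b) means a**2 + b**2 (the helper is not shown here) and
writing x for ax and y for ay:

```latex
\begin{aligned}
rx &= \tfrac14 \log\Bigl(1 + \frac{4x}{(x-1)^2 + y^2}\Bigr)
    = \tfrac14 \log\Bigl(1 + \frac{4}{y^2}\Bigr)
    \qquad (x = 1) \\
   &\approx \tfrac14 \log\frac{4}{y^2}
    = \tfrac14 \bigl(2\log 2 - 2\log y\bigr)
    = -\tfrac12 \bigl(\log y - \log 2\bigr).
\end{aligned}
```

The approximation drops log(1 + y**2/4), which is about y**2/4 and so
below DBL_EPSILON**2/4 when y < DBL_EPSILON.  That is far below the
rounding errors counted above.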

I now understand catanh() well enough to see how to make its errors < 1
ulp using not much more than clog() needs to do the same things:
- use an extra-precision log() and log1p()
- evaluate |z-1|**2 accurately (already done in clog())
- divide accurately by the accurate |z-1|**2.  I peeked at the Intel
   ia64 math library atanh() and it reminded me that Newton's method
   is good for extra-precision division, and that I already use this
   method in an unfinished naive implementation of gamma().

     (The Intel ia64 math library is insanely complicated, efficient,
     accurate and large.  It takes about 30K of asm code for each of
     atanhf(), atanh() and atanhl(), each with optimizations specialized
     for the precision including a specialized inline log1p).

     (The naive implementation of gamma() uses the functional
     equation to shift the arg to a large one so that the asymptotic
     formula is accurate.  This takes lots of divisions to convert
     the result for the shifted arg to the result for the unshifted
     arg, and each division must be very accurate for the final result
     to be even moderately accurate.  Not a good method, since even
     1 non-extra-precision division is slow.  But I was interested
     in seeing how far this method could be pushed.  It was barely
     good enough for lgammaf() near its first negative zero, when all
     intermediate calculations were done in sesqui-double precision.)

     (The Intel ia64 math library is of course insanely complicated,
     etc., for *gamma*().  Instead of 30K of asm per function, it
     takes 220K for lgammal() and significantly less for lower
     precisions.  It even uses large asm for the wrapper functions
     (pre-C90 support which we axed long ago).  It doesn't do any
     complex functions, at least in the 2005 glibc version.
     Altogether, in the glibc 2005 version, Intel *gamma*.S takes
     630K, which is slightly larger than all of msun/src in FreeBSD,
     and we do some complex functions.)
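
The Newton step mentioned in the list above can be sketched in ordinary
double precision.  This shows only the basic iteration for a reciprocal,
not the extra-precision variant; the helper name newton_recip, the
restriction to 0.5 <= b <= 1 and the linear first guess are assumptions
made for the illustration.

```c
#include <float.h>

/*
 * Hypothetical helper: approximate 1/b for 0.5 <= b <= 1 by Newton's
 * method on f(y) = 1/y - b.  No division by b is performed.
 */
double
newton_recip(double b)
{
	/* Linear first guess; its relative error is at most 1/17. */
	double y = 48.0 / 17.0 - 32.0 / 17.0 * b;
	int i;

	/* Each pass roughly squares the relative error (1 - b*y). */
	for (i = 0; i < 4; i++)
		y = y + y * (1 - b * y);
	return (y);
}
```

An extra-precision version would carry the residual 1 - b*y and the
correction term in more than working precision, which is where the cost
discussed above comes from.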

% diff -u2 catrigf.c~ catrigf.c
% --- catrigf.c~	2012-09-22 04:49:51.000000000 +0000
% +++ catrigf.c	2012-09-22 00:38:55.503733000 +0000
% @@ -45,5 +45,5 @@
%  #undef isnan
%  #define isnan(x)	((x) != (x))
% -#define	raise_inexact()	do { volatile int junk = 1 + tiny; } while(0)
% +#define	raise_inexact()	do { volatile float junk = 1 + tiny; } while(0)
%  #undef signbit
%  #define signbit(x)	(__builtin_signbitf(x))
% diff -u2 catrigl.c~ catrigl.c
% --- catrigl.c~	2012-09-22 05:42:13.000000000 +0000
% +++ catrigl.c	2012-09-22 18:23:27.597349000 +0000
% @@ -46,5 +46,5 @@
%  #undef isnan
%  #define isnan(x)	((x) != (x))
% -#define	raise_inexact()	do { volatile int junk = 1 + tiny; } while(0)
% +#define	raise_inexact()	do { volatile float junk = 1 + tiny; } while(0)
%  #undef signbit
%  #define signbit(x)	(__builtin_signbitl(x)) 
% @@ -78,5 +78,5 @@
%  #endif
% 
% -static const volatile long double
% +static const volatile float
%  tiny =			0x1p-10000L;
%

That's all the new changes.  Now from the old patch:

@ diff -u2 catrig.c~ catrig.c
@ --- catrig.c~	2012-09-21 15:51:00.000000000 +0000
@ +++ catrig.c	2012-09-22 18:41:34.779454000 +0000
@ @@ -577,20 +607,24 @@
@  ...
@  	if (ax == 1)
@  		ry = atan2(2, -ay) / 2;
@ -	else if (ay < FOUR_SQRT_MIN)
@ +	else if (ay < DBL_EPSILON)
@  		ry = atan2(2*ay, (1-ax)*(1+ax)) / 2;
@  	else

You accepted this without comment.  My calculation is that since ax
!= 1, |1-ax*ax| is at least 2*DBL_EPSILON; ay < DBL_EPSILON makes
ay*ay < DBL_EPSILON**2, so it is insignificant.  This threshold might
be off by a small factor.

SQRT_MIN makes some sense as a threshold below which ay*ay would
underflow.  FOUR_SQRT_MIN makes less sense (I think it was just a
nearby handy constant).  Both need the estimate on |1-ax*ax| to
show that a gradually underflowing ay*ay can be dropped since it is
insignificant.
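
The size of the dropped term can be checked numerically.  This is a
sketch under the assumption that the full second argument of atan2()
would be (1-ax)*(1+ax) - ay*ay; the helper name dropped_term_ratio is
made up for the illustration.

```c
#include <float.h>

/*
 * Hypothetical helper: size of ay*ay relative to (1-ax)*(1+ax), i.e.
 * the relative change made by dropping ay*ay.  Requires ax != 1.
 */
double
dropped_term_ratio(double ax, double ay)
{
	double d = (1 - ax) * (1 + ax);
	double r = ay * ay / d;

	return (r < 0 ? -r : r);
}
```

With ax one step above or below 1 and ay = DBL_EPSILON/2 the ratio stays
under DBL_EPSILON.  It approaches DBL_EPSILON as ay approaches the
threshold with ax just below 1, which fits the remark that the threshold
might be off by a small factor.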

I think we would prefer to always evaluate the full |z-1|**2, but can't
do it because we want to avoid spurious underflow.  The complications
in catrig seem to be just as large for avoiding overflow and underflow
as for getting enough accuracy.

I now understand how to make the float case significantly more efficient
than the double case: calculate everything in extra precision and
exponent range, and depend on the extra exponent range preventing
underflow and overflow, so that everything can be simpler and faster.
More accuracy occurs even more automatically.  But this would be too
much work for the unimportant float case.  The double case is more
interesting, but optimizations for it using long double are only
possible on arches that have long doubles larger than doubles, and
are only optimizations on arches that have efficient long doubles.  The
Intel ia64 math library of course has complications to do this.  It
generally uses extra precision in double precision routines, with
algorithms specialized for this, and then has to work harder in long
double precision and use different algorithms since no extra precision
is available.
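
As a small illustration of the float idea, here is a sketch of
x/(x*x + y*y) for float arguments evaluated in double.  The helper name
recip_real_part_f is hypothetical and this is not code from catrigf.c.
It assumes double has the usual wider exponent range, so that the square
of any finite float neither overflows nor underflows.

```c
/*
 * Hypothetical helper: x / (x*x + y*y) for float inputs, evaluated in
 * double so that no scaling or threshold tests are needed.  At least
 * one of x and y must be nonzero.
 */
float
recip_real_part_f(float x, float y)
{
	double dx = x, dy = y;

	return ((float)(dx / (dx * dx + dy * dy)));
}
```

In float arithmetic alone, x*x would overflow for x = 0x1p100f and
underflow to zero for x = 0x1p-100f; in double both squares are exact.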

Bruce

From owner-freebsd-numerics@FreeBSD.ORG  Sat Sep 22 20:54:15 2012
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 0ADC6106564A
	for <freebsd-numerics@FreeBSD.org>;
	Sat, 22 Sep 2012 20:54:15 +0000 (UTC)
	(envelope-from stephen@missouri.edu)
Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu
	[128.206.184.213])
	by mx1.freebsd.org (Postfix) with ESMTP id B82398FC08
	for <freebsd-numerics@FreeBSD.org>;
	Sat, 22 Sep 2012 20:54:14 +0000 (UTC)
Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213])
	by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id
	q8MKsDLf047053; Sat, 22 Sep 2012 15:54:13 -0500 (CDT)
	(envelope-from stephen@missouri.edu)
Message-ID: <505E2575.6030302@missouri.edu>
Date: Sat, 22 Sep 2012 15:54:13 -0500
From: Stephen Montgomery-Smith <stephen@missouri.edu>
User-Agent: Mozilla/5.0 (X11; Linux i686;
	rv:15.0) Gecko/20120827 Thunderbird/15.0
MIME-Version: 1.0
To: Bruce Evans <brde@optusnet.com.au>
References: <5017111E.6060003@missouri.edu>
	<20120916041132.D6344@besplex.bde.org>
	<50553424.2080902@missouri.edu>
	<20120916134730.Y957@besplex.bde.org>
	<5055ECA8.2080008@missouri.edu>
	<20120917022614.R2943@besplex.bde.org>
	<50562213.9020400@missouri.edu>
	<20120917060116.G3825@besplex.bde.org>
	<50563C57.60806@missouri.edu>
	<20120918012459.V5094@besplex.bde.org>
	<5057A932.3000603@missouri.edu> <5057F24B.7020605@missouri.edu>
	<20120918162105.U991@besplex.bde.org>
	<20120918232850.N2144@besplex.bde.org>
	<20120919010613.T2493@besplex.bde.org>
	<505BD9B4.8020801@missouri.edu>
	<20120921172402.W945@besplex.bde.org>
	<20120921212525.W1732@besplex.bde.org>
	<505C7490.90600@missouri.edu>
	<20120922042112.E3044@besplex.bde.org>
	<505CBF14.70908@missouri.edu> <505CC11A.5030502@missouri.edu>
	<20120922081607.F3613@besplex.bde.org>
	<20120922091625.Y3828@besplex.bde.org>
	<505D1037.8010202@missouri.edu>
	<20120922142349.X4599@besplex.bde.org>
	<20120923044814.S1465@besplex.bde.org>
In-Reply-To: <20120923044814.S1465@besplex.bde.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-numerics@FreeBSD.org
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
	<freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 22 Sep 2012 20:54:15 -0000

On 09/22/2012 03:09 PM, Bruce Evans wrote:

> % +static const volatile float
> %  tiny =            0x1p-10000L;

I assume you meant to also change tiny to 0x1p-100.

> %
>
> That's all the new changes.  Now from the old patch:
>
> @ diff -u2 catrig.c~ catrig.c
> @ --- catrig.c~    2012-09-21 15:51:00.000000000 +0000
> @ +++ catrig.c    2012-09-22 18:41:34.779454000 +0000
> @ @@ -577,20 +607,24 @@
> @  ...
> @      if (ax == 1)
> @          ry = atan2(2, -ay) / 2;
> @ -    else if (ay < FOUR_SQRT_MIN)
> @ +    else if (ay < DBL_EPSILON)
> @          ry = atan2(2*ay, (1-ax)*(1+ax)) / 2;
> @      else
>
> You accepted this without comment.  My calculation is that since ax
> != 1, |1-ax*ax| is at least 2*DBL_EPSILON; ay < DBL_EPSILON makes
> ay*ay < DBL_EPSILON**2, so it is insignificant.  This threshold might
> be off by a small factor.

Yes, I think I wasn't paying attention.  But I agree with you.


From owner-freebsd-numerics@FreeBSD.ORG  Sat Sep 22 21:04:15 2012
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id DAA111065670
	for <freebsd-numerics@freebsd.org>;
	Sat, 22 Sep 2012 21:04:15 +0000 (UTC)
	(envelope-from stephen@missouri.edu)
Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu
	[128.206.184.213])
	by mx1.freebsd.org (Postfix) with ESMTP id 941E68FC08
	for <freebsd-numerics@freebsd.org>;
	Sat, 22 Sep 2012 21:04:15 +0000 (UTC)
Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213])
	by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id
	q8ML4EBG048011 for <freebsd-numerics@freebsd.org>;
	Sat, 22 Sep 2012 16:04:14 -0500 (CDT)
	(envelope-from stephen@missouri.edu)
Message-ID: <505E27CE.3060107@missouri.edu>
Date: Sat, 22 Sep 2012 16:04:14 -0500
From: Stephen Montgomery-Smith <stephen@missouri.edu>
User-Agent: Mozilla/5.0 (X11; Linux i686;
	rv:15.0) Gecko/20120827 Thunderbird/15.0
MIME-Version: 1.0
To: freebsd-numerics@freebsd.org
References: <5017111E.6060003@missouri.edu> <50553424.2080902@missouri.edu>
	<20120916134730.Y957@besplex.bde.org>
	<5055ECA8.2080008@missouri.edu>
	<20120917022614.R2943@besplex.bde.org>
	<50562213.9020400@missouri.edu>
	<20120917060116.G3825@besplex.bde.org>
	<50563C57.60806@missouri.edu>
	<20120918012459.V5094@besplex.bde.org>
	<5057A932.3000603@missouri.edu> <5057F24B.7020605@missouri.edu>
	<20120918162105.U991@besplex.bde.org>
	<20120918232850.N2144@besplex.bde.org>
	<20120919010613.T2493@besplex.bde.org>
	<505BD9B4.8020801@missouri.edu>
	<20120921172402.W945@besplex.bde.org>
	<20120921212525.W1732@besplex.bde.org>
	<505C7490.90600@missouri.edu>
	<20120922042112.E3044@besplex.bde.org>
	<505CBF14.70908@missouri.edu> <505CC11A.5030502@missouri.edu>
	<20120922081607.F3613@besplex.bde.org>
	<20120922091625.Y3828@besplex.bde.org>
	<505D1037.8010202@missouri.edu>
	<20120922142349.X4599@besplex.bde.org>
	<20120923044814.S1465@besplex.bde.org>
	<505E2575.6030302@missouri.edu>
In-Reply-To: <505E2575.6030302@missouri.edu>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
	<freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 22 Sep 2012 21:04:16 -0000

1. Your recent optimizations seem to have given an overall 3% time 
saving in my timing tests.  That's pretty good in my opinion.

2.  In my accuracy tests for casin(h), I have never seen the double or 
long double have an error greater than 4 ULP.  But for the float case I 
have seen 4.15 ULP.

3.  I saw that you have ideas on making catanh have an error less than 1 
ULP.  Just saying that I saw those comments, although I didn't read them 
very carefully.


From owner-freebsd-numerics@FreeBSD.ORG  Sat Sep 22 21:12:41 2012
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 424A7106566C
	for <freebsd-numerics@freebsd.org>;
	Sat, 22 Sep 2012 21:12:41 +0000 (UTC)
	(envelope-from stephen@missouri.edu)
Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu
	[128.206.184.213])
	by mx1.freebsd.org (Postfix) with ESMTP id EFA798FC0C
	for <freebsd-numerics@freebsd.org>;
	Sat, 22 Sep 2012 21:12:40 +0000 (UTC)
Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213])
	by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id
	q8MLCdiF048573 for <freebsd-numerics@freebsd.org>;
	Sat, 22 Sep 2012 16:12:40 -0500 (CDT)
	(envelope-from stephen@missouri.edu)
Message-ID: <505E29C8.6030305@missouri.edu>
Date: Sat, 22 Sep 2012 16:12:40 -0500
From: Stephen Montgomery-Smith <stephen@missouri.edu>
User-Agent: Mozilla/5.0 (X11; Linux i686;
	rv:15.0) Gecko/20120827 Thunderbird/15.0
MIME-Version: 1.0
To: freebsd-numerics@freebsd.org
References: <5017111E.6060003@missouri.edu>
	<20120916134730.Y957@besplex.bde.org>
	<5055ECA8.2080008@missouri.edu>
	<20120917022614.R2943@besplex.bde.org>
	<50562213.9020400@missouri.edu>
	<20120917060116.G3825@besplex.bde.org>
	<50563C57.60806@missouri.edu>
	<20120918012459.V5094@besplex.bde.org>
	<5057A932.3000603@missouri.edu> <5057F24B.7020605@missouri.edu>
	<20120918162105.U991@besplex.bde.org>
	<20120918232850.N2144@besplex.bde.org>
	<20120919010613.T2493@besplex.bde.org>
	<505BD9B4.8020801@missouri.edu>
	<20120921172402.W945@besplex.bde.org>
	<20120921212525.W1732@besplex.bde.org>
	<505C7490.90600@missouri.edu>
	<20120922042112.E3044@besplex.bde.org>
	<505CBF14.70908@missouri.edu> <505CC11A.5030502@missouri.edu>
	<20120922081607.F3613@besplex.bde.org>
	<20120922091625.Y3828@besplex.bde.org>
	<505D1037.8010202@missouri.edu>
	<20120922142349.X4599@besplex.bde.org>
	<20120923044814.S1465@besplex.bde.org>
	<505E2575.6030302@missouri.edu> <505E27CE.3060107@missouri.edu>
In-Reply-To: <505E27CE.3060107@missouri.edu>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
	<freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 22 Sep 2012 21:12:41 -0000

Here is a little cleaning in the code.

	/* To ensure the same accuracy as atan(), and to filter out z = 0. */
	if (x == 0)
		return (cpack(x, atan(y)));

	if (isnan(x) || isnan(y)) {
		/* catanh(+-Inf + I*NaN) = +-0 + I*NaN */
		if (isinf(x))
			return (cpack(copysign(0, x), y+y));
		/* catanh(NaN + I*+-Inf) = sign(NaN)0 + I*+-PI/2 */
		if (isinf(y))
			return (cpack(copysign(0, x), copysign(pio2_hi + pio2_lo, y)));
-		/* catanh(+-0 + I*NaN) = +-0 + I*NaN */
-		if (x == 0)
-			return (cpack(x, y+y));


From owner-freebsd-numerics@FreeBSD.ORG  Sat Sep 22 21:17:44 2012
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 894881065670
	for <freebsd-numerics@freebsd.org>;
	Sat, 22 Sep 2012 21:17:44 +0000 (UTC)
	(envelope-from brde@optusnet.com.au)
Received: from mail05.syd.optusnet.com.au (mail05.syd.optusnet.com.au
	[211.29.132.186])
	by mx1.freebsd.org (Postfix) with ESMTP id F2DE88FC08
	for <freebsd-numerics@freebsd.org>;
	Sat, 22 Sep 2012 21:17:43 +0000 (UTC)
Received: from c122-106-157-84.carlnfd1.nsw.optusnet.com.au
	(c122-106-157-84.carlnfd1.nsw.optusnet.com.au [122.106.157.84])
	by mail05.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id
	q8MLHZjr000737
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Sun, 23 Sep 2012 07:17:36 +1000
Date: Sun, 23 Sep 2012 07:17:35 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Stephen Montgomery-Smith <stephen@missouri.edu>
In-Reply-To: <505E2575.6030302@missouri.edu>
Message-ID: <20120923071717.G1963@besplex.bde.org>
References: <5017111E.6060003@missouri.edu>
	<20120916134730.Y957@besplex.bde.org>
	<5055ECA8.2080008@missouri.edu> <20120917022614.R2943@besplex.bde.org>
	<50562213.9020400@missouri.edu> <20120917060116.G3825@besplex.bde.org>
	<50563C57.60806@missouri.edu> <20120918012459.V5094@besplex.bde.org>
	<5057A932.3000603@missouri.edu> <5057F24B.7020605@missouri.edu>
	<20120918162105.U991@besplex.bde.org>
	<20120918232850.N2144@besplex.bde.org>
	<20120919010613.T2493@besplex.bde.org> <505BD9B4.8020801@missouri.edu>
	<20120921172402.W945@besplex.bde.org>
	<20120921212525.W1732@besplex.bde.org>
	<505C7490.90600@missouri.edu> <20120922042112.E3044@besplex.bde.org>
	<505CBF14.70908@missouri.edu> <505CC11A.5030502@missouri.edu>
	<20120922081607.F3613@besplex.bde.org>
	<20120922091625.Y3828@besplex.bde.org> <505D1037.8010202@missouri.edu>
	<20120922142349.X4599@besplex.bde.org>
	<20120923044814.S1465@besplex.bde.org> <505E2575.6030302@missouri.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-numerics@freebsd.org, Bruce Evans <brde@optusnet.com.au>
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
	<freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 22 Sep 2012 21:17:44 -0000

On Sat, 22 Sep 2012, Stephen Montgomery-Smith wrote:

> On 09/22/2012 03:09 PM, Bruce Evans wrote:
>
>> % +static const volatile float
>> %  tiny =            0x1p-10000L;
>
> I assume you meant to also change tiny to 0x1p-100.

Right.  Oops.

Bruce

From owner-freebsd-numerics@FreeBSD.ORG  Sat Sep 22 21:47:38 2012
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id CC78A106566B
	for <freebsd-numerics@freebsd.org>;
	Sat, 22 Sep 2012 21:47:38 +0000 (UTC)
	(envelope-from brde@optusnet.com.au)
Received: from mail01.syd.optusnet.com.au (mail01.syd.optusnet.com.au
	[211.29.132.182])
	by mx1.freebsd.org (Postfix) with ESMTP id 5BF5F8FC14
	for <freebsd-numerics@freebsd.org>;
	Sat, 22 Sep 2012 21:47:37 +0000 (UTC)
Received: from c122-106-157-84.carlnfd1.nsw.optusnet.com.au
	(c122-106-157-84.carlnfd1.nsw.optusnet.com.au [122.106.157.84])
	by mail01.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id
	q8MLlTQc012562
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Sun, 23 Sep 2012 07:47:30 +1000
Date: Sun, 23 Sep 2012 07:47:29 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Stephen Montgomery-Smith <stephen@missouri.edu>
In-Reply-To: <505E27CE.3060107@missouri.edu>
Message-ID: <20120923073807.K2059@besplex.bde.org>
References: <5017111E.6060003@missouri.edu> <5055ECA8.2080008@missouri.edu>
	<20120917022614.R2943@besplex.bde.org> <50562213.9020400@missouri.edu>
	<20120917060116.G3825@besplex.bde.org> <50563C57.60806@missouri.edu>
	<20120918012459.V5094@besplex.bde.org> <5057A932.3000603@missouri.edu>
	<5057F24B.7020605@missouri.edu> <20120918162105.U991@besplex.bde.org>
	<20120918232850.N2144@besplex.bde.org>
	<20120919010613.T2493@besplex.bde.org>
	<505BD9B4.8020801@missouri.edu> <20120921172402.W945@besplex.bde.org>
	<20120921212525.W1732@besplex.bde.org> <505C7490.90600@missouri.edu>
	<20120922042112.E3044@besplex.bde.org> <505CBF14.70908@missouri.edu>
	<505CC11A.5030502@missouri.edu> <20120922081607.F3613@besplex.bde.org>
	<20120922091625.Y3828@besplex.bde.org>
	<505D1037.8010202@missouri.edu>
	<20120922142349.X4599@besplex.bde.org>
	<20120923044814.S1465@besplex.bde.org>
	<505E2575.6030302@missouri.edu> <505E27CE.3060107@missouri.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-numerics@freebsd.org
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
	<freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 22 Sep 2012 21:47:38 -0000

On Sat, 22 Sep 2012, Stephen Montgomery-Smith wrote:

> 1. Your recent optimizations seem to have given an overall 3% time saving in 
> my timing tests.  That's pretty good in my opinion.

Hopefully more for large and small args :-).

> 2.  In my accuracy tests for casin(h), I have never seen the double or long 
> double have an error greater than 4 ULP.  But for the float case I have seen 
> 4.15 ULP.

I haven't seen any larger than 3.4.  What is the worst case you found?
Errors found for float precision tend to be because the density of bad
cases is higher so it is easier to test more of them accidentally.  I
did do some non-random testing for all float cases in narrow strips
about x or y = 0 or 1, but not for all combinations of this with all
functions.
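
As an illustration of the non-random testing mentioned above (a sketch
only, not the test program actually used): every float in a narrow strip
can be visited by stepping through consecutive bit patterns.  The helper
names float_bits, bits_float and sweep_floats are hypothetical, and the
sketch assumes IEEE 754 single precision and 0 <= lo <= hi with both
ends finite.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a float's bits as an integer (memcpy avoids aliasing issues). */
static uint32_t
float_bits(float f)
{
	uint32_t u;

	memcpy(&u, &f, sizeof(u));
	return (u);
}

/* Inverse of float_bits(). */
static float
bits_float(uint32_t u)
{
	float f;

	memcpy(&f, &u, sizeof(f));
	return (f);
}

/*
 * Call fn on every float in [lo, hi], assuming 0 <= lo <= hi and both
 * finite.  For non-negative IEEE 754 floats, consecutive bit patterns
 * are consecutive values, so stepping the integer visits each float
 * exactly once.  Returns the number of values visited.
 */
static uint64_t
sweep_floats(float lo, float hi, void (*fn)(float, void *), void *arg)
{
	uint32_t u, last;
	uint64_t n = 0;

	last = float_bits(hi);
	for (u = float_bits(lo); u <= last; u++) {
		if (fn != NULL)
			fn(bits_float(u), arg);
		n++;
	}
	return (n);
}
```

For example, sweep_floats(1.0f, 1.0f + 0x1p-20f, fn, arg) visits the 9
floats from 1 to 1 + 2^-20 inclusive; a callback would evaluate the
function under test at each value and record the worst error.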

Bruce

From owner-freebsd-numerics@FreeBSD.ORG  Sat Sep 22 22:25:43 2012
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id AA6781065673
	for <freebsd-numerics@freebsd.org>;
	Sat, 22 Sep 2012 22:25:43 +0000 (UTC)
	(envelope-from stephen@missouri.edu)
Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu
	[128.206.184.213])
	by mx1.freebsd.org (Postfix) with ESMTP id 64DE78FC17
	for <freebsd-numerics@freebsd.org>;
	Sat, 22 Sep 2012 22:25:42 +0000 (UTC)
Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213])
	by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id
	q8MMPf0f026359; Sat, 22 Sep 2012 17:25:41 -0500 (CDT)
	(envelope-from stephen@missouri.edu)
Message-ID: <505E3AE6.2010006@missouri.edu>
Date: Sat, 22 Sep 2012 17:25:42 -0500
From: Stephen Montgomery-Smith <stephen@missouri.edu>
User-Agent: Mozilla/5.0 (X11; Linux i686;
	rv:15.0) Gecko/20120827 Thunderbird/15.0
MIME-Version: 1.0
To: Bruce Evans <brde@optusnet.com.au>
References: <5017111E.6060003@missouri.edu> <5055ECA8.2080008@missouri.edu>
	<20120917022614.R2943@besplex.bde.org>
	<50562213.9020400@missouri.edu>
	<20120917060116.G3825@besplex.bde.org>
	<50563C57.60806@missouri.edu>
	<20120918012459.V5094@besplex.bde.org>
	<5057A932.3000603@missouri.edu> <5057F24B.7020605@missouri.edu>
	<20120918162105.U991@besplex.bde.org>
	<20120918232850.N2144@besplex.bde.org>
	<20120919010613.T2493@besplex.bde.org>
	<505BD9B4.8020801@missouri.edu>
	<20120921172402.W945@besplex.bde.org>
	<20120921212525.W1732@besplex.bde.org>
	<505C7490.90600@missouri.edu>
	<20120922042112.E3044@besplex.bde.org>
	<505CBF14.70908@missouri.edu> <505CC11A.5030502@missouri.edu>
	<20120922081607.F3613@besplex.bde.org>
	<20120922091625.Y3828@besplex.bde.org>
	<505D1037.8010202@missouri.edu>
	<20120922142349.X4599@besplex.bde.org>
	<20120923044814.S1465@besplex.bde.org>
	<505E2575.6030302@missouri.edu> <505E27CE.3060107@missouri.edu>
	<20120923073807.K2059@besplex.bde.org>
In-Reply-To: <20120923073807.K2059@besplex.bde.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-numerics@freebsd.org
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
	<freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 22 Sep 2012 22:25:43 -0000

On 09/22/2012 04:47 PM, Bruce Evans wrote:
> On Sat, 22 Sep 2012, Stephen Montgomery-Smith wrote:
>
>> 2.  In my accuracy tests for casin(h), I have never seen the double or
>> long double have an error greater than 4 ULP.  But for the float case
>> I have seen 4.15 ULP.
>
> I haven't seen any larger than 3.4.  What is the worst case you found?
> Errors found for float precision tend to be because the density of bad
> cases is higher so it is easier to test more of them accidentally.  I
> did do some non-random testing for all float cases in narrow strips
> about x or y = 0 or 1, but not for all combinations of this with all
> functions.

Here are some examples for float.  In all these outputs:
The first entry is the "count".
The second entry is the function.
The third and fourth entries are the real and imaginary parts of the
error in ULP.
The fifth and sixth entries are the real and imaginary parts of the input.
The seventh and eighth entries are the real and imaginary parts of the
float answer, and the ninth and tenth entries are the real and imaginary
parts of the double answer (printed to few enough decimal places that
you cannot tell them apart).

2365614 acos 3.75621 0.86681 1.0338860750198364258 -0.090228326618671417236 0.246582 0.361712 0.246582 0.361712
3087248 acos 3.56538 0.1165 2.3730618953704833984 0.26976472139358520508 0.124496 -1.51821 0.124496 -1.51821
5973027 asinh 3.61544 0.513 0.10977014899253845215 0.48254761099815368652 0.124712 0.499309 0.124712 0.499309
6558511 acosh 3.57286 0.419525 -0.29658588767051696777 -0.11975207924842834473 0.124975 -1.8695 0.124975 -1.8695
9998127 acos 3.51324 1.09793 1.0892471075057983398 -0.12541522085666656494 0.247452 0.491951 0.247452 0.491951
14879751 asinh 3.5643 1.83067 -0.11303693056106567383 0.4351412653923034668 -0.124994 0.446448 -0.124994 0.446448
19510082 asin 3.61922 0.0103899 0.46096378564834594727 -0.01612871512770652771 0.478995 -0.0181731 0.478995 -0.0181731

I can send more examples on request.  I'm not seeing a real pattern here.
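
For reference, the ULP errors in the third and fourth entries can be
measured roughly as follows.  This is a sketch, not the actual test
code: ulp_error_f is a hypothetical helper, it treats the double result
as the reference, and it assumes the reference is finite, nonzero and
within the normal float range.  It would be applied separately to the
real and imaginary parts.

```c
#include <assert.h>
#include <float.h>
#include <math.h>

/*
 * Error of a float result, measured in units of the float ULP at the
 * reference value.  Hypothetical helper; assumes ref is finite, nonzero
 * and in the normal float range.
 */
static double
ulp_error_f(float computed, double ref)
{
	int e;

	assert(ref != 0.0);
	/* frexp() gives |ref| = m * 2^e with 0.5 <= m < 1. */
	(void)frexp(ref, &e);
	/* Floats in [2^(e-1), 2^e) are spaced 2^(e - FLT_MANT_DIG) apart. */
	return (fabs((double)computed - ref) / ldexp(1.0, e - FLT_MANT_DIG));
}
```

With this definition, a float result of 1 + FLT_EPSILON against a
reference of exactly 1 is reported as an error of 1 ULP.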

From owner-freebsd-numerics@FreeBSD.ORG  Sat Sep 22 23:46:39 2012
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 81FA4106566B
	for <freebsd-numerics@freebsd.org>;
	Sat, 22 Sep 2012 23:46:39 +0000 (UTC)
	(envelope-from m.e.sanliturk@gmail.com)
Received: from mail-oa0-f54.google.com (mail-oa0-f54.google.com
	[209.85.219.54])
	by mx1.freebsd.org (Postfix) with ESMTP id 411F88FC08
	for <freebsd-numerics@freebsd.org>;
	Sat, 22 Sep 2012 23:46:39 +0000 (UTC)
Received: by oagm1 with SMTP id m1so5660913oag.13
	for <freebsd-numerics@freebsd.org>;
	Sat, 22 Sep 2012 16:46:38 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
	h=mime-version:date:message-id:subject:from:to:content-type;
	bh=q/YQ5GthtkHc/xsyR9m2stOY8y5DyQZ/TZMBQ9Dl7/A=;
	b=HaJN94nG3lGUavCLGK0blD+udHyMfuTMBj7IFletfZN8K6qhXIiCXMLJw97BtJSdvW
	NpEhviS6sCP25lNgziNq3xCOj0TSkFCHDEb8aGxS6EswoBT8sdCT2bnZxxnKlfUcQrqJ
	P4KMBsmaN0aeWdj+0WdROyIgFXPcRwmA73rDRRYyejsCjrlEow9Ilv5XOswJy/DZxvoC
	nzC/priGtuS/5Onj2txEvbCPgk3+Tj0dvHQNzd1ZGCx/KlujEVX+WnJhdBcm7jDLND8G
	zJ8yBbjLYEFtFhCBRTyM95tdywo0Whq6uaNoTIgZ2eYn7KO0P32bKeGbwsxqZN0AnyCU
	U6Xw==
MIME-Version: 1.0
Received: by 10.182.76.194 with SMTP id m2mr6921147obw.27.1348357598258; Sat,
	22 Sep 2012 16:46:38 -0700 (PDT)
Received: by 10.182.141.66 with HTTP; Sat, 22 Sep 2012 16:46:38 -0700 (PDT)
Date: Sat, 22 Sep 2012 16:46:38 -0700
Message-ID: <CAOgwaMvw-u0yHs2RxBdGV+41knhvrzafO9umuswwBbRK+h3ahQ@mail.gmail.com>
From: Mehmet Erol Sanliturk <m.e.sanliturk@gmail.com>
To: freebsd-numerics@freebsd.org
Content-Type: text/plain; charset=UTF-8
X-Content-Filtered-By: Mailman/MimeDel 2.1.5
Subject: Book names about Computer Approximations
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
	<freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 22 Sep 2012 23:46:39 -0000

Dear All,

I want to buy some books about computer approximations to functions
(elementary functions, distributions, etc.), like the following books:


http://www.amazon.com/Approximations-Digital-Computers-Cecil-Hastings/dp/B000Q5GBG6/ref=sr_1_1?s=books&ie=UTF8&qid=1348357087&sr=1-1
http://www.amazon.com/Computer-Approximations-John-Fraser-Hart/dp/0882756427
http://www.amazon.com/Elementary-Functions-Implementation-Jean-Michel-Muller/dp/0817643729/ref=pd_sim_sbs_b_2
http://www.amazon.com/Elementary-Functions-Prentice-Hall-computational-mathematics/dp/0138220646/ref=pd_sim_sbs_b_3


I have searched for "review of computer approximation books", but I could
not find any useful source.

I am not near a library or a bookseller where I could look at sample
copies (such books cannot even be found in Turkey; they have to be
ordered).

If you have time, would you please suggest titles, links, or ISBN
numbers, whichever is convenient for you, that I can look up at the
publishers?  I am especially interested in recently published books whose
contents can be used to develop good-quality procedures.


Thank you very much.

Mehmet Erol Sanliturk