From owner-freebsd-numerics@FreeBSD.ORG  Thu Sep  6 16:32:23 2012
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 7C31810656D4
	for <freebsd-numerics@freebsd.org>;
	Thu,  6 Sep 2012 16:32:23 +0000 (UTC)
	(envelope-from stephen@missouri.edu)
Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu
	[128.206.184.213])
	by mx1.freebsd.org (Postfix) with ESMTP id 1408C8FC15
	for <freebsd-numerics@freebsd.org>;
	Thu,  6 Sep 2012 16:32:22 +0000 (UTC)
Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213])
	by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id
	q86GWBZh069525; Thu, 6 Sep 2012 11:32:14 -0500 (CDT)
	(envelope-from stephen@missouri.edu)
Message-ID: <5048D00B.8010401@missouri.edu>
Date: Thu, 06 Sep 2012 11:32:11 -0500
From: Stephen Montgomery-Smith <stephen@missouri.edu>
User-Agent: Mozilla/5.0 (X11; Linux i686;
	rv:15.0) Gecko/20120827 Thunderbird/15.0
MIME-Version: 1.0
To: Bruce Evans <brde@optusnet.com.au>
References: <5017111E.6060003@missouri.edu> <501C361D.4010807@missouri.edu>
	<20120804165555.X1231@besplex.bde.org>
	<501D51D7.1020101@missouri.edu>
	<20120805030609.R3101@besplex.bde.org>
	<501D9C36.2040207@missouri.edu>
	<20120805175106.X3574@besplex.bde.org>
	<501EC015.3000808@missouri.edu>
	<20120805191954.GA50379@troutmask.apl.washington.edu>
	<20120807205725.GA10572@server.rulingia.com>
	<20120809025220.N4114@besplex.bde.org>
	<5027F07E.9060409@missouri.edu>
	<20120814003614.H3692@besplex.bde.org>
	<50295F5C.6010800@missouri.edu>
	<20120814072946.S5260@besplex.bde.org>
	<50297CA5.5010900@missouri.edu> <50297E43.7090309@missouri.edu>
	<20120814201105.T934@besplex.bde.org>
	<502A780B.2010106@missouri.edu>
	<20120815223631.N1751@besplex.bde.org>
	<502C0CF8.8040003@missouri.edu>
	<20120906221028.O1542@besplex.bde.org>
In-Reply-To: <20120906221028.O1542@besplex.bde.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-numerics@freebsd.org
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
	<freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 06 Sep 2012 16:32:23 -0000

Life is busy, so it will be a few days before I can process most of what 
you have said.  But let me respond to a few of the points.

On 09/06/2012 08:50 AM, Bruce Evans wrote:

> This code in catrig.c seems to be related:
>
> @ /*
> @  * sum_squares(x,y) = x*x + y*y.
> @  * Assumes x*x and y*y will not overflow.
> @  * Assumes x and y are finite.
> @  * Assumes fabs(x) >= DBL_EPSILON.
> @  */
> @ inline static double
> @ sum_squares(double x, double y)
> @ {
> @     /*
> @      * The following code seems to do better than
> @      * return (hypot(x, y) * hypot(x, y));
> @      */
> @ @     if (fabs(y) < MIN_4TH_ROOT)
> @         if ((int)y==0) /* raise inexact */
> @             return (x*x);
> @     return (x*x + y*y);
> @ }
>
> but it isn't really -- it is more like the case of a large difference
> of exponents, so x*x+y*y reduces to x*x (at a smaller exponent difference
> than for hypot(x, y) reducing to |x|).  hypot() sets inexact by returning
> |x|+|y| in that case, but the above avoids using y*y since it would give
> spurious underflow.
>
> Hmm, MIN_4TH_ROOT seems to be logically wrong here, and physically
> wrong for float precision.  In double precision, MIN_4TH_ROOT/DBL_EPSILON
> is ~0.5e-59, but in float precision, MIN_4TH_ROOT/FLT_EPSILON is only
> ~1e-2, so when x = FLT_EPSILON, y = MIN_4TH_ROOT is very far from giving
> a discardable y*y.
>
> The logically correct threshold is something like y/x < 2**-MANT_DIG/2
> or y < 2**-MANT_DIG/2 * EPSILON to avoid the division.  clog*() uses
> essentially y/x < 2**-MANT_DIG for some reason (it actually uses a fuzzy
> check based on exponent differences: kx - ky > MANT_DIG; MANT_DIG/2
> would be a little too small here, but doubling it seems wasteful).
> hypot() uses essentially y/x < 2**-60, where the magic 60 is DBL_MANT_DIG
> plys a safety margin.  hypotf() uses a magic 30.  The non-magic integer
> exponent thresholds work very well in clog*(), but depend on having the
> exponents handy for efficiency, which may be possible via optimization
> for an inline function like the above but takes more source code.

The code is different for catrigf.c for precisely the reason you stated. 
  This is what I was talking about a while back when I said that the 
float code was a little harder than the double and long code.

In the days or weeks to come, I might go over all these kinds of 
conditions over again.  The paper by Hull et al, and the code used by 
boost, is written in such a way that the code for float is the same as 
the code for double.  My code was designed just enough to work, whereas 
they put more thought into it.

> I don't see how the above avoids cancelation problems like the ones in
> clog().  You never subtract 1 from sum_squares(), but there are open-coded
> expressions like '(1-ax)*(1+ax) - ay*ay' to reduce the cancelation error.
> I found that for clog() it was worth using extra precision out to about
> |z| = 4 (or |z| < 1/4) to reduce the cancelation error.  But somehow my
> tests didn't show large errors.  Perhaps because I'm happy with < 4 ulps
> instead of < 1.

Because for clog, when x^2+y^2 is close to one, the real part of clog is 
very close to zero.

Whereas when x^2+y^2 is close to 1, the imaginary part of catanh is 
close to Pi/4.

So even though the absolute error in computing Im(catanh) might be much 
higher than Re(clog), the relative error is comparable.