From owner-freebsd-numerics@FreeBSD.ORG  Wed Sep 12 09:57:01 2012
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id AF6B0106564A
	for <freebsd-numerics@FreeBSD.org>;
	Wed, 12 Sep 2012 09:57:01 +0000 (UTC)
	(envelope-from brde@optusnet.com.au)
Received: from mail06.syd.optusnet.com.au (mail06.syd.optusnet.com.au
	[211.29.132.187])
	by mx1.freebsd.org (Postfix) with ESMTP id 21AB78FC18
	for <freebsd-numerics@FreeBSD.org>;
	Wed, 12 Sep 2012 09:57:00 +0000 (UTC)
Received: from c122-106-171-246.carlnfd1.nsw.optusnet.com.au
	(c122-106-171-246.carlnfd1.nsw.optusnet.com.au [122.106.171.246])
	by mail06.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id
	q8C9upbh024316
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Wed, 12 Sep 2012 19:56:53 +1000
Date: Wed, 12 Sep 2012 19:56:51 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Stephen Montgomery-Smith <stephen@missouri.edu>
In-Reply-To: <504FF726.9060001@missouri.edu>
Message-ID: <20120912191556.F1078@besplex.bde.org>
References: <5017111E.6060003@missouri.edu> <501C361D.4010807@missouri.edu>
	<20120804165555.X1231@besplex.bde.org> <501D51D7.1020101@missouri.edu>
	<20120805030609.R3101@besplex.bde.org> <501D9C36.2040207@missouri.edu>
	<20120805175106.X3574@besplex.bde.org> <501EC015.3000808@missouri.edu>
	<20120805191954.GA50379@troutmask.apl.washington.edu>
	<20120807205725.GA10572@server.rulingia.com>
	<20120809025220.N4114@besplex.bde.org>
	<5027F07E.9060409@missouri.edu> <20120814003614.H3692@besplex.bde.org>
	<50295F5C.6010800@missouri.edu> <20120814072946.S5260@besplex.bde.org>
	<50297CA5.5010900@missouri.edu> <50297E43.7090309@missouri.edu>
	<20120814201105.T934@besplex.bde.org> <502A780B.2010106@missouri.edu>
	<20120815223631.N1751@besplex.bde.org> <502C0CF8.8040003@missouri.edu>
	<20120906221028.O1542@besplex.bde.org> <5048D00B.8010401@missouri.edu>
	<504D3CCD.2050006@missouri.edu> <504FF726.9060001@missouri.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-numerics@FreeBSD.org
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
	<freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
	<mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 12 Sep 2012 09:57:01 -0000

On Tue, 11 Sep 2012, Stephen Montgomery-Smith wrote:

> On 09/09/2012 08:05 PM, Stephen Montgomery-Smith wrote:
>> On 09/06/2012 11:32 AM, Stephen Montgomery-Smith wrote:
>> 
>>> In the days or weeks to come, I might go over all these kinds of
>>> conditions over again.  The paper by Hull et al, and the code used by
>>> boost, is written in such a way that the code for float is the same as
>>> the code for double.  My code was designed just enough to work, whereas
>>> they put more thought into it.
>> 
>> I am doing this right now.  The code I currently have on the web page is
>> non-functional.
>
> 1.  I think I have it fixed now.  My code is much closer to the original 
> intent of the algorithm used by Hull et al.  In particular, the float version 
> is just as easy as the double version.

This is better in theory.  It changes the number of incorrectly rounded
cases a little, probably insignificantly.

> 2.  The long and float version now are created from the double version by 
> simple perl scripts (also put on my web page).

I have similar obfuscations using #define for clog, but don't like them.
Apart from obfuscation, the perl scripts are too simple and give the
following bugs:
- they don't add F suffixes to float constants, so the float version does
   lots of double arithmetic again
- they don't clean up the formatting after removing comments.

> 3.  I used the "return (cpack(x+0.0L+(y+0), x+0.0L+(y+0)))" construction, but 
> I still don't know why.  I added a comment basically saying I did this 
> because Bruce told me to.

Try testing for equality of NaNs in bits and you will soon see why you
don't want the bits to vary randomly.  Of course, the variation isn't
random, but if it is non-null then you need to understand the details of
it to ignore only unimportant differences.

Preliminary testing shows the following regressions relative to me previous
fixed version.  (I got things like cacosh(NaN + I Inf) consistently wrong
by classifying NaNs first, while not classifying NaNs first gets things
like cacosh(NaN + I large) inconsistently wrong.)

float precision:
% 1,6c1,6
% old> rcacos:max_er = 0x4517ee94 2.1592, avg_er = 0.314, #>=1:0.5 = 3786:214820
% old> rcacosh:max_er = 0x3170e232 1.5450, avg_er = 0.253, #>=1:0.5 = 1196:3225128
% old> rcasin:max_er = 0x55adc0df 2.6775, avg_er = 0.208, #>=1:0.5 = 34228:360900
% old> rcasinh:max_er = 0x3170e232 1.5450, avg_er = 0.253, #>=1:0.5 = 1196:3225128
% old> rcatan:max_er = 0x3c4078ec 1.8829, avg_er = 0.284, #>=1:0.5 = 13260:190836
% old> rcatanh:max_er = 0x2cef3171 1.4042, avg_er = 0.167, #>=1:0.5 = 4096:420916
% ---
% new> rcacos:max_er = 0x8000000000000000 17179869184.0000, avg_er = 0.314, #>=1:0.5 = 12032:219512
% new> rcacosh:max_er = 0x3170e232 1.5450, avg_er = 0.250, #>=1:0.5 = 1392:3151932
% new> rcasin:max_er = 0x8000000000000000 17179869184.0000, avg_er = 0.208, #>=1:0.5 = 38784:366480
% new> rcasinh:max_er = 0x8000000000000000 17179869184.0000, avg_er = 0.250, #>=1:0.5 = 5104:3155644
% new> rcatan:max_er = 0x3c4078ec 1.8829, avg_er = 0.287, #>=1:0.5 = 37132:302960
% new> rcatanh:max_er = 0x2cef3171 1.4042, avg_er = 0.166, #>=1:0.5 = 4096:357284

The huge errors for rcacos = realf(cacosf) etc. are sign errors for NaNs
like the following:

% x =                           0x4b000001 8388609
% y =                           0xff800001 nan
% rcacos(x, y) = 0xfff8000020000000 0xffc00001 nan
% rcacosf(x, y) = 0x7ff8000020000000 0x7fc00001 nan
% err = -0x8000000000000000 17179869184.00000

rcacos() correctly preserves the sign of the NaN, but rcacosf() loses it
iff |x| > 1 / FLT_EPSILON.  All of realf(clog_with_large_values()),
realf(clogf()) and hypotf() seem never return a NaN with its sign bit set.
This makes some sense for an real absolute value function but not for a
complex function.  rcacos() loses the sign of the NaN similarly starting
at 1 / DBL_EPSILON.

The huge errors hide the peak error for 3 of the above 6 functions.  For
the other 3, the peak error is unchanged and the number of inncorrectly
rounded cases is changed a little.

% 16,21c16,21
% old> icacos:max_er = 0x3170e232 1.5450, avg_er = 0.253, #>=1:0.5 = 1196:3225128
% old> icacosh:max_er = 0x4517ee94 2.1592, avg_er = 0.314, #>=1:0.5 = 3786:214820
% old> icasin:max_er = 0x3170e232 1.5450, avg_er = 0.253, #>=1:0.5 = 1196:3225128
% old> icasinh:max_er = 0x55adc0df 2.6775, avg_er = 0.208, #>=1:0.5 = 34228:360900
% old> icatan:max_er = 0x2cef3171 1.4042, avg_er = 0.167, #>=1:0.5 = 4096:420916
% old> icatanh:max_er = 0x3c4078ec 1.8829, avg_er = 0.284, #>=1:0.5 = 13260:190836
% ---
% new> icacos:max_er = 0x8000000000000000 17179869184.0000, avg_er = 0.250, #>=1:0.5 = 12528:3163068
% new> icacosh:max_er = 0x4517ee94 2.1592, avg_er = 0.314, #>=1:0.5 = 4608:212088
% new> icasin:max_er = 0x8000000000000000 17179869184.0000, avg_er = 0.250, #>=1:0.5 = 5104:3155644
% new> icasinh:max_er = 0x8000000000000000 17179869184.0000, avg_er = 0.208, #>=1:0.5 = 38784:366480
% new> icatan:max_er = 0x2cef3171 1.4042, avg_er = 0.166, #>=1:0.5 = 4096:357284
% new> icatanh:max_er = 0x3c4078ec 1.8829, avg_er = 0.287, #>=1:0.5 = 37132:302960

Now double precision:

% 31,36c31,36
% old> rcacos:max_er =     0x1b5a 3.4189, avg_er = 0.166, #>=1:0.5 = 2394:118150
% old> rcacosh:max_er =      0xf8a 1.9424, avg_er = 0.258, #>=1:0.5 = 8444:2737804
% old> rcasin:max_er =     0x15c5 2.7212, avg_er = 0.113, #>=1:0.5 = 33308:99188
% old> rcasinh:max_er =      0xf8a 1.9424, avg_er = 0.258, #>=1:0.5 = 8444:2737804
% old> rcatan:max_er =     0x1622 2.7666, avg_er = 0.016, #>=1:0.5 = 4680:81364
% old> rcatanh:max_er =     0x14b9 2.5903, avg_er = 0.047, #>=1:0.5 = 428996:691360
% ---
% new> rcacos:max_er = 0x8000000000000000 4503599627370496.0000, avg_er = 0.168, #>=1:0.5 = 2458:110404
% new> rcacosh:max_er =      0xf8a 1.9424, avg_er = 0.257, #>=1:0.5 = 24192:2683944
% new> rcasin:max_er = 0x8000000000000000 4503599627370496.0000, avg_er = 0.113, #>=1:0.5 = 33886:100306
% new> rcasinh:max_er = 0x8000000000000000 4503599627370496.0000, avg_er = 0.257, #>=1:0.5 = 24214:2683966
% new> rcatan:max_er =     0x1622 2.7666, avg_er = 0.015, #>=1:0.5 = 20580:101564
% new> rcatanh:max_er =     0x14b9 2.5903, avg_er = 0.044, #>=1:0.5 = 403028:652792
% 39,44c39,44
% old> icacos:max_er =      0xf8a 1.9424, avg_er = 0.258, #>=1:0.5 = 8444:2737804
% old> icacosh:max_er =     0x1b5a 3.4189, avg_er = 0.166, #>=1:0.5 = 2394:118150
% old> icasin:max_er =      0xf8a 1.9424, avg_er = 0.258, #>=1:0.5 = 8444:2737804
% old> icasinh:max_er =     0x15c5 2.7212, avg_er = 0.113, #>=1:0.5 = 33308:99188
% old> icatan:max_er =     0x14b9 2.5903, avg_er = 0.047, #>=1:0.5 = 428996:691360
% old> icatanh:max_er =     0x1622 2.7666, avg_er = 0.016, #>=1:0.5 = 4680:81364
% ---
% new> icacos:max_er = 0x8000000000000000 4503599627370496.0000, avg_er = 0.257, #>=1:0.5 = 24258:2684010
% new> icacosh:max_er =     0x1b5a 3.4189, avg_er = 0.168, #>=1:0.5 = 2414:110360
% new> icasin:max_er = 0x8000000000000000 4503599627370496.0000, avg_er = 0.257, #>=1:0.5 = 24214:2683966
% new> icasinh:max_er = 0x8000000000000000 4503599627370496.0000, avg_er = 0.113, #>=1:0.5 = 33886:100306
% new> icatan:max_er =     0x14b9 2.5903, avg_er = 0.044, #>=1:0.5 = 403028:652792
% new> icatanh:max_er =     0x1622 2.7666, avg_er = 0.015, #>=1:0.5 = 20580:101564

Bruce