From owner-freebsd-numerics@FreeBSD.ORG Wed Sep 12 09:57:01 2012 Return-Path: Delivered-To: freebsd-numerics@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id AF6B0106564A for ; Wed, 12 Sep 2012 09:57:01 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail06.syd.optusnet.com.au (mail06.syd.optusnet.com.au [211.29.132.187]) by mx1.freebsd.org (Postfix) with ESMTP id 21AB78FC18 for ; Wed, 12 Sep 2012 09:57:00 +0000 (UTC) Received: from c122-106-171-246.carlnfd1.nsw.optusnet.com.au (c122-106-171-246.carlnfd1.nsw.optusnet.com.au [122.106.171.246]) by mail06.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id q8C9upbh024316 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Wed, 12 Sep 2012 19:56:53 +1000 Date: Wed, 12 Sep 2012 19:56:51 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Stephen Montgomery-Smith In-Reply-To: <504FF726.9060001@missouri.edu> Message-ID: <20120912191556.F1078@besplex.bde.org> References: <5017111E.6060003@missouri.edu> <501C361D.4010807@missouri.edu> <20120804165555.X1231@besplex.bde.org> <501D51D7.1020101@missouri.edu> <20120805030609.R3101@besplex.bde.org> <501D9C36.2040207@missouri.edu> <20120805175106.X3574@besplex.bde.org> <501EC015.3000808@missouri.edu> <20120805191954.GA50379@troutmask.apl.washington.edu> <20120807205725.GA10572@server.rulingia.com> <20120809025220.N4114@besplex.bde.org> <5027F07E.9060409@missouri.edu> <20120814003614.H3692@besplex.bde.org> <50295F5C.6010800@missouri.edu> <20120814072946.S5260@besplex.bde.org> <50297CA5.5010900@missouri.edu> <50297E43.7090309@missouri.edu> <20120814201105.T934@besplex.bde.org> <502A780B.2010106@missouri.edu> <20120815223631.N1751@besplex.bde.org> <502C0CF8.8040003@missouri.edu> <20120906221028.O1542@besplex.bde.org> <5048D00B.8010401@missouri.edu> <504D3CCD.2050006@missouri.edu> <504FF726.9060001@missouri.edu> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-numerics@FreeBSD.org Subject: Re: Complex arg-trig functions X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 12 Sep 2012 09:57:01 -0000 On Tue, 11 Sep 2012, Stephen Montgomery-Smith wrote: > On 09/09/2012 08:05 PM, Stephen Montgomery-Smith wrote: >> On 09/06/2012 11:32 AM, Stephen Montgomery-Smith wrote: >> >>> In the days or weeks to come, I might go over all these kinds of >>> conditions over again. The paper by Hull et al, and the code used by >>> boost, is written in such a way that the code for float is the same as >>> the code for double. My code was designed just enough to work, whereas >>> they put more thought into it. >> >> I am doing this right now. The code I currently have on the web page is >> non-functional. > > 1. I think I have it fixed now. My code is much closer to the original > intent of the algorithm used by Hull et al. In particular, the float version > is just as easy as the double version. This is better in theory. It changes the number of incorrectly rounded cases a little, probably insignificantly. > 2. The long and float version now are created from the double version by > simple perl scripts (also put on my web page). I have similar obfuscations using #define for clog, but don't like them. Apart from obfuscation, the perl scripts are too simple and give the following bugs: - they don't add F suffixes to float constants, so the float version does lots of double arithmetic again - they don't clean up the formatting after removing comments. > 3. I used the "return (cpack(x+0.0L+(y+0), x+0.0L+(y+0)))" construction, but > I still don't know why. I added a comment basically saying I did this > because Bruce told me to. Try testing for equality of NaNs in bits and you will soon see why you don't want the bits to vary randomly. Of course, the variation isn't random, but if it is non-null then you need to understand the details of it to ignore only unimportant differences. Preliminary testing shows the following regressions relative to me previous fixed version. (I got things like cacosh(NaN + I Inf) consistently wrong by classifying NaNs first, while not classifying NaNs first gets things like cacosh(NaN + I large) inconsistently wrong.) float precision: % 1,6c1,6 % old> rcacos:max_er = 0x4517ee94 2.1592, avg_er = 0.314, #>=1:0.5 = 3786:214820 % old> rcacosh:max_er = 0x3170e232 1.5450, avg_er = 0.253, #>=1:0.5 = 1196:3225128 % old> rcasin:max_er = 0x55adc0df 2.6775, avg_er = 0.208, #>=1:0.5 = 34228:360900 % old> rcasinh:max_er = 0x3170e232 1.5450, avg_er = 0.253, #>=1:0.5 = 1196:3225128 % old> rcatan:max_er = 0x3c4078ec 1.8829, avg_er = 0.284, #>=1:0.5 = 13260:190836 % old> rcatanh:max_er = 0x2cef3171 1.4042, avg_er = 0.167, #>=1:0.5 = 4096:420916 % --- % new> rcacos:max_er = 0x8000000000000000 17179869184.0000, avg_er = 0.314, #>=1:0.5 = 12032:219512 % new> rcacosh:max_er = 0x3170e232 1.5450, avg_er = 0.250, #>=1:0.5 = 1392:3151932 % new> rcasin:max_er = 0x8000000000000000 17179869184.0000, avg_er = 0.208, #>=1:0.5 = 38784:366480 % new> rcasinh:max_er = 0x8000000000000000 17179869184.0000, avg_er = 0.250, #>=1:0.5 = 5104:3155644 % new> rcatan:max_er = 0x3c4078ec 1.8829, avg_er = 0.287, #>=1:0.5 = 37132:302960 % new> rcatanh:max_er = 0x2cef3171 1.4042, avg_er = 0.166, #>=1:0.5 = 4096:357284 The huge errors for rcacos = realf(cacosf) etc. are sign errors for NaNs like the following: % x = 0x4b000001 8388609 % y = 0xff800001 nan % rcacos(x, y) = 0xfff8000020000000 0xffc00001 nan % rcacosf(x, y) = 0x7ff8000020000000 0x7fc00001 nan % err = -0x8000000000000000 17179869184.00000 rcacos() correctly preserves the sign of the NaN, but rcacosf() loses it iff |x| > 1 / FLT_EPSILON. All of realf(clog_with_large_values()), realf(clogf()) and hypotf() seem never return a NaN with its sign bit set. This makes some sense for an real absolute value function but not for a complex function. rcacos() loses the sign of the NaN similarly starting at 1 / DBL_EPSILON. The huge errors hide the peak error for 3 of the above 6 functions. For the other 3, the peak error is unchanged and the number of inncorrectly rounded cases is changed a little. % 16,21c16,21 % old> icacos:max_er = 0x3170e232 1.5450, avg_er = 0.253, #>=1:0.5 = 1196:3225128 % old> icacosh:max_er = 0x4517ee94 2.1592, avg_er = 0.314, #>=1:0.5 = 3786:214820 % old> icasin:max_er = 0x3170e232 1.5450, avg_er = 0.253, #>=1:0.5 = 1196:3225128 % old> icasinh:max_er = 0x55adc0df 2.6775, avg_er = 0.208, #>=1:0.5 = 34228:360900 % old> icatan:max_er = 0x2cef3171 1.4042, avg_er = 0.167, #>=1:0.5 = 4096:420916 % old> icatanh:max_er = 0x3c4078ec 1.8829, avg_er = 0.284, #>=1:0.5 = 13260:190836 % --- % new> icacos:max_er = 0x8000000000000000 17179869184.0000, avg_er = 0.250, #>=1:0.5 = 12528:3163068 % new> icacosh:max_er = 0x4517ee94 2.1592, avg_er = 0.314, #>=1:0.5 = 4608:212088 % new> icasin:max_er = 0x8000000000000000 17179869184.0000, avg_er = 0.250, #>=1:0.5 = 5104:3155644 % new> icasinh:max_er = 0x8000000000000000 17179869184.0000, avg_er = 0.208, #>=1:0.5 = 38784:366480 % new> icatan:max_er = 0x2cef3171 1.4042, avg_er = 0.166, #>=1:0.5 = 4096:357284 % new> icatanh:max_er = 0x3c4078ec 1.8829, avg_er = 0.287, #>=1:0.5 = 37132:302960 Now double precision: % 31,36c31,36 % old> rcacos:max_er = 0x1b5a 3.4189, avg_er = 0.166, #>=1:0.5 = 2394:118150 % old> rcacosh:max_er = 0xf8a 1.9424, avg_er = 0.258, #>=1:0.5 = 8444:2737804 % old> rcasin:max_er = 0x15c5 2.7212, avg_er = 0.113, #>=1:0.5 = 33308:99188 % old> rcasinh:max_er = 0xf8a 1.9424, avg_er = 0.258, #>=1:0.5 = 8444:2737804 % old> rcatan:max_er = 0x1622 2.7666, avg_er = 0.016, #>=1:0.5 = 4680:81364 % old> rcatanh:max_er = 0x14b9 2.5903, avg_er = 0.047, #>=1:0.5 = 428996:691360 % --- % new> rcacos:max_er = 0x8000000000000000 4503599627370496.0000, avg_er = 0.168, #>=1:0.5 = 2458:110404 % new> rcacosh:max_er = 0xf8a 1.9424, avg_er = 0.257, #>=1:0.5 = 24192:2683944 % new> rcasin:max_er = 0x8000000000000000 4503599627370496.0000, avg_er = 0.113, #>=1:0.5 = 33886:100306 % new> rcasinh:max_er = 0x8000000000000000 4503599627370496.0000, avg_er = 0.257, #>=1:0.5 = 24214:2683966 % new> rcatan:max_er = 0x1622 2.7666, avg_er = 0.015, #>=1:0.5 = 20580:101564 % new> rcatanh:max_er = 0x14b9 2.5903, avg_er = 0.044, #>=1:0.5 = 403028:652792 % 39,44c39,44 % old> icacos:max_er = 0xf8a 1.9424, avg_er = 0.258, #>=1:0.5 = 8444:2737804 % old> icacosh:max_er = 0x1b5a 3.4189, avg_er = 0.166, #>=1:0.5 = 2394:118150 % old> icasin:max_er = 0xf8a 1.9424, avg_er = 0.258, #>=1:0.5 = 8444:2737804 % old> icasinh:max_er = 0x15c5 2.7212, avg_er = 0.113, #>=1:0.5 = 33308:99188 % old> icatan:max_er = 0x14b9 2.5903, avg_er = 0.047, #>=1:0.5 = 428996:691360 % old> icatanh:max_er = 0x1622 2.7666, avg_er = 0.016, #>=1:0.5 = 4680:81364 % --- % new> icacos:max_er = 0x8000000000000000 4503599627370496.0000, avg_er = 0.257, #>=1:0.5 = 24258:2684010 % new> icacosh:max_er = 0x1b5a 3.4189, avg_er = 0.168, #>=1:0.5 = 2414:110360 % new> icasin:max_er = 0x8000000000000000 4503599627370496.0000, avg_er = 0.257, #>=1:0.5 = 24214:2683966 % new> icasinh:max_er = 0x8000000000000000 4503599627370496.0000, avg_er = 0.113, #>=1:0.5 = 33886:100306 % new> icatan:max_er = 0x14b9 2.5903, avg_er = 0.044, #>=1:0.5 = 403028:652792 % new> icatanh:max_er = 0x1622 2.7666, avg_er = 0.015, #>=1:0.5 = 20580:101564 Bruce