From owner-freebsd-numerics@FreeBSD.ORG Sun Sep 9 23:42:09 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id CED5B1065672 for ; Sun, 9 Sep 2012 23:42:09 +0000 (UTC) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by mx1.freebsd.org (Postfix) with ESMTP id 4FA8A8FC15 for ; Sun, 9 Sep 2012 23:42:08 +0000 (UTC) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q89Ng6ZC068241; Sun, 9 Sep 2012 18:42:07 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <504D294F.1050901@missouri.edu> Date: Sun, 09 Sep 2012 18:42:07 -0500 From: Stephen Montgomery-Smith User-Agent: Mozilla/5.0 (X11; Linux i686; rv:15.0) Gecko/20120827 Thunderbird/15.0 MIME-Version: 1.0 To: Bruce Evans References: <5017111E.6060003@missouri.edu> <501C361D.4010807@missouri.edu> <20120804165555.X1231@besplex.bde.org> <501D51D7.1020101@missouri.edu> <20120805030609.R3101@besplex.bde.org> <501D9C36.2040207@missouri.edu> <20120805175106.X3574@besplex.bde.org> <501EC015.3000808@missouri.edu> <20120805191954.GA50379@troutmask.apl.washington.edu> <20120807205725.GA10572@server.rulingia.com> <20120809025220.N4114@besplex.bde.org> <5027F07E.9060409@missouri.edu> <20120814003614.H3692@besplex.bde.org> <50295F5C.6010800@missouri.edu> <20120814072946.S5260@besplex.bde.org> <50297CA5.5010900@missouri.edu> <50297E43.7090309@missouri.edu> <20120814201105.T934@besplex.bde.org> <502A780B.2010106@missouri.edu> <20120815223631.N1751@besplex.bde.org> <502C0CF8.8040003@missouri.edu> <503265E8.3060101@missouri.edu> <5036EFDB.3010304@missouri.edu> <503A9A1A.1050004@missouri.edu> <503ABAC4.3060503@missouri.edu> <20120906200420.H1056@besplex.bde.org> In-Reply-To: <20120906200420.H1056@besplex.bde.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-numerics@freebsd.org Subject: Re: Complex arg-trig functions X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 09 Sep 2012 23:42:09 -0000 On 09/06/2012 06:42 AM, Bruce Evans wrote: > I'm back from a holiday. > > On Sun, 26 Aug 2012, Stephen Montgomery-Smith wrote: > I tested your latest version and found regressions for NaNs. (Also, > the float version is less accurate now that it doesn't call double > functions, but that is a progression.) > 1. casin*() abuses x and y for |x| and |y| and thus loses the sign of > NaNs. The quick fix is to reload the original values. The correct > fix is to use separate variables for the absolute values, as is > done for cacosh*() and catanh*(). I now uniformly use ax and ay for |x| and |y| throughout. > 2. NaNs are now filtered out early It is not correct to filter out the NaNs early. This is because of the way expressions like cacosh(NaN + I*INF) are defined in C99. The INFs must be filtered out first. > % @@ -289,7 +284,16 @@ > % * the arguments is not NaN, so we opt not to raise it. > % */ > % - return (cpack(x+y, x+y)); > % + return (cpack(x+0.0L+(y+0), x+0.0L+(y+0))); > % } Why does the second way work, and the first way doesn't? From owner-freebsd-numerics@FreeBSD.ORG Mon Sep 10 01:05:20 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 6257C1065670 for ; Mon, 10 Sep 2012 01:05:20 +0000 (UTC) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by mx1.freebsd.org (Postfix) with ESMTP id 1FC878FC0C for ; Mon, 10 Sep 2012 01:05:19 +0000 (UTC) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q8A15HOd077004 for ; Sun, 9 Sep 2012 20:05:18 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <504D3CCD.2050006@missouri.edu> Date: Sun, 09 Sep 2012 20:05:17 -0500 From: Stephen Montgomery-Smith User-Agent: Mozilla/5.0 (X11; Linux i686; rv:15.0) Gecko/20120827 Thunderbird/15.0 MIME-Version: 1.0 To: freebsd-numerics@freebsd.org References: <5017111E.6060003@missouri.edu> <501C361D.4010807@missouri.edu> <20120804165555.X1231@besplex.bde.org> <501D51D7.1020101@missouri.edu> <20120805030609.R3101@besplex.bde.org> <501D9C36.2040207@missouri.edu> <20120805175106.X3574@besplex.bde.org> <501EC015.3000808@missouri.edu> <20120805191954.GA50379@troutmask.apl.washington.edu> <20120807205725.GA10572@server.rulingia.com> <20120809025220.N4114@besplex.bde.org> <5027F07E.9060409@missouri.edu> <20120814003614.H3692@besplex.bde.org> <50295F5C.6010800@missouri.edu> <20120814072946.S5260@besplex.bde.org> <50297CA5.5010900@missouri.edu> <50297E43.7090309@missouri.edu> <20120814201105.T934@besplex.bde.org> <502A780B.2010106@missouri.edu> <20120815223631.N1751@besplex.bde.org> <502C0CF8.8040003@missouri.edu> <20120906221028.O1542@besplex.bde.org> <5048D00B.8010401@missouri.edu> In-Reply-To: <5048D00B.8010401@missouri.edu> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: Complex arg-trig functions X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 10 Sep 2012 01:05:20 -0000 On 09/06/2012 11:32 AM, Stephen Montgomery-Smith wrote: > In the days or weeks to come, I might go over all these kinds of > conditions over again. The paper by Hull et al, and the code used by > boost, is written in such a way that the code for float is the same as > the code for double. My code was designed just enough to work, whereas > they put more thought into it. I am doing this right now. The code I currently have on the web page is non-functional. From owner-freebsd-numerics@FreeBSD.ORG Wed Sep 12 02:45:01 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 9588D106564A for ; Wed, 12 Sep 2012 02:45:01 +0000 (UTC) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by mx1.freebsd.org (Postfix) with ESMTP id 518C18FC0A for ; Wed, 12 Sep 2012 02:45:01 +0000 (UTC) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q8C2isv1080386 for ; Tue, 11 Sep 2012 21:44:55 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <504FF726.9060001@missouri.edu> Date: Tue, 11 Sep 2012 21:44:54 -0500 From: Stephen Montgomery-Smith User-Agent: Mozilla/5.0 (X11; Linux i686; rv:15.0) Gecko/20120827 Thunderbird/15.0 MIME-Version: 1.0 To: freebsd-numerics@freebsd.org References: <5017111E.6060003@missouri.edu> <501C361D.4010807@missouri.edu> <20120804165555.X1231@besplex.bde.org> <501D51D7.1020101@missouri.edu> <20120805030609.R3101@besplex.bde.org> <501D9C36.2040207@missouri.edu> <20120805175106.X3574@besplex.bde.org> <501EC015.3000808@missouri.edu> <20120805191954.GA50379@troutmask.apl.washington.edu> <20120807205725.GA10572@server.rulingia.com> <20120809025220.N4114@besplex.bde.org> <5027F07E.9060409@missouri.edu> <20120814003614.H3692@besplex.bde.org> <50295F5C.6010800@missouri.edu> <20120814072946.S5260@besplex.bde.org> <50297CA5.5010900@missouri.edu> <50297E43.7090309@missouri.edu> <20120814201105.T934@besplex.bde.org> <502A780B.2010106@missouri.edu> <20120815223631.N1751@besplex.bde.org> <502C0CF8.8040003@missouri.edu> <20120906221028.O1542@besplex.bde.org> <5048D00B.8010401@missouri.edu> <504D3CCD.2050006@missouri.edu> In-Reply-To: <504D3CCD.2050006@missouri.edu> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: Complex arg-trig functions X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 12 Sep 2012 02:45:01 -0000 On 09/09/2012 08:05 PM, Stephen Montgomery-Smith wrote: > On 09/06/2012 11:32 AM, Stephen Montgomery-Smith wrote: > >> In the days or weeks to come, I might go over all these kinds of >> conditions over again. The paper by Hull et al, and the code used by >> boost, is written in such a way that the code for float is the same as >> the code for double. My code was designed just enough to work, whereas >> they put more thought into it. > > I am doing this right now. The code I currently have on the web page is > non-functional. 1. I think I have it fixed now. My code is much closer to the original intent of the algorithm used by Hull et al. In particular, the float version is just as easy as the double version. 2. The long and float version now are created from the double version by simple perl scripts (also put on my web page). 3. I used the "return (cpack(x+0.0L+(y+0), x+0.0L+(y+0)))" construction, but I still don't know why. I added a comment basically saying I did this because Bruce told me to. Stephen From owner-freebsd-numerics@FreeBSD.ORG Wed Sep 12 09:57:01 2012 Return-Path: Delivered-To: freebsd-numerics@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id AF6B0106564A for ; Wed, 12 Sep 2012 09:57:01 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail06.syd.optusnet.com.au (mail06.syd.optusnet.com.au [211.29.132.187]) by mx1.freebsd.org (Postfix) with ESMTP id 21AB78FC18 for ; Wed, 12 Sep 2012 09:57:00 +0000 (UTC) Received: from c122-106-171-246.carlnfd1.nsw.optusnet.com.au (c122-106-171-246.carlnfd1.nsw.optusnet.com.au [122.106.171.246]) by mail06.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id q8C9upbh024316 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Wed, 12 Sep 2012 19:56:53 +1000 Date: Wed, 12 Sep 2012 19:56:51 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Stephen Montgomery-Smith In-Reply-To: <504FF726.9060001@missouri.edu> Message-ID: <20120912191556.F1078@besplex.bde.org> References: <5017111E.6060003@missouri.edu> <501C361D.4010807@missouri.edu> <20120804165555.X1231@besplex.bde.org> <501D51D7.1020101@missouri.edu> <20120805030609.R3101@besplex.bde.org> <501D9C36.2040207@missouri.edu> <20120805175106.X3574@besplex.bde.org> <501EC015.3000808@missouri.edu> <20120805191954.GA50379@troutmask.apl.washington.edu> <20120807205725.GA10572@server.rulingia.com> <20120809025220.N4114@besplex.bde.org> <5027F07E.9060409@missouri.edu> <20120814003614.H3692@besplex.bde.org> <50295F5C.6010800@missouri.edu> <20120814072946.S5260@besplex.bde.org> <50297CA5.5010900@missouri.edu> <50297E43.7090309@missouri.edu> <20120814201105.T934@besplex.bde.org> <502A780B.2010106@missouri.edu> <20120815223631.N1751@besplex.bde.org> <502C0CF8.8040003@missouri.edu> <20120906221028.O1542@besplex.bde.org> <5048D00B.8010401@missouri.edu> <504D3CCD.2050006@missouri.edu> <504FF726.9060001@missouri.edu> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-numerics@FreeBSD.org Subject: Re: Complex arg-trig functions X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 12 Sep 2012 09:57:01 -0000 On Tue, 11 Sep 2012, Stephen Montgomery-Smith wrote: > On 09/09/2012 08:05 PM, Stephen Montgomery-Smith wrote: >> On 09/06/2012 11:32 AM, Stephen Montgomery-Smith wrote: >> >>> In the days or weeks to come, I might go over all these kinds of >>> conditions over again. The paper by Hull et al, and the code used by >>> boost, is written in such a way that the code for float is the same as >>> the code for double. My code was designed just enough to work, whereas >>> they put more thought into it. >> >> I am doing this right now. The code I currently have on the web page is >> non-functional. > > 1. I think I have it fixed now. My code is much closer to the original > intent of the algorithm used by Hull et al. In particular, the float version > is just as easy as the double version. This is better in theory. It changes the number of incorrectly rounded cases a little, probably insignificantly. > 2. The long and float version now are created from the double version by > simple perl scripts (also put on my web page). I have similar obfuscations using #define for clog, but don't like them. Apart from obfuscation, the perl scripts are too simple and give the following bugs: - they don't add F suffixes to float constants, so the float version does lots of double arithmetic again - they don't clean up the formatting after removing comments. > 3. I used the "return (cpack(x+0.0L+(y+0), x+0.0L+(y+0)))" construction, but > I still don't know why. I added a comment basically saying I did this > because Bruce told me to. Try testing for equality of NaNs in bits and you will soon see why you don't want the bits to vary randomly. Of course, the variation isn't random, but if it is non-null then you need to understand the details of it to ignore only unimportant differences. Preliminary testing shows the following regressions relative to me previous fixed version. (I got things like cacosh(NaN + I Inf) consistently wrong by classifying NaNs first, while not classifying NaNs first gets things like cacosh(NaN + I large) inconsistently wrong.) float precision: % 1,6c1,6 % old> rcacos:max_er = 0x4517ee94 2.1592, avg_er = 0.314, #>=1:0.5 = 3786:214820 % old> rcacosh:max_er = 0x3170e232 1.5450, avg_er = 0.253, #>=1:0.5 = 1196:3225128 % old> rcasin:max_er = 0x55adc0df 2.6775, avg_er = 0.208, #>=1:0.5 = 34228:360900 % old> rcasinh:max_er = 0x3170e232 1.5450, avg_er = 0.253, #>=1:0.5 = 1196:3225128 % old> rcatan:max_er = 0x3c4078ec 1.8829, avg_er = 0.284, #>=1:0.5 = 13260:190836 % old> rcatanh:max_er = 0x2cef3171 1.4042, avg_er = 0.167, #>=1:0.5 = 4096:420916 % --- % new> rcacos:max_er = 0x8000000000000000 17179869184.0000, avg_er = 0.314, #>=1:0.5 = 12032:219512 % new> rcacosh:max_er = 0x3170e232 1.5450, avg_er = 0.250, #>=1:0.5 = 1392:3151932 % new> rcasin:max_er = 0x8000000000000000 17179869184.0000, avg_er = 0.208, #>=1:0.5 = 38784:366480 % new> rcasinh:max_er = 0x8000000000000000 17179869184.0000, avg_er = 0.250, #>=1:0.5 = 5104:3155644 % new> rcatan:max_er = 0x3c4078ec 1.8829, avg_er = 0.287, #>=1:0.5 = 37132:302960 % new> rcatanh:max_er = 0x2cef3171 1.4042, avg_er = 0.166, #>=1:0.5 = 4096:357284 The huge errors for rcacos = realf(cacosf) etc. are sign errors for NaNs like the following: % x = 0x4b000001 8388609 % y = 0xff800001 nan % rcacos(x, y) = 0xfff8000020000000 0xffc00001 nan % rcacosf(x, y) = 0x7ff8000020000000 0x7fc00001 nan % err = -0x8000000000000000 17179869184.00000 rcacos() correctly preserves the sign of the NaN, but rcacosf() loses it iff |x| > 1 / FLT_EPSILON. All of realf(clog_with_large_values()), realf(clogf()) and hypotf() seem never return a NaN with its sign bit set. This makes some sense for an real absolute value function but not for a complex function. rcacos() loses the sign of the NaN similarly starting at 1 / DBL_EPSILON. The huge errors hide the peak error for 3 of the above 6 functions. For the other 3, the peak error is unchanged and the number of inncorrectly rounded cases is changed a little. % 16,21c16,21 % old> icacos:max_er = 0x3170e232 1.5450, avg_er = 0.253, #>=1:0.5 = 1196:3225128 % old> icacosh:max_er = 0x4517ee94 2.1592, avg_er = 0.314, #>=1:0.5 = 3786:214820 % old> icasin:max_er = 0x3170e232 1.5450, avg_er = 0.253, #>=1:0.5 = 1196:3225128 % old> icasinh:max_er = 0x55adc0df 2.6775, avg_er = 0.208, #>=1:0.5 = 34228:360900 % old> icatan:max_er = 0x2cef3171 1.4042, avg_er = 0.167, #>=1:0.5 = 4096:420916 % old> icatanh:max_er = 0x3c4078ec 1.8829, avg_er = 0.284, #>=1:0.5 = 13260:190836 % --- % new> icacos:max_er = 0x8000000000000000 17179869184.0000, avg_er = 0.250, #>=1:0.5 = 12528:3163068 % new> icacosh:max_er = 0x4517ee94 2.1592, avg_er = 0.314, #>=1:0.5 = 4608:212088 % new> icasin:max_er = 0x8000000000000000 17179869184.0000, avg_er = 0.250, #>=1:0.5 = 5104:3155644 % new> icasinh:max_er = 0x8000000000000000 17179869184.0000, avg_er = 0.208, #>=1:0.5 = 38784:366480 % new> icatan:max_er = 0x2cef3171 1.4042, avg_er = 0.166, #>=1:0.5 = 4096:357284 % new> icatanh:max_er = 0x3c4078ec 1.8829, avg_er = 0.287, #>=1:0.5 = 37132:302960 Now double precision: % 31,36c31,36 % old> rcacos:max_er = 0x1b5a 3.4189, avg_er = 0.166, #>=1:0.5 = 2394:118150 % old> rcacosh:max_er = 0xf8a 1.9424, avg_er = 0.258, #>=1:0.5 = 8444:2737804 % old> rcasin:max_er = 0x15c5 2.7212, avg_er = 0.113, #>=1:0.5 = 33308:99188 % old> rcasinh:max_er = 0xf8a 1.9424, avg_er = 0.258, #>=1:0.5 = 8444:2737804 % old> rcatan:max_er = 0x1622 2.7666, avg_er = 0.016, #>=1:0.5 = 4680:81364 % old> rcatanh:max_er = 0x14b9 2.5903, avg_er = 0.047, #>=1:0.5 = 428996:691360 % --- % new> rcacos:max_er = 0x8000000000000000 4503599627370496.0000, avg_er = 0.168, #>=1:0.5 = 2458:110404 % new> rcacosh:max_er = 0xf8a 1.9424, avg_er = 0.257, #>=1:0.5 = 24192:2683944 % new> rcasin:max_er = 0x8000000000000000 4503599627370496.0000, avg_er = 0.113, #>=1:0.5 = 33886:100306 % new> rcasinh:max_er = 0x8000000000000000 4503599627370496.0000, avg_er = 0.257, #>=1:0.5 = 24214:2683966 % new> rcatan:max_er = 0x1622 2.7666, avg_er = 0.015, #>=1:0.5 = 20580:101564 % new> rcatanh:max_er = 0x14b9 2.5903, avg_er = 0.044, #>=1:0.5 = 403028:652792 % 39,44c39,44 % old> icacos:max_er = 0xf8a 1.9424, avg_er = 0.258, #>=1:0.5 = 8444:2737804 % old> icacosh:max_er = 0x1b5a 3.4189, avg_er = 0.166, #>=1:0.5 = 2394:118150 % old> icasin:max_er = 0xf8a 1.9424, avg_er = 0.258, #>=1:0.5 = 8444:2737804 % old> icasinh:max_er = 0x15c5 2.7212, avg_er = 0.113, #>=1:0.5 = 33308:99188 % old> icatan:max_er = 0x14b9 2.5903, avg_er = 0.047, #>=1:0.5 = 428996:691360 % old> icatanh:max_er = 0x1622 2.7666, avg_er = 0.016, #>=1:0.5 = 4680:81364 % --- % new> icacos:max_er = 0x8000000000000000 4503599627370496.0000, avg_er = 0.257, #>=1:0.5 = 24258:2684010 % new> icacosh:max_er = 0x1b5a 3.4189, avg_er = 0.168, #>=1:0.5 = 2414:110360 % new> icasin:max_er = 0x8000000000000000 4503599627370496.0000, avg_er = 0.257, #>=1:0.5 = 24214:2683966 % new> icasinh:max_er = 0x8000000000000000 4503599627370496.0000, avg_er = 0.113, #>=1:0.5 = 33886:100306 % new> icatan:max_er = 0x14b9 2.5903, avg_er = 0.044, #>=1:0.5 = 403028:652792 % new> icatanh:max_er = 0x1622 2.7666, avg_er = 0.015, #>=1:0.5 = 20580:101564 Bruce From owner-freebsd-numerics@FreeBSD.ORG Wed Sep 12 10:27:05 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 4674E106566B for ; Wed, 12 Sep 2012 10:27:05 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail35.syd.optusnet.com.au (mail35.syd.optusnet.com.au [211.29.133.51]) by mx1.freebsd.org (Postfix) with ESMTP id C3C9E8FC08 for ; Wed, 12 Sep 2012 10:27:04 +0000 (UTC) Received: from c122-106-171-246.carlnfd1.nsw.optusnet.com.au (c122-106-171-246.carlnfd1.nsw.optusnet.com.au [122.106.171.246]) by mail35.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id q8CAQtJ1019686 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Wed, 12 Sep 2012 20:26:56 +1000 Date: Wed, 12 Sep 2012 20:26:54 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Stephen Montgomery-Smith In-Reply-To: <504D294F.1050901@missouri.edu> Message-ID: <20120912195727.H1078@besplex.bde.org> References: <5017111E.6060003@missouri.edu> <20120804165555.X1231@besplex.bde.org> <501D51D7.1020101@missouri.edu> <20120805030609.R3101@besplex.bde.org> <501D9C36.2040207@missouri.edu> <20120805175106.X3574@besplex.bde.org> <501EC015.3000808@missouri.edu> <20120805191954.GA50379@troutmask.apl.washington.edu> <20120807205725.GA10572@server.rulingia.com> <20120809025220.N4114@besplex.bde.org> <5027F07E.9060409@missouri.edu> <20120814003614.H3692@besplex.bde.org> <50295F5C.6010800@missouri.edu> <20120814072946.S5260@besplex.bde.org> <50297CA5.5010900@missouri.edu> <50297E43.7090309@missouri.edu> <20120814201105.T934@besplex.bde.org> <502A780B.2010106@missouri.edu> <20120815223631.N1751@besplex.bde.org> <502C0CF8.8040003@missouri.edu> <503265E8.3060101@missouri.edu> <5036EFDB.3010304@missouri.edu> <503A9A1A.1050004@missouri.edu> <503ABAC4.3060503@missouri.edu> <20120906200420.H1056@besplex.bde.org> <504D294F.1050901@missouri.edu> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-numerics@freebsd.org Subject: Re: Complex arg-trig functions X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 12 Sep 2012 10:27:05 -0000 On Sun, 9 Sep 2012, Stephen Montgomery-Smith wrote: > On 09/06/2012 06:42 AM, Bruce Evans wrote: >> I'm back from a holiday. >> >> On Sun, 26 Aug 2012, Stephen Montgomery-Smith wrote: > >> I tested your latest version and found regressions for NaNs. (Also, >> the float version is less accurate now that it doesn't call double >> functions, but that is a progression.) >> 1. casin*() abuses x and y for |x| and |y| and thus loses the sign of >> NaNs. The quick fix is to reload the original values. The correct >> fix is to use separate variables for the absolute values, as is >> done for cacosh*() and catanh*(). > > I now uniformly use ax and ay for |x| and |y| throughout. > >> 2. NaNs are now filtered out early > > It is not correct to filter out the NaNs early. This is because of the way > expressions like cacosh(NaN + I*INF) are defined in C99. The INFs must be > filtered out first. Oops. I still prefer filtering out NaNs first. They already have a special case for one NaN and one +-0 (with +-0 only special for y in cacos()), and can 4 more for one Nan and one Inf (2 combinations in each of 2 functions) just as easily as the Inf cases can have special subcases for NaNs. Here is the Inf case for casinhf(): % if (ax > 1/FLT_EPSILON || ay > 1/FLT_EPSILON) % if (isinf(x) || isinf(y) || (int)(1+tiny)==1) { % if (signbit(x) == 0) % w = clog_for_large_values(z) + M_LN2; Adding M_LN2_ risks problems for NaNs. These were fatal for cacos*() so we avoid them there. By filtering out NaNs first, we avoid them better. M_LN2 is a double constant, so using it pessimizes the float case (it happens not to break it). % else % w = clog_for_large_values(-z) + M_LN2; -z does arithmetic on a complex value so it risks doing bad things for NaNs. At best, it toggles the sign of a NaN. clog_for_large_values() always clears the sign bit of NaNs, and thus gives inconsistent sign handling to cases which don't go through here. Filtering out NaNs first would avoid these cases. clog_for_large_values() is pessimized a bit for the usual case by having to handle NaNs: @ static float complex @ clog_for_large_values(float complex z) @ { @ float x, y; @ float ax, ay, t; @ @ x = crealf(z); @ y = cimagf(z); @ @ @ if (x != x || y != y) @ return (cpackf(logf(hypotf(x, y)), atan2f(y, x))); This is `if (isnan(x) || isnan(y))' spelled fairly optimally. In the usual case, this will be repeated in the caller, with pessimal spelling for the float and long double cases and different fairly optimal spelling for the double case. By filtering out the NaN cases early, we can avoid repeating this. >> % @@ -289,7 +284,16 @@ >> % * the arguments is not NaN, so we opt not to raise it. >> % */ >> % - return (cpack(x+y, x+y)); >> % + return (cpack(x+0.0L+(y+0), x+0.0L+(y+0))); >> % } > > Why does the second way work, and the first way doesn't? Because the result may depend on the ALU, and only the second way is evaluated in the same ALU in all precisions. Bruce From owner-freebsd-numerics@FreeBSD.ORG Wed Sep 12 13:32:51 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 34F74106564A for ; Wed, 12 Sep 2012 13:32:51 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail34.syd.optusnet.com.au (mail34.syd.optusnet.com.au [211.29.133.218]) by mx1.freebsd.org (Postfix) with ESMTP id 5DA138FC18 for ; Wed, 12 Sep 2012 13:32:50 +0000 (UTC) Received: from c122-106-171-246.carlnfd1.nsw.optusnet.com.au (c122-106-171-246.carlnfd1.nsw.optusnet.com.au [122.106.171.246]) by mail34.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id q8CDWbiP014478 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Wed, 12 Sep 2012 23:32:41 +1000 Date: Wed, 12 Sep 2012 23:32:34 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Bruce Evans In-Reply-To: <20120912191556.F1078@besplex.bde.org> Message-ID: <20120912225847.J1771@besplex.bde.org> References: <5017111E.6060003@missouri.edu> <501C361D.4010807@missouri.edu> <20120804165555.X1231@besplex.bde.org> <501D51D7.1020101@missouri.edu> <20120805030609.R3101@besplex.bde.org> <501D9C36.2040207@missouri.edu> <20120805175106.X3574@besplex.bde.org> <501EC015.3000808@missouri.edu> <20120805191954.GA50379@troutmask.apl.washington.edu> <20120807205725.GA10572@server.rulingia.com> <20120809025220.N4114@besplex.bde.org> <5027F07E.9060409@missouri.edu> <20120814003614.H3692@besplex.bde.org> <50295F5C.6010800@missouri.edu> <20120814072946.S5260@besplex.bde.org> <50297CA5.5010900@missouri.edu> <50297E43.7090309@missouri.edu> <20120814201105.T934@besplex.bde.org> <502A780B.2010106@missouri.edu> <20120815223631.N1751@besplex.bde.org> <502C0CF8.8040003@missouri.edu> <20120906221028.O1542@besplex.bde.org> <5048D00B.8010401@missouri.edu> <504D3CCD.2050006@missouri.edu> <504FF726.9060001@missouri.edu> <20120912191556.F1078@besplex.bde.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: Stephen Montgomery-Smith , freebsd-numerics@freebsd.org Subject: Re: Complex arg-trig functions X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 12 Sep 2012 13:32:51 -0000 On Wed, 12 Sep 2012, Bruce Evans wrote: > Preliminary testing shows the following regressions relative to me previous > fixed version. (I got things like cacosh(NaN + I Inf) consistently wrong > by classifying NaNs first, while not classifying NaNs first gets things > like cacosh(NaN + I large) inconsistently wrong.) > ... > The huge errors for rcacos = realf(cacosf) etc. are sign errors for NaNs > like the following: > > % x = 0x4b000001 8388609 > % y = 0xff800001 nan > % rcacos(x, y) = 0xfff8000020000000 0xffc00001 nan > % rcacosf(x, y) = 0x7ff8000020000000 0x7fc00001 nan > % err = -0x8000000000000000 17179869184.00000 > > rcacos() correctly preserves the sign of the NaN, but rcacosf() loses it > iff |x| > 1 / FLT_EPSILON. All of realf(clog_with_large_values()), > realf(clogf()) and hypotf() seem never return a NaN with its sign bit set. > This makes some sense for an real absolute value function but not for a > complex function. rcacos() loses the sign of the NaN similarly starting > at 1 / DBL_EPSILON. Fixes for this: % diff -u2 catrig.c~ catrig.c % --- catrig.c~ 2012-09-12 02:43:28.000000000 +0000 % +++ catrig.c 2012-09-12 11:24:55.085541000 +0000 % @@ -31,7 +31,6 @@ % #define ISINF(bx) (((((bx).parts.msw & 0x7fffffff) - 0x7ff00000) | \ % (bx).parts.lsw) == 0) % -#define ISFINITE(bx) (((bx).parts.msw & 0x7ff00000) != 0x7ff00000) Also, remove this again. % % -#define FOUR_SQRT_MIN 1e-150 /* less than 4 / sqrt(DBL_MIN) */ % +#define FOUR_SQRT_MIN 1e-150 /* less than 4 * sqrt(DBL_MIN) */ Also, fix this comment. % #define FOURTH_SQRT_MAX 1e150 /* 1 / FOUR_SQRT_MIN */ % % @@ -281,16 +280,13 @@ % ay = fabs(y); % % - if (ax > 1/DBL_EPSILON || ay > 1/DBL_EPSILON) % - if (ISINF(bx) || ISINF(by) || (int)(1+tiny)==1) { /* raise inexact */ % - if (SIGNBIT(bx) == 0) % - w = clog_for_large_values(z) + M_LN2; % - else % - w = clog_for_large_values(-z) + M_LN2; % - return (cpack(copysign(creal(w), x), copysign(cimag(w), y))); % - } % - Not actually changed -- obfuscated by diff. % if (ISNAN(x) || ISNAN(y)) { % /* casinh(NaN + I*0) = NaN + I*0 */ % if (y == 0) return (cpack(x+x, y)); % + /* casinh(+-Inf + I*NaN) = +-Inf + I*NaN */ % + if (ISINF(bx)) % + return (cpack(x, y+y)); % + /* casinh(NaN + I*+-Inf) = NaN +- I*Inf (sign now optional) */ % + if (ISINF(by)) % + return (cpack(y+y, x)); % /* % * All other cases involving NaN return NaN + I*NaN. Just filter out NaNs (almost) first, and add special cases for Infs combined with NaNs. % @@ -299,7 +295,16 @@ % */ % /* Bruce Evans tells me this is the way to do this: */ % - return (cpack((x+0.0L)+(y+0), (x+0.0L)+(y+0))); % + return (cpack(x+0.0L+(y+0), x+0.0L+(y+0))); I omitted unnecessary parentheses in the last few versions of it. % } % % + if (ax > 1/DBL_EPSILON || ay > 1/DBL_EPSILON) % + if (ISINF(bx) || ISINF(by) || (int)(1+tiny)==1) { /* raise inexact */ % + if (SIGNBIT(bx) == 0) % + w = clog_for_large_values(z) + M_LN2; % + else % + w = clog_for_large_values(-z) + M_LN2; % + return (cpack(copysign(creal(w), x), copysign(cimag(w), y))); % + } % + % if (ax < DBL_EPSILON && ay < DBL_EPSILON) % if ((int)ax==0 && (int)ay==0) /* raise inexact */ Not actually changed. % @@ -355,17 +360,11 @@ % ay = fabs(y); % % - if (ax > 1/DBL_EPSILON || ay > 1/DBL_EPSILON) % - if (ISINF(bx) || ISINF(by) || (int)(1+tiny)==1) { /* raise inexact */ % - w = clog_for_large_values(z); % - rx = fabs(cimag(w)); % - ry = creal(w) + M_LN2; % - if (sy == 0) % - ry = -ry; % - return (cpack(rx, ry)); % - } % - Not actually changed. % if (ISNAN(x) || ISNAN(y)) { % /* cacos(0 + I*NaN) = PI/2 + I*NaN */ % if (x == 0) return (cpack(M_PI_2, y+y)); % + /* cacosh(+-Inf + I*NaN) = +Inf + I*NaN */ % + /* cacosh(NaN + I*+-Inf) = +Inf + I*NaN */ % + if (ISINF(bx) || ISINF(by)) % + return (cpack(INFINITY, x+y)); % /* % * All other cases involving NaN return NaN + I*NaN. cacosh() is simpler because the sign of the infinity is always plus. The code is suboptimal for both -- once x is nan, it can't be inf, so the code should probably be written as: if (ISNAN(x)) { // special case for y being inf ... } if (ISNAN(y)) { // special case for x being inf ... } % @@ -373,8 +372,17 @@ % * the arguments is not NaN, so we opt not to raise it. % */ % - /* Bruce Evans tells me this is the way to do this: */ % - return (cpack((x+0.0L)+(y+0), (x+0.0L)+(y+0))); % + return (cpack(x+0.0L+(y+0), x+0.0L+(y+0))); Don't tell everyone the same thing 3 times. % } % % + if (ax > 1/DBL_EPSILON || ay > 1/DBL_EPSILON) % + if (ISINF(bx) || ISINF(by) || (int)(1+tiny)==1) { /* raise inexact */ % + w = clog_for_large_values(z); % + rx = fabs(cimag(w)); % + ry = creal(w) + M_LN2; % + if (sy == 0) % + ry = -ry; % + return (cpack(rx, ry)); % + } % + % if (ax < DBL_EPSILON && ay < DBL_EPSILON) % if ((int)ax==0 && (int)ay==0) { /* raise inexact */ Not really changed. % @@ -423,9 +431,4 @@ % x = creal(z); % y = cimag(z); % - % - /* Handle NaNs using the general formula to mix them right. */ % - if (x != x || y != y) % - return (cpack(log(hypot(x, y)), atan2(y, x))); % - % ax = fabs(x); % ay = fabs(y); Not needed now. % @@ -558,6 +561,5 @@ % * the arguments is not NaN, so we opt not to raise it. % */ % - /* Bruce Evans tells me this is the way to do this: */ % - return (cpack((x+0.0L)+(y+0), (x+0.0L)+(y+0))); % + return (cpack(x+0.0L+(y+0), x+0.0L+(y+0))); % } % % diff -u2 catrigf.c~ catrigf.c % --- catrigf.c~ 2012-09-12 02:43:45.000000000 +0000 % +++ catrigf.c 2012-09-12 12:32:19.157537000 +0000 % @@ -25,9 +25,14 @@ % */ % % -#define FOUR_SQRT_MIN 1e-17F % -#define FOURTH_SQRT_MAX 1e17F % - % static const float % -tiny = 1e-15; % +FOUR_SQRT_MIN = 0x1p-61, % +FOURTH_SQRT_MAX = 0x1p61, % +M_E = 2.7182818285e0, /* 0xadf854.0p-22 */ % +M_LN2 = 6.9314718056e-1, /* 0xb17218.0p-24 */ % +M_PI_2 = 1.5707963268e0, /* 0xc90fdb.0p-23 */ % +half = 0.5, % +quarter = 0.25; % +static const volatile % +tiny = 0x1p-100; % % /* Use normal fdlibm style of consts instead of macros, and fix the type of most of the constants. Use hex constants for things related to powers of 2. Use a more normal value for `tiny'. fdlibm normally uses (float)1e-30 for the float tiny, so that squaring it underflows to 0, and then sometimes uses the same variable for setting inexact. Here we only use it for setting inexact, and the value of 1e-15 makes no sense for just that. FLT_EPSILON/2 would make sense for just that (adding it to 1). But use the tinier `tiny', in a hex form (the magic -100 is somehwhat less than FLT_MIN_EXP). In existing code `tiny' in hex is used mainly in long double code. 2**-100 is used in e_expf.c but is not spelled `tiny' there. Make `tiny' volatile as usual, to avoid bugs in clang. % @@ -46,4 +51,7 @@ % #include % #include % +#undef M_E % +#undef M_LN2 % +#undef M_PI_2 % #include "math_private.h" % The constants should all be in lower case, and nothing in should be used in any precision, so that namespace pollution in doesn't matter. % @@ -56,7 +64,7 @@ % f(float a, float b, float hypot_a_b) % { % - if (b < 0) return (0.5 * (hypot_a_b - b)); % - if (b == 0) return (0.5*a); % - return (0.5 * a*a / (hypot_a_b + b)); % + if (b < 0) return (half * (hypot_a_b - b)); % + if (b == 0) return (half*a); % + return (half * a*a / (hypot_a_b + b)); % } % Global substitution of 0.5 by `half' gives large diffs. % @@ -71,5 +79,5 @@ % % % - A = 0.5*(R + S); % + A = half*(R + S); % if (A < 1) A = 1; % % @@ -106,5 +114,5 @@ % % % - if (*B > 0.6417) { % + if (*B > 0.6417F) { % *B_is_usable = 0; % if (x >= FLT_EPSILON * fabsf(y-1) && x >= FOUR_SQRT_MIN) { Another double constant. Putting `half' in the constant table gives the same spelling for it in all precisions. This magic number could go there too, but I don't know a good name for it. % @@ -113,5 +121,5 @@ % } else if (y == 1) { % if ((int)x==0) % - *sqrt_A2my2 = sqrtf(x)*sqrtf(0.5*(A+y)); % + *sqrt_A2my2 = sqrtf(x)*sqrtf(half*(A+y)); % } else if (y > 1) { % if ((int)x==0) % @@ -137,4 +145,14 @@ % ay = fabsf(y); % % + if (isnan(x) || isnan(y)) { % + % + if (y == 0) return (cpackf(x+x, y)); % + if (isinf(x)) return (cpackf(x, y+y)); % + if (isinf(y)) % + return (cpackf(y+y, x)); % + % + return (cpackf(x+0.0L+(y+0), x+0.0L+(y+0))); % + } % + % if (ax > 1/FLT_EPSILON || ay > 1/FLT_EPSILON) % if (isinf(x) || isinf(y) || (int)(1+tiny)==1) { % @@ -146,11 +164,4 @@ % } % % - if (isnan(x) || isnan(y)) { % - % - if (y == 0) return (cpackf(x+x, y)); % - % - return (cpackf((x+0.0L)+(y+0), (x+0.0L)+(y+0))); % - } % - % if (ax < FLT_EPSILON && ay < FLT_EPSILON) % if ((int)ax==0 && (int)ay==0) % @@ -187,4 +198,13 @@ % ay = fabsf(y); % % + if (isnan(x) || isnan(y)) { % + % + if (x == 0) return (cpackf(M_PI_2, y+y)); % + if (isinf(x) || isinf(y)) % + return (cpackf(INFINITY, x+y)); % + % + return (cpackf(x+0.0L+(y+0), x+0.0L+(y+0))); % + } % + % if (ax > 1/FLT_EPSILON || ay > 1/FLT_EPSILON) % if (isinf(x) || isinf(y) || (int)(1+tiny)==1) { % @@ -197,11 +217,4 @@ % } % % - if (isnan(x) || isnan(y)) { % - % - if (x == 0) return (cpackf(M_PI_2, y+y)); % - % - return (cpackf((x+0.0L)+(y+0), (x+0.0L)+(y+0))); % - } % - % if (ax < FLT_EPSILON && ay < FLT_EPSILON) % if ((int)ax==0 && (int)ay==0) { % @@ -242,9 +255,4 @@ % x = crealf(z); % y = cimagf(z); % - % - % - if (x != x || y != y) % - return (cpackf(logf(hypotf(x, y)), atan2f(y, x))); % - % ax = fabsf(x); % ay = fabsf(y); % @@ -255,5 +263,5 @@ % } % % - if (ax > 0.5*FLT_MAX) % + if (ax > half*FLT_MAX) % return (cpackf(logf(hypotf(x / M_E, y / M_E)) + 1, atan2f(y, x))); % % @@ -261,5 +269,5 @@ % return (cpackf(logf(hypotf(x, y)), atan2f(y, x))); % % - return (cpackf(logf(ax*ax + ay*ay) * 0.5, atan2f(y, x))); % + return (cpackf(logf(ax*ax + ay*ay) * half, atan2f(y, x))); % } % % @@ -317,5 +325,5 @@ % return (cpackf(x, y+y)); % % - return (cpackf((x+0.0L)+(y+0), (x+0.0L)+(y+0))); % + return (cpackf(x+0.0L+(y+0), x+0.0L+(y+0))); % } % % @@ -331,10 +339,10 @@ % if ((int)ay==0) { % if ( ilogbf(ay) > FLT_MIN_EXP) % - rx = - 0.5 * logf(0.5*ay); % + rx = - half * logf(half*ay); % else % - rx = - 0.5 * (logf(ay) - M_LN2); % + rx = - half * (logf(ay) - M_LN2); % } % } else % - rx = 0.25 * log1pf(4*ax / sum_squares(ax-1, ay)); % + rx = quarter * log1pf(4*ax / sum_squares(ax-1, ay)); % % if (ax == 1) { % @@ -342,10 +350,10 @@ % ry = 0; % else % - ry = 0.5 * atan2f(2, -ay); % + ry = half * atan2f(2, -ay); % } else if (ay < FOUR_SQRT_MIN) { % if ((int)ay==0) % - ry = 0.5 * atan2f(2*ay, (1-ax)*(1+ax)); % + ry = half * atan2f(2*ay, (1-ax)*(1+ax)); % } else % - ry = 0.5 * atan2f(2*ay, (1-ax)*(1+ax) - ay*ay); % + ry = half * atan2f(2*ay, (1-ax)*(1+ax) - ay*ay); % % return (cpackf(copysignf(rx, x), copysignf(ry, y))); % diff -u2 catrigl.c~ catrigl.c % --- catrigl.c~ 2012-09-12 02:43:43.000000000 +0000 % +++ catrigl.c 2012-09-12 11:08:34.955652000 +0000 % @@ -141,4 +141,15 @@ % ay = fabsl(y); % % + if (isnan(x) || isnan(y)) { % + % + if (y == 0) return (cpackl(x+x, y)); % + if (isinf(x)) % + return (cpackl(x, y+y)); % + if (isinf(y)) % + return (cpackl(y+y, x)); % + % + return (cpackl(x+0.0L+(y+0), x+0.0L+(y+0))); % + } % + % if (ax > 1/LDBL_EPSILON || ay > 1/LDBL_EPSILON) % if (isinf(x) || isinf(y) || (int)(1+tiny)==1) { % @@ -150,11 +161,4 @@ % } % % - if (isnan(x) || isnan(y)) { % - % - if (y == 0) return (cpackl(x+x, y)); % - % - return (cpackl((x+0.0L)+(y+0), (x+0.0L)+(y+0))); % - } % - % if (ax < LDBL_EPSILON && ay < LDBL_EPSILON) % if ((int)ax==0 && (int)ay==0) % @@ -191,4 +195,13 @@ % ay = fabsl(y); % % + if (isnan(x) || isnan(y)) { % + % + if (x == 0) return (cpackl(L_PI_2, y+y)); % + if (isinf(x) || isinf(y)) % + return (cpackl(INFINITY, x+y)); % + % + return (cpackl(x+0.0L+(y+0), x+0.0L+(y+0))); % + } % + % if (ax > 1/LDBL_EPSILON || ay > 1/LDBL_EPSILON) % if (isinf(x) || isinf(y) || (int)(1+tiny)==1) { % @@ -201,11 +214,4 @@ % } % % - if (isnan(x) || isnan(y)) { % - % - if (x == 0) return (cpackl(L_PI_2, y+y)); % - % - return (cpackl((x+0.0L)+(y+0), (x+0.0L)+(y+0))); % - } % - % if (ax < LDBL_EPSILON && ay < LDBL_EPSILON) % if ((int)ax==0 && (int)ay==0) { % @@ -246,9 +252,4 @@ % x = creall(z); % y = cimagl(z); % - % - % - if (x != x || y != y) % - return (cpackl(logl(hypotl(x, y)), atan2l(y, x))); % - % ax = fabsl(x); % ay = fabsl(y); % @@ -321,5 +322,5 @@ % return (cpackl(x, y+y)); % % - return (cpackl((x+0.0L)+(y+0), (x+0.0L)+(y+0))); % + return (cpackl(x+0.0L+(y+0), x+0.0L+(y+0))); % } % The errors have now changed only slightly. This is only for i386: Float precision: % 1,6c1,6 % old< rcacos:max_er = 0x4517ee94 2.1592, avg_er = 0.314, #>=1:0.5 = 3786:214820 % old< rcacosh:max_er = 0x3170e232 1.5450, avg_er = 0.253, #>=1:0.5 = 1196:3225128 % old< rcasin:max_er = 0x55adc0df 2.6775, avg_er = 0.208, #>=1:0.5 = 34228:360900 % old< rcasinh:max_er = 0x3170e232 1.5450, avg_er = 0.253, #>=1:0.5 = 1196:3225128 % old< rcatan:max_er = 0x3c4078ec 1.8829, avg_er = 0.284, #>=1:0.5 = 13260:190836 % old< rcatanh:max_er = 0x2cef3171 1.4042, avg_er = 0.167, #>=1:0.5 = 4096:420916 % --- % new> rcacos:max_er = 0x4517ee94 2.1592, avg_er = 0.315, #>=1:0.5 = 4608:247032 % new> rcacosh:max_er = 0x3170e232 1.5450, avg_er = 0.250, #>=1:0.5 = 1392:3151892 % new> rcasin:max_er = 0x55adc0df 2.6775, avg_er = 0.208, #>=1:0.5 = 35072:362768 % new> rcasinh:max_er = 0x3170e232 1.5450, avg_er = 0.250, #>=1:0.5 = 1392:3151892 % new> rcatan:max_er = 0x3c4078ec 1.8829, avg_er = 0.287, #>=1:0.5 = 37132:302960 % new> rcatanh:max_er = 0x2cef3171 1.4042, avg_er = 0.166, #>=1:0.5 = 4096:357284 The number of incorrectly rounded cases went up most for catan. Fixing the constants actually reduced the errors slightly -- the ones that are now 1392:* (up from 1196:*) were over 30000:* in an intermediate version. % 16,21c16,21 % old< icacos:max_er = 0x3170e232 1.5450, avg_er = 0.253, #new>=1:0.5 = 1196:3225128 % old< icacosh:max_er = 0x4517ee94 2.1592, avg_er = 0.314, #new>=1:0.5 = 3786:214820 % old< icasin:max_er = 0x3170e232 1.5450, avg_er = 0.253, #new>=1:0.5 = 1196:3225128 % old< icasinh:max_er = 0x55adc0df 2.6775, avg_er = 0.208, #new>=1:0.5 = 34228:360900 % old< icatan:max_er = 0x2cef3171 1.4042, avg_er = 0.167, #new>=1:0.5 = 4096:420916 % old< icatanh:max_er = 0x3c4078ec 1.8829, avg_er = 0.284, #new>=1:0.5 = 13260:190836 % --- % new> icacos:max_er = 0x3170e232 1.5450, avg_er = 0.250, #>=1:0.5 = 1392:3151892 % new> icacosh:max_er = 0x4517ee94 2.1592, avg_er = 0.315, #>=1:0.5 = 4608:247032 % new> icasin:max_er = 0x3170e232 1.5450, avg_er = 0.250, #>=1:0.5 = 1392:3151892 % new> icasinh:max_er = 0x55adc0df 2.6775, avg_er = 0.208, #>=1:0.5 = 35072:362768 % new> icatan:max_er = 0x2cef3171 1.4042, avg_er = 0.166, #>=1:0.5 = 4096:357284 % new> icatanh:max_er = 0x3c4078ec 1.8829, avg_er = 0.287, #>=1:0.5 = 37132:302960 % % Double precision: % 31,36c31,36 % old< rcacos:max_er = 0x1b5a 3.4189, avg_er = 0.166, #new>=1:0.5 = 2394:118150 % old< rcacosh:max_er = 0xf8a 1.9424, avg_er = 0.258, #new>=1:0.5 = 8444:2737804 % old< rcasin:max_er = 0x15c5 2.7212, avg_er = 0.113, #new>=1:0.5 = 33308:99188 % old< rcasinh:max_er = 0xf8a 1.9424, avg_er = 0.258, #new>=1:0.5 = 8444:2737804 % old< rcatan:max_er = 0x1622 2.7666, avg_er = 0.016, #new>=1:0.5 = 4680:81364 % old< rcatanh:max_er = 0x14b9 2.5903, avg_er = 0.047, #new>=1:0.5 = 428996:691360 % --- % new> rcacos:max_er = 0x1b5a 3.4189, avg_er = 0.168, #>=1:0.5 = 2414:110360 % new> rcacosh:max_er = 0xf8a 1.9424, avg_er = 0.257, #>=1:0.5 = 24192:2683944 % new> rcasin:max_er = 0x15c5 2.7212, avg_er = 0.113, #>=1:0.5 = 33864:100284 % new> rcasinh:max_er = 0xf8a 1.9424, avg_er = 0.257, #>=1:0.5 = 24192:2683944 % new> rcatan:max_er = 0x1622 2.7666, avg_er = 0.015, #>=1:0.5 = 20580:101564 % new> rcatanh:max_er = 0x14b9 2.5903, avg_er = 0.044, #>=1:0.5 = 403028:652792 % 39,44c39,44 % old< icacos:max_er = 0xf8a 1.9424, avg_er = 0.258, #new>=1:0.5 = 8444:2737804 % old< icacosh:max_er = 0x1b5a 3.4189, avg_er = 0.166, #new>=1:0.5 = 2394:118150 % old< icasin:max_er = 0xf8a 1.9424, avg_er = 0.258, #new>=1:0.5 = 8444:2737804 % old< icasinh:max_er = 0x15c5 2.7212, avg_er = 0.113, #new>=1:0.5 = 33308:99188 % old< icatan:max_er = 0x14b9 2.5903, avg_er = 0.047, #new>=1:0.5 = 428996:691360 % old< icatanh:max_er = 0x1622 2.7666, avg_er = 0.016, #new>=1:0.5 = 4680:81364 % --- % new> icacos:max_er = 0xf8a 1.9424, avg_er = 0.257, #>=1:0.5 = 24192:2683944 % new> icacosh:max_er = 0x1b5a 3.4189, avg_er = 0.168, #>=1:0.5 = 2414:110360 % new> icasin:max_er = 0xf8a 1.9424, avg_er = 0.257, #>=1:0.5 = 24192:2683944 % new> icasinh:max_er = 0x15c5 2.7212, avg_er = 0.113, #>=1:0.5 = 33864:100284 % new> icatan:max_er = 0x14b9 2.5903, avg_er = 0.044, #>=1:0.5 = 403028:652792 % new> icatanh:max_er = 0x1622 2.7666, avg_er = 0.015, #>=1:0.5 = 20580:101564 Bruce From owner-freebsd-numerics@FreeBSD.ORG Wed Sep 12 23:31:14 2012 Return-Path: Delivered-To: freebsd-numerics@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 9D8AE1065674 for ; Wed, 12 Sep 2012 23:31:14 +0000 (UTC) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by mx1.freebsd.org (Postfix) with ESMTP id 543348FC1D for ; Wed, 12 Sep 2012 23:31:13 +0000 (UTC) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q8CNVCDW068140; Wed, 12 Sep 2012 18:31:12 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <50511B40.3070009@missouri.edu> Date: Wed, 12 Sep 2012 18:31:12 -0500 From: Stephen Montgomery-Smith User-Agent: Mozilla/5.0 (X11; Linux i686; rv:15.0) Gecko/20120827 Thunderbird/15.0 MIME-Version: 1.0 To: Bruce Evans References: <5017111E.6060003@missouri.edu> <501C361D.4010807@missouri.edu> <20120804165555.X1231@besplex.bde.org> <501D51D7.1020101@missouri.edu> <20120805030609.R3101@besplex.bde.org> <501D9C36.2040207@missouri.edu> <20120805175106.X3574@besplex.bde.org> <501EC015.3000808@missouri.edu> <20120805191954.GA50379@troutmask.apl.washington.edu> <20120807205725.GA10572@server.rulingia.com> <20120809025220.N4114@besplex.bde.org> <5027F07E.9060409@missouri.edu> <20120814003614.H3692@besplex.bde.org> <50295F5C.6010800@missouri.edu> <20120814072946.S5260@besplex.bde.org> <50297CA5.5010900@missouri.edu> <50297E43.7090309@missouri.edu> <20120814201105.T934@besplex.bde.org> <502A780B.2010106@missouri.edu> <20120815223631.N1751@besplex.bde.org> <502C0CF8.8040003@missouri.edu> <20120906221028.O1542@besplex.bde.org> <5048D00B.8010401@missouri.edu> <504D3CCD.2050006@missouri.edu> <504FF726.9060001@missouri.edu> <20120912191556.F1078@besplex.bde.org> <20120912225847.J1771@besplex.bde.org> In-Reply-To: <20120912225847.J1771@besplex.bde.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-numerics@FreeBSD.org Subject: Re: Complex arg-trig functions X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 12 Sep 2012 23:31:14 -0000 OK, I believe that I have implemented all your suggested changes - but with style changes that I prefer. Test it now, and see if it works. As for formatting problems caused by the perl scripts, I would anticipate that by the time it gets committed, that people would no longer use the perl scripts. But it is a realyl convenient way to make big changes to catrig.c, and then copy these same changes to catrigf.c and catrigl.c. As far as I can see, the only formatting mess ups are multiple double spaced blank lines. I think I have got rid of them now. From owner-freebsd-numerics@FreeBSD.ORG Thu Sep 13 12:55:10 2012 Return-Path: Delivered-To: freebsd-numerics@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id CF8D6106566B for ; Thu, 13 Sep 2012 12:55:10 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail02.syd.optusnet.com.au (mail02.syd.optusnet.com.au [211.29.132.183]) by mx1.freebsd.org (Postfix) with ESMTP id ECD738FC0A for ; Thu, 13 Sep 2012 12:55:09 +0000 (UTC) Received: from c122-106-171-246.carlnfd1.nsw.optusnet.com.au (c122-106-171-246.carlnfd1.nsw.optusnet.com.au [122.106.171.246]) by mail02.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id q8DCt04j013853 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Thu, 13 Sep 2012 22:55:02 +1000 Date: Thu, 13 Sep 2012 22:55:00 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Stephen Montgomery-Smith In-Reply-To: <50511B40.3070009@missouri.edu> Message-ID: <20120913204808.T1964@besplex.bde.org> References: <5017111E.6060003@missouri.edu> <501D51D7.1020101@missouri.edu> <20120805030609.R3101@besplex.bde.org> <501D9C36.2040207@missouri.edu> <20120805175106.X3574@besplex.bde.org> <501EC015.3000808@missouri.edu> <20120805191954.GA50379@troutmask.apl.washington.edu> <20120807205725.GA10572@server.rulingia.com> <20120809025220.N4114@besplex.bde.org> <5027F07E.9060409@missouri.edu> <20120814003614.H3692@besplex.bde.org> <50295F5C.6010800@missouri.edu> <20120814072946.S5260@besplex.bde.org> <50297CA5.5010900@missouri.edu> <50297E43.7090309@missouri.edu> <20120814201105.T934@besplex.bde.org> <502A780B.2010106@missouri.edu> <20120815223631.N1751@besplex.bde.org> <502C0CF8.8040003@missouri.edu> <20120906221028.O1542@besplex.bde.org> <5048D00B.8010401@missouri.edu> <504D3CCD.2050006@missouri.edu> <504FF726.9060001@missouri.edu> <20120912191556.F1078@besplex.bde.org> <20120912225847.J1771@besplex.bde.org> <50511B40.3070009@missouri.edu> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-numerics@FreeBSD.org Subject: Re: Complex arg-trig functions X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 13 Sep 2012 12:55:10 -0000 On Wed, 12 Sep 2012, Stephen Montgomery-Smith wrote: > OK, I believe that I have implemented all your suggested changes - but with > style changes that I prefer. > > Test it now, and see if it works. Yes, it mostly works now (most of my last round of NaN fixes were consistently wrong, since I applied the spec for cacosh() to cacos()...). > As for formatting problems caused by the perl scripts, I would anticipate > that by the time it gets committed, that people would no longer use the perl > scripts. But it is a realyl convenient way to make big changes to catrig.c, > and then copy these same changes to catrigf.c and catrigl.c. Yes, but I find them not so convenient for small changes. I didn't actually use them, but edited all versions manually, with somehwhat different changes in each. > As far as I can see, the only formatting mess ups are multiple double spaced > blank lines. I think I have got rid of them now. There are still single double spaced blank lines. The following set of changes starts fixing long doubles for i386, and has 1 fix and many cosmetic changes related to the previous round of fixes. % diff -u2 catrig.c~ catrig.c % --- catrig.c~ 2012-09-13 01:34:25.000000000 +0000 % +++ catrig.c 2012-09-13 10:40:46.760362000 +0000 % @@ -30,19 +30,18 @@ % (bx).parts.lsw) == 0) % % -static const volatile double % -tiny = 0x1p-2000; % - This was broken for doubles (it underflowed to 0). % /* We need that DBL_EPSILON^2 is larger than FOUR_SQRT_MIN. */ % static const double % +A_crossover = 10, /* Hull et al suggest 1.5, but 10 works better */ % +B_crossover = 0.6417, /* suggested by Hull et al */ Sort the declarations. Don't use type suffixes if possible (for 0.6417 here). % FOUR_SQRT_MIN = 0x1p-500, /* less than 4 * sqrt(DBL_MIN) */ % FOURTH_SQRT_MAX = 0x1p500, /* 1 / FOUR_SQRT_MIN */ Oops, this patch is missing my change of these to match the comments. 0x1p-500 is very far from matching the comment which says that FOUR_SQRT_MIN is _less_ than 4 * sqrt(DBL_MIN). DBL_MIN_EXP is -1021, so DBL_MIN is about 2**-1022 and 4 * sqrt(DBL_MIN) is about 4 * 2**-511 = 2**-509. Your FOUR_SQRT_MIN is about 512 times larger than that. Changing the 500's to 509's made no difference in my test, but the corresponding change for float precision (restore my 61's from your 60's) gave a small improvevement. % half = 0.5, I thought that we were going to change all the multiplications by 0.5 by divisions by 2. Then we would not need this. % -quarter = 0.25, Similarly for this. Similarly for inverses. You already use lots of 1/DBL_EPSILONs, and this is efficient soince it can be evaluated at compile time even by non-broken compilers, since DBL_EPSILON is a power of 2. Now that FOUR_SQRT_MIN is a power of 2, its inverse FOURTH_SQRT_MAX can also be spelled as an inverse. Except it is not actually an inverse like its comment claims. DBL_MAX is not actually the inverse of DBL_MIN; it is just that within one or 2 powers of 2 in supported formats, and maybe we should worry about other formats in which it is more different -- the different name for FOURTH_SQRT_MAX supports that, and the comment shouldn't say that it is the inverse of FOUR_SQRT_MIN. % -A_crossover = 10, /* Hull et al suggest 1.5, but 10 works better. */ % -B_crossover = 0.6417L, /* suggested by Hull et al. */ % -m_pi_2 = 1.5707963267948966192313216916397514420985846996876L, % -m_e = 2.7182818284590452353602874713526624977572470937000L, % -m_ln2 = 0.69314718055994530941723212145817656807550013436026L; % +m_e = 2.7182818284590452e0, /* 0x15bf0a8b145769.0p-51 */ % +m_ln2 = 6.9314718055994531e-1, /* 0x162e42fefa39ef.0p-53 */ % +m_pi_2 = 1.5707963267948966e0, /* 0x1921fb54442d18.0p-52 */ I prefer my style for the declarations (except possibly for the whitespace around '='. I have formatting routines which print the constants in decimal and hex using the correct number of digits and formats everything including initializers and comments fairly well. Automatic generation is essential for larger tables, but I don't have it set up for these particular constants, and just manually formatted the above after printing just the constants using my pari formatting routines. % +quarter = 0.25; Sort this. % % +static const volatile double % +tiny = 0x1p-1000; Sort this. % % #include % @@ -280,8 +279,8 @@ % % if (ISNAN(x) || ISNAN(y)) { % - /* casinh(inf + I*NaN) = inf + I*NaN */ % + /* casinh(+-Inf + I*NaN) = +-Inf + I*NaN */ I don't like spelling "infinity" as "inf", and started fixing this. I prefer my comments giving details on the signs of the Infs. The sign rules follow from odd/even and/or conjugacy rules, but I find these confusing in the special cases for Infs and NaNs. I refrained from changing this to give the details of the transformation of the NaN are. Here they are that the result NaN is the default result of operating on the (single) arg NaN. In s_ccosh.c, the result NaN for this is written as d(NaN), where NaN is implicitly the arg NaN. % if (ISINF(bx)) % return (cpack(x, y+y)); % - /* casinh(NaN + I*inf) = inf + I*NaN */ % + /* casinh(NaN + I*+-Inf) = +-Inf + I*NaN (sign optional) */ % if (ISINF(by)) % return (cpack(y, x+x)); Now the handling of the sign of the infinity is subtly different. The comment says that we can choose any sign, but not the choice. The code chooses the same sign as the arg. Now I don't see why C99 allows either sign. C99 also requires casinh() to be odd, so the sign should toggle with the sign of y -- although the oddness rule doesn't applys to NaNs, this case specifically ignores the whole NaN including its sign when creating the infinity for the real part. % @@ -360,10 +359,10 @@ % % if (ISNAN(x) || ISNAN(y)) { % - /* cacos(inf + I*NaN) = NaN + I*inf */ % - /* cacos(NaN + I*inf) = NaN + I*inf */ % + /* cacos(+-Inf + I*NaN) = NaN + I*-+Inf (sign optional) */ % if (ISINF(bx)) % - return (cpack(x+y, -x)); % + return (cpackf(y+y, -INFINITY)); Use the logically correct expression for the NaN part of the result, like we do elsewhere. Now the handling of the sign of the infinity is not so subtly different. Now there is only a congugacy rule, not an oddness rule. We are in the case of a NaN y. Conjugating a NaN doesn't really change it (it will in fact change its sign depending on whether it is implemented as -y or 0-y). So always return infinity with a fixed sign (-Inf). If it is to depend on the sign of an arg, then it should be on the sign of the NaN y, since for non-NaN y, the sign of x has no effect on the sign of the resulting imaginary part. % + /* cacos(NaN + I*+-Inf) = NaN + I*-+Inf */ % if (ISINF(by)) % - return (cpack(x+y, -y)); % + return (cpackf(x+x, -y)); Use the logically correct expression for the NaN part of the result, as above. Comment changes as above. % /* cacos(0 + I*NaN) = PI/2 + I*NaN */ % if (x == 0) return (cpack(m_pi_2, y+y)); % @@ -416,11 +415,20 @@ % cacosh(double complex z) % { % - double complex w = cacos(z); % - double rx = creal(w); % - double ry = cimag(w); % + double complex w; % + double rx; % + double ry; % + % + w = cacos(z); % + rx = creal(w); % + ry = cimag(w); % + Start fixing the largest style bugs. % + /* cacosh(NaN + I*NaN) = NaN + I*NaN */ % if (ISNAN(rx) && ISNAN(ry)) % return (cpack(ry, rx)); % + /* cacosh(NaN + I*+-Inf) = +Inf + I*NaN */ % + /* cacosh(+-Inf + I*NaN) = +Inf + I*NaN */ % if (ISNAN(rx)) % return (cpack(fabs(ry), rx)); % + /* cacosh(0 + I*NaN) = PI/2 + I*NaN */ % if (ISNAN(ry)) % return (cpack(ry, copysign(rx, cimag(z)))); The sign rules are especially confusing here, since we look at NaNs in the result. I forgot to expand these comments further, by saying exactly which combinations of NaNs with finite rx and ry occur. I would prefer to classify all the special cases from the args. % @@ -429,6 +437,5 @@ % % /* % - * This clog is a stop-gap function. It is designed to work well only if % - * |z| is larger than 1/DBL_EPSILON. % + * Optimized version of clog() for |z| finite and larger than ~1/DBL_EPSILON. % */ % static double complex Start cleaning this up. Mainly just change its comments to match its code for now. % @@ -440,5 +447,4 @@ % x = creal(z); % y = cimag(z); % - % ax = fabs(x); % ay = fabs(y); % @@ -450,9 +456,10 @@ % % /* % - * To avoid unnecessary overflow, if x or y are very large, divide x % + * Avoid overflow in hypot() when x and y are both very large. % + * Divide x % * and y by E, and then add 1 to the logarithm. This depends on % * E being larger than sqrt(2). % - * There is a potential loss of accuracy caused by dividing by E, % - * but this case should happen extremely rarely. % + * Dividing by E causes an insignificant loss of accuracy; however % + * this method is still poor since it is uneccessarily slow. % */ % if (ax > half*DBL_MAX) Multiplying by 1/E would be much better, but I want to get rid of the m_e constant. I tried dividing by 2 and adding m_ln2, but this lost accuracy unnecessarily. A more complete conversion to the methods in clog() is needed. % @@ -460,8 +467,8 @@ % % /* % - * Because atan2 and hypot conform to C99, this also covers all the % - * edge cases when x or y are 0 or infinite. % + * Avoid overflow when x or y is large. Avoid underflow when x or % + * y is small. % */ % - if (ax < FOUR_SQRT_MIN || ay < FOUR_SQRT_MIN || ax > FOURTH_SQRT_MAX || ay > FOURTH_SQRT_MAX) % + if (ax > FOURTH_SQRT_MAX || ay < FOUR_SQRT_MIN) % return (cpack(log(hypot(x, y)), atan2(y, x))); % Minor optimization. % @@ -480,5 +487,5 @@ % % /* % - * sum_squares(x,y) = x*x + y*y. % + * sum_squares(x,y) = x*x + y*y (or just x*x if y*y would underflow). % * Assumes x*x and y*y will not overflow. % * Assumes x and y are finite. % @@ -488,12 +495,9 @@ % sum_squares(double x, double y) % { % - /* % - * The following code seems to do better than % - * return (hypot(x, y) * hypot(x, y)); % - */ % - Of course we can do better than using hypot(). % + /* Avoid underflow when y is small. */ % if (fabs(y) < FOUR_SQRT_MIN) The correct cutoff for this is now (more) obviously sqrt(DBL_MIN). Similarly in clog_for_large_values(). The sloppy cutoffs are just minor pessimizations. % if ((int)y==0) /* raise inexact */ % return (x*x); % + % return (x*x + y*y); % } % diff -u2 catrigf.c~ catrigf.c % --- catrigf.c~ 2012-09-13 01:34:33.000000000 +0000 % +++ catrigf.c 2012-09-13 10:30:41.284444000 +0000 % @@ -35,17 +35,17 @@ % */ % % -static const volatile float % -tiny = 0x1p-100; % - % static const float % -FOUR_SQRT_MIN = 0x1p-60, % -FOURTH_SQRT_MAX = 0x1p60, % -half = 0.5, % -quarter = 0.25, % A_crossover = 10, % -B_crossover = 0.6417L, % -m_pi_2 = 1.5707963267948966192313216916397514420985846996876L, % -m_e = 2.7182818284590452353602874713526624977572470937000L, % -m_ln2 = 0.69314718055994530941723212145817656807550013436026L; % +B_crossover = 0.6417, % +FOUR_SQRT_MIN = 0x1p-61, % +FOURTH_SQRT_MAX = 0x1p61, % +half = 0.5, % +m_e = 2.7182818285e0, /* 0xadf854.0p-22 */ % +m_ln2 = 6.9314718056e-1, /* 0xb17218.0p-24 */ % +m_pi_2 = 1.5707963268e0, /* 0xc90fdb.0p-23 */ % +quarter = 0.25; % + % +static const volatile float % +tiny = 0x1p-100; % % #include % @@ -196,7 +196,7 @@ % % if (isinf(x)) % - return (cpackf(x+y, -x)); % + return (cpackf(y+y, -INFINITY)); % if (isinf(y)) % - return (cpackf(x+y, -y)); % + return (cpackf(x+x, -y)); % % if (x == 0) return (cpackf(m_pi_2, y+y)); % @@ -240,7 +240,11 @@ % cacoshf(float complex z) % { % - float complex w = cacosf(z); % - float rx = crealf(w); % - float ry = cimagf(w); % + float complex w; % + float rx, ry; % + % + w = cacosf(z); % + rx = crealf(w); % + ry = cimagf(w); % + % if (isnan(rx) && isnan(ry)) % return (cpackf(ry, rx)); % @@ -260,5 +264,4 @@ % x = crealf(z); % y = cimagf(z); % - % ax = fabsf(x); % ay = fabsf(y); Changes as in the double version. I didn't do them all here. % diff -u2 catrigl.c~ catrigl.c % --- catrigl.c~ 2012-09-13 01:34:30.000000000 +0000 % +++ catrigl.c 2012-09-13 09:54:32.886453000 +0000 % @@ -35,22 +35,38 @@ % */ % % -static const volatile long double % -tiny = 0x1p-16000L; % +#include % +#include % + % +#include "fpmath.h" % +#include "math.h" % +#include "math_private.h" The #includes have to be first so that LD80C() can be used in the constant table. We could also use M_E, etc., in the constant table for other precisions after moving the table after the includes. % % static const long double % -FOUR_SQRT_MIN = 0x1p-8000L, % -FOURTH_SQRT_MAX = 0x1p8000L, Change these to match the comment. Made no obvious difference. % -half = 0.5, % -quarter = 0.25, % A_crossover = 10, % -B_crossover = 0.6417L, Sort things. Some should be named differently so that they sort better or aren't so verbose. E.g., the m_ prefixes are vestiges of the M_ prefixes, which were for math.h but are not needed here and make some of the names more prefixed than others. fdlibm mostly spells Pi/2 as pio2 or pi_o_2, but we spell it as m_pi_2 here. % -m_pi_2 = 1.5707963267948966192313216916397514420985846996876L, % -m_e = 2.7182818284590452353602874713526624977572470937000L, % -m_ln2 = 0.69314718055994530941723212145817656807550013436026L; % +B_crossover = 0.6417, % +FOUR_SQRT_MIN = 0x1p-8189L, % +FOURTH_SQRT_MAX = 0x1p8189L, "FOURTH SQUARE [ROOT]" is confusing (it sounds like a 4th or 8th root, or maybe a 4th power of a squre root). Do you want to keep this name matching a name in the paper? If not, "quarter_sqrt_max" might be better. Or use the pio2 naming scheme and name it sqrt_max_o4. "FOUR" is a verbose spelling of "4", and is used mainly since identifiers can't start with "4". % +half = 0.5, % +quarter = 0.25; % % -#include % -#include % -#include % -#include "math_private.h" % +#if LDBL_MANT_DIG == 64 % +static const union IEEEl2bits % +um_e = LD80C(0xadf85458a2bb4a9b, 1, 0, 2.71828182845904523536e0L), % +um_ln2 = LD80C(0xb17217f7d1cf79ac, -1, 0, 6.93147180559945309417e-1L), % +um_pi_2 = LD80C(0xc90fdaa22168c235, 0, 0, 1.57079632679489661923e0L); % +#define m_e um_e.e % +#define m_ln2 um_ln2.e % +#define m_pi_2 um_pi_2.e % +#elif LDBL_MANT_DIG == 113 % +static const long double % +m_e = 2.71828182845904523536028747135266250e0L, /* 0x15bf0a8b1457695355fb8ac404e7a.0p-111 */ % +m_ln2 = 6.93147180559945309417232121458176568e-1L, /* 0x162e42fefa39ef35793c7673007e6.0p-113 */ % +m_pi_2 = 1.57079632679489661923132169163975144e0L; /* 0x1921fb54442d18469898cc51701b8.0p-112 */ % +#else % +#error "Unsupported long double format" % +#endif Long double constants are painful to declare correctly. First, they can be any length. Second, on i386 gcc truncates them to double precision (but long double exponent range). The LD80C macro hides most of the details of working around the latter. After that, it gives only minor messes to ifdef declare the hex values in comments and to ifdef for the unsupported case. % + % +static const volatile long double % +tiny = 0x1p-10000L; % % static long double complex clog_for_large_values(long double complex z); % @@ -196,7 +212,7 @@ % % if (isinf(x)) % - return (cpackl(x+y, -x)); % + return (cpackl(y+y, -INFINITY)); % if (isinf(y)) % - return (cpackl(x+y, -y)); % + return (cpackl(x+x, -y)); % % if (x == 0) return (cpackl(m_pi_2, y+y)); % @@ -240,7 +256,12 @@ % cacoshl(long double complex z) % { % - long double complex w = cacosl(z); % - long double rx = creall(w); % - long double ry = cimagl(w); % + long double complex w; % + long double rx; % + long double ry; % + % + w = cacosl(z); % + rx = creall(w); % + ry = cimagl(w); % + % if (isnan(rx) && isnan(ry)) % return (cpackl(ry, rx)); Other changes as in the double version. I didn't do them all here. Error table for complex functions, with more tests than before (takes a day or two to run on 1 CPU): % amd64 float prec, on 2**16 x 2**16 args: % rcacos:max_er = 0x70f82558 3.5303, avg_er = 0.327, #>=1:0.5 = 12113072:86215320 % rcacosh:max_er = 0x642421ce 3.1294, avg_er = 0.262, #>=1:0.5 = 4492356:851070980 % rcasin:max_er = 0x6aaaeb38 3.3334, avg_er = 0.225, #>=1:0.5 = 13545196:105163176 % rcasinh:max_er = 0x642421ce 3.1294, avg_er = 0.262, #>=1:0.5 = 4492356:851070980 % rcatan:max_er = 0x65573b1a 3.1669, avg_er = 0.291, #>=1:0.5 = 11857524:92253376 % rcatanh:max_er = 0x5994bb70 2.7994, avg_er = 0.188, #>=1:0.5 = 42972472:357234756 % rclog: max_er = 0x305820c5 1.5108, avg_er = 0.247, #>=1:0.5 = 106660:32521284 % rcsqrt:max_er = 0x1fffffd0 1.0000, avg_er = 0.239, #>=1:0.5 = 0:27082 % rccos: max_er = 0x55b98673 2.6789, avg_er = 0.090, #>=1:0.5 = 15162940:82894084 % rccosh:max_er = 0x55b98673 2.6789, avg_er = 0.090, #>=1:0.5 = 15162940:82894084 % rcexp: max_er = 0x1658ca3e2ce252 11716177.9430, avg_er = 0.197, #>=1:0.5 = 15750254:115895992 Everything except this works fairly well. cexp() passed tests on only 2**12 x 2**12 args. % rcsin: max_er = 0x56838440 2.7036, avg_er = 0.095, #>=1:0.5 = 21624908:122655056 % rcsinh:max_er = 0x52a0d11e 2.5821, avg_er = 0.105, #>=1:0.5 = 14563920:304694328 % rctan: max_er = 0xc1c16ec2 6.0549, avg_er = 0.105, #>=1:0.5 = 38983948:270561304 % rctanh:max_er = 0xc89c698a 6.2691, avg_er = 0.165, #>=1:0.5 = 176171768:584759008 ctan() isn't too good. Its max error expanded from ~5 ulps to ~6 ulps with more tests. % icacos:max_er = 0x642421ce 3.1294, avg_er = 0.262, #>=1:0.5 = 4492356:851070980 % icacosh:max_er = 0x70f82558 3.5303, avg_er = 0.327, #>=1:0.5 = 12113072:86215320 % icasin:max_er = 0x642421ce 3.1294, avg_er = 0.262, #>=1:0.5 = 4492356:851070980 % icasinh:max_er = 0x6aaaeb38 3.3334, avg_er = 0.225, #>=1:0.5 = 13545196:105163176 % icatan:max_er = 0x5994bb70 2.7994, avg_er = 0.188, #>=1:0.5 = 42972472:357234756 % icatanh:max_er = 0x65573b1a 3.1669, avg_er = 0.291, #>=1:0.5 = 11857524:92253376 % iclog: max_er = 0x2fbaebea 1.4916, avg_er = 0.313, #>=1:0.5 = 103490:88996762 % icsqrt:max_er = 0x1fffffd0 1.0000, avg_er = 0.239, #>=1:0.5 = 0:27082 % iccos: max_er = 0x524fa191 2.5722, avg_er = 0.144, #>=1:0.5 = 18085136:346402008 % iccosh:max_er = 0x524fa191 2.5722, avg_er = 0.144, #>=1:0.5 = 18085136:346402008 % icsin: max_er = 0x52a0d11e 2.5821, avg_er = 0.105, #>=1:0.5 = 14563920:304694328 % icsinh:max_er = 0x56838440 2.7036, avg_er = 0.095, #>=1:0.5 = 21624908:122655056 % ictan: max_er = 0x9c7f786c 4.8906, avg_er = 0.159, #>=1:0.5 = 635992:2189748 % ictanh:max_er = 0x9708bb4b 4.7198, avg_er = 0.104, #>=1:0.5 = 177976:980356 % % amd64 double prec, on 2**16 x 2**16 args: % not done yet % % i386 float prec, on 2**16 x 2**16 args: % rcacos:max_er = 0x50899c13 2.5168, avg_er = 0.323, #>=1:0.5 = 1109224:63935432 % rcacosh:max_er = 0x38676cc7 1.7626, avg_er = 0.256, #>=1:0.5 = 888252:812489596 % rcasin:max_er = 0x5ad0db95 2.8380, avg_er = 0.223, #>=1:0.5 = 8301060:96238644 % rcasinh:max_er = 0x38676cc7 1.7626, avg_er = 0.256, #>=1:0.5 = 888252:812489596 % rcatan:max_er = 0x3dfcde4c 1.9371, avg_er = 0.287, #>=1:0.5 = 7573764:74078480 % rcatanh:max_er = 0x2f90804a 1.4864, avg_er = 0.155, #>=1:0.5 = 493896:87730140 % rclog: max_er = 0x2f626046 1.4808, avg_er = 0.247, #>=1:0.5 = 69520:7720848 % rcsqrt:max_er = 0x1fffffd0 1.0000, avg_er = 0.239, #>=1:0.5 = 0:27082 % rccos: max_er = 0x56fb2dd4 2.7182, avg_er = 0.089, #>=1:0.5 = 14830032:79662628 % rccosh:max_er = 0x56fb2dd4 2.7182, avg_er = 0.089, #>=1:0.5 = 14830032:79662628 % rcexp: max_er = 0x1658ca3e2ce252 11716177.9430, avg_er = 0.196, #>=1:0.5 = 15191710:107310178 % rcsin: max_er = 0x560146c2 2.6877, avg_er = 0.095, #>=1:0.5 = 21025048:121830712 % rcsinh:max_er = 0x52a0d11e 2.5821, avg_er = 0.105, #>=1:0.5 = 14018592:299616036 % rctan: max_er = 0xa0945cb2 5.0181, avg_er = 0.095, #>=1:0.5 = 18625048:215669808 % rctanh:max_er = 0x86db180b 4.2142, avg_er = 0.142, #>=1:0.5 = 112980788:493723688 % icacos:max_er = 0x38676cc7 1.7626, avg_er = 0.256, #>=1:0.5 = 888252:812489596 % icacosh:max_er = 0x50899c13 2.5168, avg_er = 0.323, #>=1:0.5 = 1109224:63935432 % not redone recently past here: % icasin:max_er = 0x40a05dd0 2.0196, avg_er = 0.261, #>=1:0.5 = 5281044:852735716 % icasinh:max_er = 0x67e2ed1b 3.2465, avg_er = 0.224, #>=1:0.5 = 8821392:96082548 % icatan:max_er = 0x2f1b30d9 1.4721, avg_er = 0.150, #>=1:0.5 = 473412:98823608 % icatanh:max_er = 0x3e698bd3 1.9504, avg_er = 0.284, #>=1:0.5 = 2928868:53999712 % iclog: max_er = 0x2f3ab89e 1.4759, avg_er = 0.314, #>=1:0.5 = 85208:86449406 % icsqrt:max_er = 0xfffffd1 0.5000, avg_er = 0.243, #>=1:0.5 = 0:0 % iccos: max_er = 0xb8a89cc9da6f8ecf 24784463438.8261, avg_er = 4.375, #>=1:0.5 = 985914352:1145676100 % iccosh:max_er = 0xb8a89cc9da6f8ecf 24784463438.8261, avg_er = 4.375, #>=1:0.5 = 985914352:1145676100 % icexp: max_er = 0xb8a3a12532b83c66 24781850921.5850, avg_er = 3.275, #>=1:0.5 = 989776068:1097527556 % icsin: max_er = 0xb89efbb1b6eff235 24779414925.7168, avg_er = 0.772, #>=1:0.5 = 984470420:1108121412 % icsinh:max_er = 0xb8a89cc9da6f8ecf 24784463438.8261, avg_er = 3.803, #>=1:0.5 = 990393948:1068699356 % ictan: max_er = 0x1a19443fc048dbb 218931743.8756, avg_er = 4.826, #>=1:0.5 = 748442224:916580252 % ictanh:max_er = 0xb68ffe828cb533f6 24503120916.3971, avg_er = 5.660, #>=1:0.5 = 1002055892:1086633900 All the old tests use the -current, broken i387 cos, sin and tan. I had to kill these to get tests to pass, and did this earlier in most previously reported results. % % i386 double prec, on 2**16 x 2**16 args: % rcacos:max_er = 0x12df 2.3589, avg_er = 0.192, #>=1:0.5 = 170980:25565816 % rcacosh:max_er = 0xd29 1.6450, avg_er = 0.257, #>=1:0.5 = 35456:693380268 % rcasin:max_er = 0x1751 2.9146, avg_er = 0.166, #>=1:0.5 = 1964988:25241352 % rcasinh:max_er = 0xd29 1.6450, avg_er = 0.257, #>=1:0.5 = 35456:693380268 % rcatan:max_er = 0xd8b 1.6929, avg_er = 0.015, #>=1:0.5 = 1105980:15127868 % rcatanh:max_er = 0xb56 1.4170, avg_er = 0.118, #>=1:0.5 = 65700:22533940 % rclog: max_er = 0xa52 1.2900, avg_er = 0.250, #>=1:0.5 = 776:6095880 % rcsqrt:max_er = 0x7ff 0.9995, avg_er = 0.250, #>=1:0.5 = 18:322256 % icacos:max_er = 0xd29 1.6450, avg_er = 0.257, #>=1:0.5 = 35456:693380268 % icacosh:max_er = 0x12df 2.3589, avg_er = 0.192, #>=1:0.5 = 170980:25565816 % icasin:max_er = 0xd29 1.6450, avg_er = 0.257, #>=1:0.5 = 35456:693380268 % icasinh:max_er = 0x1751 2.9146, avg_er = 0.166, #>=1:0.5 = 1964988:25241352 % icatan:max_er = 0xb56 1.4170, avg_er = 0.118, #>=1:0.5 = 65700:22533940 % icatanh:max_er = 0xd8b 1.6929, avg_er = 0.015, #>=1:0.5 = 1105980:15127868 % iclog: max_er = 0x981 1.1880, avg_er = 0.251, #>=1:0.5 = 28630:39211728 % icsqrt:max_er = 0x400 0.5000, avg_er = 0.250, #>=1:0.5 = 0:304172 Test coverage is not really adequate to find results in double precision -- it tests little more than values of the form 2**m + I * 2**n for integer values m and n. But it tests all of these values, so is almost sure to find problems near 0 and infinity, and with NaNs. Bruce From owner-freebsd-numerics@FreeBSD.ORG Thu Sep 13 13:13:04 2012 Return-Path: Delivered-To: freebsd-numerics@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id D86EA1065673 for ; Thu, 13 Sep 2012 13:13:04 +0000 (UTC) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by mx1.freebsd.org (Postfix) with ESMTP id 91E7A8FC27 for ; Thu, 13 Sep 2012 13:13:04 +0000 (UTC) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q8DDCvUa021500; Thu, 13 Sep 2012 08:12:57 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <5051DBD9.9080200@missouri.edu> Date: Thu, 13 Sep 2012 08:12:57 -0500 From: Stephen Montgomery-Smith User-Agent: Mozilla/5.0 (X11; Linux i686; rv:15.0) Gecko/20120827 Thunderbird/15.0 MIME-Version: 1.0 To: Bruce Evans References: <5017111E.6060003@missouri.edu> <501D51D7.1020101@missouri.edu> <20120805030609.R3101@besplex.bde.org> <501D9C36.2040207@missouri.edu> <20120805175106.X3574@besplex.bde.org> <501EC015.3000808@missouri.edu> <20120805191954.GA50379@troutmask.apl.washington.edu> <20120807205725.GA10572@server.rulingia.com> <20120809025220.N4114@besplex.bde.org> <5027F07E.9060409@missouri.edu> <20120814003614.H3692@besplex.bde.org> <50295F5C.6010800@missouri.edu> <20120814072946.S5260@besplex.bde.org> <50297CA5.5010900@missouri.edu> <50297E43.7090309@missouri.edu> <20120814201105.T934@besplex.bde.org> <502A780B.2010106@missouri.edu> <20120815223631.N1751@besplex.bde.org> <502C0CF8.8040003@missouri.edu> <20120906221028.O1542@besplex.bde.org> <5048D00B.8010401@missouri.edu> <504D3CCD.2050006@missouri.edu> <504FF726.9060001@missouri.edu> <20120912191556.F1078@besplex.bde.org> <20120912225847.J1771@besplex.bde.org> <50511B40.3070009@missouri.edu> <20120913204808.T1964@besplex.bde.org> In-Reply-To: <20120913204808.T1964@besplex.bde.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-numerics@FreeBSD.org Subject: Re: Complex arg-trig functions X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 13 Sep 2012 13:13:05 -0000 I'll look over these changes. But let me say one thing. When the code is committed, I am intending that clog_for_large_values will be replaced by clog (or clogf or clogl), which will be your code. So I am not going to worry too much about fixing up clog_for_large_values. But I'll look at all the other comments soon. From owner-freebsd-numerics@FreeBSD.ORG Thu Sep 13 14:01:05 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id AEC10106564A for ; Thu, 13 Sep 2012 14:01:05 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail18.syd.optusnet.com.au (mail18.syd.optusnet.com.au [211.29.132.199]) by mx1.freebsd.org (Postfix) with ESMTP id 3D4FF8FC18 for ; Thu, 13 Sep 2012 14:01:04 +0000 (UTC) Received: from c122-106-171-246.carlnfd1.nsw.optusnet.com.au (c122-106-171-246.carlnfd1.nsw.optusnet.com.au [122.106.171.246]) by mail18.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id q8DE0tGw019094 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Fri, 14 Sep 2012 00:00:57 +1000 Date: Fri, 14 Sep 2012 00:00:55 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Stephen Montgomery-Smith In-Reply-To: <5051DBD9.9080200@missouri.edu> Message-ID: <20120913232218.E2426@besplex.bde.org> References: <5017111E.6060003@missouri.edu> <501D9C36.2040207@missouri.edu> <20120805175106.X3574@besplex.bde.org> <501EC015.3000808@missouri.edu> <20120805191954.GA50379@troutmask.apl.washington.edu> <20120807205725.GA10572@server.rulingia.com> <20120809025220.N4114@besplex.bde.org> <5027F07E.9060409@missouri.edu> <20120814003614.H3692@besplex.bde.org> <50295F5C.6010800@missouri.edu> <20120814072946.S5260@besplex.bde.org> <50297CA5.5010900@missouri.edu> <50297E43.7090309@missouri.edu> <20120814201105.T934@besplex.bde.org> <502A780B.2010106@missouri.edu> <20120815223631.N1751@besplex.bde.org> <502C0CF8.8040003@missouri.edu> <20120906221028.O1542@besplex.bde.org> <5048D00B.8010401@missouri.edu> <504D3CCD.2050006@missouri.edu> <504FF726.9060001@missouri.edu> <20120912191556.F1078@besplex.bde.org> <20120912225847.J1771@besplex.bde.org> <50511B40.3070009@missouri.edu> <20120913204808.T1964@besplex.bde.org> <5051DBD9.9080200@missouri.edu> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-numerics@freebsd.org Subject: Re: Complex arg-trig functions X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 13 Sep 2012 14:01:05 -0000 On Thu, 13 Sep 2012, Stephen Montgomery-Smith wrote: > I'll look over these changes. But let me say one thing. When the code is > committed, I am intending that clog_for_large_values will be replaced by clog > (or clogf or clogl), which will be your code. So I am not going to worry too > much about fixing up clog_for_large_values. I think it should be remain separate. It only needs very special subcases of clog(), and has already these cases, so it can be about twice as fast for the log(|z|) part by not repeating things and by taking advantage of the specialness. The atan2() part can't be simplified much. For the log(|z|) part, the best method is approximately to scale x and y down to near 1 and then just use x*x + y*y. I tried this in clog() too, but it didn't work so well there because it only handles 1 subcase and it gets in the way of other, more interesting cases. The scaling step may be too messy for clog_for_large_values() too. If so, I would copy the relevant part of the current algorithm from clog() to here. This would be slightly smaller than the current clog_for_large_values(). Bruce From owner-freebsd-numerics@FreeBSD.ORG Thu Sep 13 15:02:57 2012 Return-Path: Delivered-To: freebsd-numerics@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 61E49106566B for ; Thu, 13 Sep 2012 15:02:57 +0000 (UTC) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by mx1.freebsd.org (Postfix) with ESMTP id 1D1458FC14 for ; Thu, 13 Sep 2012 15:02:56 +0000 (UTC) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q8DF2qji029120; Thu, 13 Sep 2012 10:02:54 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <5051F59C.6000603@missouri.edu> Date: Thu, 13 Sep 2012 10:02:52 -0500 From: Stephen Montgomery-Smith User-Agent: Mozilla/5.0 (X11; Linux i686; rv:15.0) Gecko/20120827 Thunderbird/15.0 MIME-Version: 1.0 To: Bruce Evans References: <5017111E.6060003@missouri.edu> <501D51D7.1020101@missouri.edu> <20120805030609.R3101@besplex.bde.org> <501D9C36.2040207@missouri.edu> <20120805175106.X3574@besplex.bde.org> <501EC015.3000808@missouri.edu> <20120805191954.GA50379@troutmask.apl.washington.edu> <20120807205725.GA10572@server.rulingia.com> <20120809025220.N4114@besplex.bde.org> <5027F07E.9060409@missouri.edu> <20120814003614.H3692@besplex.bde.org> <50295F5C.6010800@missouri.edu> <20120814072946.S5260@besplex.bde.org> <50297CA5.5010900@missouri.edu> <50297E43.7090309@missouri.edu> <20120814201105.T934@besplex.bde.org> <502A780B.2010106@missouri.edu> <20120815223631.N1751@besplex.bde.org> <502C0CF8.8040003@missouri.edu> <20120906221028.O1542@besplex.bde.org> <5048D00B.8010401@missouri.edu> <504D3CCD.2050006@missouri.edu> <504FF726.9060001@missouri.edu> <20120912191556.F1078@besplex.bde.org> <20120912225847.J1771@besplex.bde.org> <50511B40.3070009@missouri.edu> <20120913204808.T1964@besplex.bde.org> In-Reply-To: <20120913204808.T1964@besplex.bde.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-numerics@FreeBSD.org Subject: Re: Complex arg-trig functions X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 13 Sep 2012 15:02:57 -0000 On 09/13/2012 07:55 AM, Bruce Evans wrote: > % - /* casinh(NaN + I*inf) = inf + I*NaN */ > % + /* casinh(NaN + I*+-Inf) = +-Inf + I*NaN (sign optional) */ How much do you want the "sign optional" put into the comments. This will have to be done in lots of places. One more thing - I made a mistake in the comments for FOUR_SQRT_MIN. It should read: FOUR_SQRT_MIN = 0x1p-509, /* greater than 4 * sqrt(DBL_MIN) */ that is, "greater" not "less". From owner-freebsd-numerics@FreeBSD.ORG Thu Sep 13 16:41:18 2012 Return-Path: Delivered-To: freebsd-numerics@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 5377D1065698 for ; Thu, 13 Sep 2012 16:41:18 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail04.syd.optusnet.com.au (mail04.syd.optusnet.com.au [211.29.132.185]) by mx1.freebsd.org (Postfix) with ESMTP id D59778FC08 for ; Thu, 13 Sep 2012 16:41:17 +0000 (UTC) Received: from c122-106-171-246.carlnfd1.nsw.optusnet.com.au (c122-106-171-246.carlnfd1.nsw.optusnet.com.au [122.106.171.246]) by mail04.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id q8DGfEIh026374 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Fri, 14 Sep 2012 02:41:16 +1000 Date: Fri, 14 Sep 2012 02:41:14 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Stephen Montgomery-Smith In-Reply-To: <5051F59C.6000603@missouri.edu> Message-ID: <20120914014208.I2862@besplex.bde.org> References: <5017111E.6060003@missouri.edu> <501D9C36.2040207@missouri.edu> <20120805175106.X3574@besplex.bde.org> <501EC015.3000808@missouri.edu> <20120805191954.GA50379@troutmask.apl.washington.edu> <20120807205725.GA10572@server.rulingia.com> <20120809025220.N4114@besplex.bde.org> <5027F07E.9060409@missouri.edu> <20120814003614.H3692@besplex.bde.org> <50295F5C.6010800@missouri.edu> <20120814072946.S5260@besplex.bde.org> <50297CA5.5010900@missouri.edu> <50297E43.7090309@missouri.edu> <20120814201105.T934@besplex.bde.org> <502A780B.2010106@missouri.edu> <20120815223631.N1751@besplex.bde.org> <502C0CF8.8040003@missouri.edu> <20120906221028.O1542@besplex.bde.org> <5048D00B.8010401@missouri.edu> <504D3CCD.2050006@missouri.edu> <504FF726.9060001@missouri.edu> <20120912191556.F1078@besplex.bde.org> <20120912225847.J1771@besplex.bde.org> <50511B40.3070009@missouri.edu> <20120913204808.T1964@besplex.bde.org> <5051F59C.6000603@missouri.edu> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-numerics@FreeBSD.org Subject: Re: Complex arg-trig functions X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 13 Sep 2012 16:41:18 -0000 On Thu, 13 Sep 2012, Stephen Montgomery-Smith wrote: > On 09/13/2012 07:55 AM, Bruce Evans wrote: > >> % - /* casinh(NaN + I*inf) = inf + I*NaN */ >> % + /* casinh(NaN + I*+-Inf) = +-Inf + I*NaN (sign optional) */ > > How much do you want the "sign optional" put into the comments. This will > have to be done in lots of places. Eventually, always, but I'd like to have a convention to keep the comment short. Possible conventions: - use inf for the simple case where signs interact unsurprisingly. inf for an input operand combined with inf for an output operand means that the input operand is copied unchanged; add '-' to the output operand to indicate a sign change. - use +-Inf particular signs and/or surprising sign rules - combinations with 2 input infinities probably need to spell them as inf1 and inf2 or +-Inf1 and +-Inf2 so as to express combinations of them in output - write options using something like opt(