From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:11:43 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8D115106566B for ; Sun, 12 Aug 2012 23:11:43 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id DF4B88FC14 for ; Sun, 12 Aug 2012 23:11:42 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNBgug075897 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:11:42 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNBaMX021764 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:11:36 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CNBaNU021763 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:11:36 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:11:36 +1000 Resent-Message-ID: <20120812231136.GN20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6MFObAe056537 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 23 Jul 2012 01:24:37 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6MFOYkU006698 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 23 Jul 2012 01:24:36 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q6MFO9Fc054062; Sun, 22 Jul 2012 10:24:11 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <500C1B1A.5070107@missouri.edu> From: Stephen Montgomery-Smith Mail-Followup-To: freebsd-numerics@freebsd.org User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: Bruce Evans References: <20120714120432.GA70706@server.rulingia.com> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> <20120718205625.GA409@troutmask.apl.washington.edu> <500725F2.7060603@missouri.edu> <20120719025345.GA1376@troutmask.apl.washington.edu> <50077987.1080307@missouri.edu> <20120719032706.GA1558@troutmask.apl.washington.edu> <5007826D.7060806@missouri.edu> <5007AD41.9070000@missouri.edu> <20120719205347.T2601@besplex.bde.org> <50084322.7020401@missouri.edu> <20120720035001.W4053@besplex.bde.org> <50085441.4090305@missouri.edu> <20120720162953.N2162@besplex.bde.org> <20120720184114.B2790@besplex.bde.org> <50097128.6030405@missouri.edu> <20120721032448.X5744@besplex.bde.org> <5009BD6C.9050301@missouri.edu> <20120721123522.T877@besplex.bde.org> <500A2565.9090009@missouri.edu> <20120721181204.A1702@besplex.bde.org> <500B594D.1020305@missouri.edu> <20120722125300.P2246@besplex.bde.org> In-Reply-To: <20120722125300.P2246@besplex.bde.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Diane Bruce , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:11:43 -0000 X-Original-Date: Sun, 22 Jul 2012 10:24:10 -0500 X-List-Received-Date: Sun, 12 Aug 2012 23:11:43 -0000 I will study your comments on KNF later, and try rewriting the code appropriately. In the mean time, could you send me a copy of your set up for non-manually comparing C answers with pari answers? Speed is really not an issue. My comparisons with Mathematica were definitely manual. The main issue as I see it is this. How do different programs communicate double precision numbers to each other, without loosing precision? I don't want one program to convert to base 10, and then have the other program read it and convert back to its internal floating point? I can see from the C end how to do it (using EXTRACT_WORDS etc). Can pari do this as well? mpfr is OK, but terribly cludgy. Having it compute clog is easy. But having it computing casinh is going to be very painful. If I cannot use pari, I will probably use C++ code based around a floating point package called cln. It is used in GiNaC, which is a rather cool way of embedding symbolic expressions inside C++ code. They are both available in the ports. I see on the internet that there is a C++ overlay of mpfr as well. But it doesn't seem to be in the FreeBSD ports. I am not in the mood to create a new port just now. The C++ overlay of gmp does exist, but gmp doesn't even have log and atan2. Incidentally, I just now realized that my test code for clog needs more than 100 bits. I'll try to fix that this afternoon. Maybe that is why I am seeing examples of 1e-30 in the real part of clog. On 07/22/2012 01:19 AM, Bruce Evans wrote: > On Sat, 21 Jul 2012, Stephen Montgomery-Smith wrote: > >> 1. I think I now understand the casinh algorithm by Hull et al rather >> well. I implemented it in the attached code. Bruce, could you try it >> out against pari? It did compare rather well to Mathematica's ArcSinh >> function. I tested it mostly near the branch cut ends at I and -I. >> (The branch cut's behavior is somewhat similar to that of sqrt.) > > I'm actually using my test framework on the real part, and then pari > to manually check differences found by this. The test framework checks > a few billion cases (not manually :-) in a minute or 2. pari is much > slower. I don't use mfpr (spelling?) yet. I think it would be slow > too, but not as slow as pari. However, my test framework doesn't > really handle long doubles, so I use pari (not manually) to check a > few million cases (in the same time as a few billion for the C version) > for them. > >> 2. I have thought more about the problem of computing clog(z) when >> |z| is close to 1. I now think it might even require precision that >> is 3 times better than double precision. >> It is possible that you could, by chance, pick x and y so that |z| = 1 >> + 1e30. (I wrote a test program, and this did > e-30 here and elsewhere >> actually happen a few times.) So when you compute x^2+y^2-1, if you >> do it using double double precision, you will get 2e30, but only the >> most significant digits will be accurate. The ULP will be about >> 1e15. You need triple double precision to get ULP's close to 1. > > Hmm, I'm not seeing such cases except when |x| or |y| is 1 and the other > is tiny. I'm only using 21 bits of extra precision now. I think that > has been tested for |z| much closer to 1 than 2^-21, so it is avoiding > more than 21 bits of cancelation, but nothing like the ~100 bits for > 1e-30 away from 1. I hope it doesn't get close. More testing and > analysis is required. > > It is certainly amazingly difficult to subtract 1 without losing accuracy. > >> And I cannot even figure out how to do log(1+z) when z is close to 0! >> The trouble is, (1+x)^2+y^2-1 = 2x+x^2+y^2, and if x is negative and >> approximately -y^2, you are in the situation of subtracting nearly >> equal numbers! (Obviously if z is very close to 0, I could use >> Taylor's series.) > > This is a bit easier, and the hardest subcase isn't required for clog(): > - the case of z very close to 0 is not actually obvious. > - (A) For z = x + I*y with x = 0 and y tiny, log(1+z) ~= (y/2)*y. > (Not y*y/2 since that loses an ulp by spuriously underflowing when > y is slightly larger than sqrt(the smallest strictly positive > representable value.) > - (B) We get the case where 2x almost cancels with y^2 when x is tiny > and > negative and y ~= sqrt(|x|). x has to be fairly tiny to cause > problems. > This case is unreachable for clog() because we start with x near 1 > and subtract 1 from it. This always gives a non-tiny x (|x| >= > DBL_MIN/2). This is a sub-case of (C). > - (C) when z is merely close to 0, the cancelations are not very large. > > I was starting to consider the following case for clog(): |x| much larger > (relatively) than |y|. |x| may or may not be near 1. Then hypot() just > returns |x|. For clog(), |x| needs to dominate |y| by much more for > |y| to be ignored, especially when |x| = 1. When |x| dominates |y| for > hypot() but not for clog(), we are in the solved case (A) or the non- > canceling case (C). > > The remaining problematic case for clog() is when neither x nor y is small, > and x^2+y^2 is nearly 1 (say both near 1/sqrt(2)). It seems to be > necessary > to square them using doubled double precision. Then subtract 1 from x^2 > (no more precision required). Then subtract y^2 (no more precision > required, and the result needs only double precision plus a couple of > guard bits). The squaring is currently only done in ~sesqui double > precision, which only handles most cases. 24 extra bits is ~29 fewer > than needed, but it handles all problematic cases except ~1 in every > 2^24 (assuming uniform distribution in double space), and the problematic > cases are already sparse. My longest test was of 2^32 cases (takes half > an hour). This hits some problematic cases, but only for the Apple > implementation because it uses _no_ extra bits except when real(z) == 1). > Its maximum error detected increased from 1 ulp to 16 ulps when the number > of test cases was increased by a factor of 16. > >> 3. I haven't learned proper style yet. (Is it what you call KNF?) I >> have always had a distrust of consistency, especially when it comes to >> people's coding styles. Sorry, but this is going to be a bit painful >> for me. > > The main non-KNFisms are comment indentation, comment filling, and spaces > around binary operators. indent [-npro] does a good job of fixing the > first 2 and a fairly good job with the spaces, but makes some messes in > the code. Mathematicians like to omit spaces (and multiplication operators > ...) and use single letters for identifiers, but programmers learned that > this gives hard-to-maintain code. > > % ... > % /* > % * gcc doesn't implement complex multiplication or division correctly, > % * so we need to handle infinities specially. We turn on this pragma to > % * notify conforming c99 compilers that the fast-but-incorrect code that > % * gcc generates is acceptable, since the special cases have already been > % * handled. > % */ > % #pragma STDC CX_LIMITED_RANGE ON > > This comment (and code) was copied from s_csqrt.c, and is filled normally > (except for the single-space sentence break). > > I think this doesn't actually apply here or in s_clog.c, since no complex > operations are done directly. > > % ... > % /* > % The algorithm is very close to that in > % "Implementing the complex arcsine and arccosine functions using exception > % handling" by T. E. Hull, Thomas F. Fairgrieve, and Ping Tak Peter Tang, > % ... > % */ > > This and most block comments aren't filled normally (with a column of > stars at the left). > > When indent(1) fixes these, it mangles the formatting of displayed > formulas and similar things. Comments with manual formatting should be > marked up by starting them with "/*-" to prevent indent mangling them. > s_clog.c doesn't have any comments with displayed formulas, and indent > did a good job of reformatting their paragraphs. I normally tell indent > not to reformat any block comments. > > % ... > % static double f(double x, double y, int *underflow) { > > The left brace at the start of a function should be on a new line. > > % if (x==0) { > % *underflow = 0; > % if (y > 0) > % return 0; > % return -y; > % } > > This has spaces around some of the binary operators but not all. > > Return expressions are parenthesized in KNF. indent doesn't fix this. > > % ... > % double complex > % casinh(double complex z) > % { > > Normal now. > > % ... > % if (cabs(z) > 1e20) { > % if (huge+x>one) { /* set inexact flag. */ > % if (sx == 0) return clog(2*z); > % if (sx == 1) return -clog(-2*z); > % } > % } > > Use a new line for all statements. Especially return statements. For > among other reasons, so that it is as easy as possible to manage > statements in line-oriented debuggers. > > % ... > % if (A < 1.5) { > % fp = f(x,1+y,&fpuf); > % fm = f(x,1-y,&fmuf); > % if (fpuf == 1 && fmuf == 1) { > % if (huge+x>one) /* set inexact flag. */ > % rx = log1p(x*sqrt((fp+fm)*(A+1))); > % } else if (fmuf == 1) { > % /* Overflow not possible because fp < 1e50 and x > 1e-100. > % Underflow not possible because either fm=0 or fm approximately > % bigger than 1e-200. */ > % if (huge+x>one) /* set inexact flag. */ > % rx = log1p(fp+sqrt(x)*sqrt((fp/x+fm*x)*(A+1))); > > Comments should be indented the same as the code that they describe. I > found ones like the above especially hard to read (much harder than the > compressed formulas). > > indent -npro turns (some of) the above into the following: > > @ ... > @ static const double > @ one = 1.00000000000000000000e+00, /* 0x3FF00000, > @ * 0x00000000 */ > @ huge = 1.00000000000000000000e+300; > > Mangled. > > @ @ /* > @ * The algorithm is very close to that in "Implementing the complex > arcsine > @ * and arccosine functions using exception handling" by T. E. Hull, > Thomas F. > @ * Fairgrieve, and Ping Tak Peter Tang, published in ACM Transactions on > @ * Mathematical Software, Volume 23 Issue 3, 1997, Pages 299-335. > @ * http://dl.acm.org/citation.cfm?id=275324 > @ * @ * casinh(x+iy) = sign(x)*log(A+sqrt(A*A+1)) + sign(y)*I*acos(B) > where A = > @ * 0.5(|z+I| + |z-I|) = f(x,1+y) + f(x,1-y) + 1 B = 0.5(|z+I| - |z-I|) > z = > @ * x+I*y f(x,y) = 0.5*(hypot(x,y)-y) We also use asin(B) = > @ * atan2(sqrt(A*A-y*y),y) A-y = f(x,y+1) + f(x,y-1). > @ * @ * Much of the difficulty comes because computing f(x,y) may > produce underflows. > @ */ > > Grossly mangled. > > @ ... > @ static double @ f(double x, double y, int *underflow) > @ { > @ if (x == 0) { > @ *underflow = 0; > @ if (y > 0) > @ return 0; > @ return -y; > @ } > > Fixed, except for the return statements (and a weirder KNFism which I > won't describe now). > > @ ... > @ double complex > @ casinh(double complex z) > @ { > @ double x , y, sx, sy, rx, ry; > @ double R , S, A, B, fp, fm; > @ int fpuf , fmuf; > > Mangled. indent's default profile is bad. > > @ ... > @ if (cabs(z) > 1e20) { > @ if (huge + x > one) { /* set inexact flag. */ > @ if (sx == 0) > @ return clog(2 * z); > @ if (sx == 1) > @ return -clog(-2 * z); > @ } > @ } > > indent actually understands KNF for most code. It added spaces and > newlines here. > > @ ... > @ } else if (fmuf == 1) { > @ /* > @ * Overflow not possible because fp < 1e50 and x > > @ * 1e-100. Underflow not possible because either fm=0 > @ * or fm approximately bigger than 1e-200. > @ */ > @ if (huge + x > one) /* set inexact flag. */ > @ rx = log1p(fp + sqrt(x) * sqrt((fp / x + fm * x) * (A > + 1))); > @ } else if (fpuf == 1) { > @ /* Similar arguments against over/underflow. */ > > Comments reformatted and indented OK. > > Some lines were mangled (made too long) by adding spaces. indent doesn't > understand its own -lp option. > > s_clog.c has simpler expressions that mostlu didn't expand too much. > > Bruce > >