From owner-freebsd-numerics@FreeBSD.ORG Thu Sep 6 16:32:23 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7C31810656D4 for ; Thu, 6 Sep 2012 16:32:23 +0000 (UTC) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by mx1.freebsd.org (Postfix) with ESMTP id 1408C8FC15 for ; Thu, 6 Sep 2012 16:32:22 +0000 (UTC) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q86GWBZh069525; Thu, 6 Sep 2012 11:32:14 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <5048D00B.8010401@missouri.edu> Date: Thu, 06 Sep 2012 11:32:11 -0500 From: Stephen Montgomery-Smith User-Agent: Mozilla/5.0 (X11; Linux i686; rv:15.0) Gecko/20120827 Thunderbird/15.0 MIME-Version: 1.0 To: Bruce Evans References: <5017111E.6060003@missouri.edu> <501C361D.4010807@missouri.edu> <20120804165555.X1231@besplex.bde.org> <501D51D7.1020101@missouri.edu> <20120805030609.R3101@besplex.bde.org> <501D9C36.2040207@missouri.edu> <20120805175106.X3574@besplex.bde.org> <501EC015.3000808@missouri.edu> <20120805191954.GA50379@troutmask.apl.washington.edu> <20120807205725.GA10572@server.rulingia.com> <20120809025220.N4114@besplex.bde.org> <5027F07E.9060409@missouri.edu> <20120814003614.H3692@besplex.bde.org> <50295F5C.6010800@missouri.edu> <20120814072946.S5260@besplex.bde.org> <50297CA5.5010900@missouri.edu> <50297E43.7090309@missouri.edu> <20120814201105.T934@besplex.bde.org> <502A780B.2010106@missouri.edu> <20120815223631.N1751@besplex.bde.org> <502C0CF8.8040003@missouri.edu> <20120906221028.O1542@besplex.bde.org> In-Reply-To: <20120906221028.O1542@besplex.bde.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-numerics@freebsd.org Subject: Re: Complex arg-trig functions X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 06 Sep 2012 16:32:23 -0000 Life is busy, so it will be a few days before I can process most of what you have said. But let me respond to a few of the points. On 09/06/2012 08:50 AM, Bruce Evans wrote: > This code in catrig.c seems to be related: > > @ /* > @ * sum_squares(x,y) = x*x + y*y. > @ * Assumes x*x and y*y will not overflow. > @ * Assumes x and y are finite. > @ * Assumes fabs(x) >= DBL_EPSILON. > @ */ > @ inline static double > @ sum_squares(double x, double y) > @ { > @ /* > @ * The following code seems to do better than > @ * return (hypot(x, y) * hypot(x, y)); > @ */ > @ @ if (fabs(y) < MIN_4TH_ROOT) > @ if ((int)y==0) /* raise inexact */ > @ return (x*x); > @ return (x*x + y*y); > @ } > > but it isn't really -- it is more like the case of a large difference > of exponents, so x*x+y*y reduces to x*x (at a smaller exponent difference > than for hypot(x, y) reducing to |x|). hypot() sets inexact by returning > |x|+|y| in that case, but the above avoids using y*y since it would give > spurious underflow. > > Hmm, MIN_4TH_ROOT seems to be logically wrong here, and physically > wrong for float precision. In double precision, MIN_4TH_ROOT/DBL_EPSILON > is ~0.5e-59, but in float precision, MIN_4TH_ROOT/FLT_EPSILON is only > ~1e-2, so when x = FLT_EPSILON, y = MIN_4TH_ROOT is very far from giving > a discardable y*y. > > The logically correct threshold is something like y/x < 2**-MANT_DIG/2 > or y < 2**-MANT_DIG/2 * EPSILON to avoid the division. clog*() uses > essentially y/x < 2**-MANT_DIG for some reason (it actually uses a fuzzy > check based on exponent differences: kx - ky > MANT_DIG; MANT_DIG/2 > would be a little too small here, but doubling it seems wasteful). > hypot() uses essentially y/x < 2**-60, where the magic 60 is DBL_MANT_DIG > plys a safety margin. hypotf() uses a magic 30. The non-magic integer > exponent thresholds work very well in clog*(), but depend on having the > exponents handy for efficiency, which may be possible via optimization > for an inline function like the above but takes more source code. The code is different for catrigf.c for precisely the reason you stated. This is what I was talking about a while back when I said that the float code was a little harder than the double and long code. In the days or weeks to come, I might go over all these kinds of conditions over again. The paper by Hull et al, and the code used by boost, is written in such a way that the code for float is the same as the code for double. My code was designed just enough to work, whereas they put more thought into it. > I don't see how the above avoids cancelation problems like the ones in > clog(). You never subtract 1 from sum_squares(), but there are open-coded > expressions like '(1-ax)*(1+ax) - ay*ay' to reduce the cancelation error. > I found that for clog() it was worth using extra precision out to about > |z| = 4 (or |z| < 1/4) to reduce the cancelation error. But somehow my > tests didn't show large errors. Perhaps because I'm happy with < 4 ulps > instead of < 1. Because for clog, when x^2+y^2 is close to one, the real part of clog is very close to zero. Whereas when x^2+y^2 is close to 1, the imaginary part of catanh is close to Pi/4. So even though the absolute error in computing Im(catanh) might be much higher than Re(clog), the relative error is comparable.