Date: Wed, 12 Sep 2012 20:26:54 +1000 (EST) From: Bruce Evans <brde@optusnet.com.au> To: Stephen Montgomery-Smith <stephen@missouri.edu> Cc: freebsd-numerics@freebsd.org Subject: Re: Complex arg-trig functions Message-ID: <20120912195727.H1078@besplex.bde.org> In-Reply-To: <504D294F.1050901@missouri.edu> References: <5017111E.6060003@missouri.edu> <20120804165555.X1231@besplex.bde.org> <501D51D7.1020101@missouri.edu> <20120805030609.R3101@besplex.bde.org> <501D9C36.2040207@missouri.edu> <20120805175106.X3574@besplex.bde.org> <501EC015.3000808@missouri.edu> <20120805191954.GA50379@troutmask.apl.washington.edu> <20120807205725.GA10572@server.rulingia.com> <20120809025220.N4114@besplex.bde.org> <5027F07E.9060409@missouri.edu> <20120814003614.H3692@besplex.bde.org> <50295F5C.6010800@missouri.edu> <20120814072946.S5260@besplex.bde.org> <50297CA5.5010900@missouri.edu> <50297E43.7090309@missouri.edu> <20120814201105.T934@besplex.bde.org> <502A780B.2010106@missouri.edu> <20120815223631.N1751@besplex.bde.org> <502C0CF8.8040003@missouri.edu> <503265E8.3060101@missouri.edu> <5036EFDB.3010304@missouri.edu> <503A9A1A.1050004@missouri.edu> <503ABAC4.3060503@missouri.edu> <20120906200420.H1056@besplex.bde.org> <504D294F.1050901@missouri.edu>
next in thread | previous in thread | raw e-mail | index | archive | help
On Sun, 9 Sep 2012, Stephen Montgomery-Smith wrote: > On 09/06/2012 06:42 AM, Bruce Evans wrote: >> I'm back from a holiday. >> >> On Sun, 26 Aug 2012, Stephen Montgomery-Smith wrote: > >> I tested your latest version and found regressions for NaNs. (Also, >> the float version is less accurate now that it doesn't call double >> functions, but that is a progression.) >> 1. casin*() abuses x and y for |x| and |y| and thus loses the sign of >> NaNs. The quick fix is to reload the original values. The correct >> fix is to use separate variables for the absolute values, as is >> done for cacosh*() and catanh*(). > > I now uniformly use ax and ay for |x| and |y| throughout. > >> 2. NaNs are now filtered out early > > It is not correct to filter out the NaNs early. This is because of the way > expressions like cacosh(NaN + I*INF) are defined in C99. The INFs must be > filtered out first. Oops. I still prefer filtering out NaNs first. They already have a special case for one NaN and one +-0 (with +-0 only special for y in cacos()), and can 4 more for one Nan and one Inf (2 combinations in each of 2 functions) just as easily as the Inf cases can have special subcases for NaNs. Here is the Inf case for casinhf(): % if (ax > 1/FLT_EPSILON || ay > 1/FLT_EPSILON) % if (isinf(x) || isinf(y) || (int)(1+tiny)==1) { % if (signbit(x) == 0) % w = clog_for_large_values(z) + M_LN2; Adding M_LN2_ risks problems for NaNs. These were fatal for cacos*() so we avoid them there. By filtering out NaNs first, we avoid them better. M_LN2 is a double constant, so using it pessimizes the float case (it happens not to break it). % else % w = clog_for_large_values(-z) + M_LN2; -z does arithmetic on a complex value so it risks doing bad things for NaNs. At best, it toggles the sign of a NaN. clog_for_large_values() always clears the sign bit of NaNs, and thus gives inconsistent sign handling to cases which don't go through here. Filtering out NaNs first would avoid these cases. clog_for_large_values() is pessimized a bit for the usual case by having to handle NaNs: @ static float complex @ clog_for_large_values(float complex z) @ { @ float x, y; @ float ax, ay, t; @ @ x = crealf(z); @ y = cimagf(z); @ @ @ if (x != x || y != y) @ return (cpackf(logf(hypotf(x, y)), atan2f(y, x))); This is `if (isnan(x) || isnan(y))' spelled fairly optimally. In the usual case, this will be repeated in the caller, with pessimal spelling for the float and long double cases and different fairly optimal spelling for the double case. By filtering out the NaN cases early, we can avoid repeating this. >> % @@ -289,7 +284,16 @@ >> % * the arguments is not NaN, so we opt not to raise it. >> % */ >> % - return (cpack(x+y, x+y)); >> % + return (cpack(x+0.0L+(y+0), x+0.0L+(y+0))); >> % } > > Why does the second way work, and the first way doesn't? Because the result may depend on the ALU, and only the second way is evaluated in the same ALU in all precisions. Bruce
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20120912195727.H1078>