Date: Mon, 17 Sep 2012 23:02:19 -0500
From: Stephen Montgomery-Smith <stephen@missouri.edu>
To: freebsd-numerics@freebsd.org
Subject: Re: Complex arg-trig functions
Message-ID: <5057F24B.7020605@missouri.edu>
In-Reply-To: <5057A932.3000603@missouri.edu>
References: <5017111E.6060003@missouri.edu>
 <20120906221028.O1542@besplex.bde.org> <5048D00B.8010401@missouri.edu>
 <504D3CCD.2050006@missouri.edu> <504FF726.9060001@missouri.edu>
 <20120912191556.F1078@besplex.bde.org> <20120912225847.J1771@besplex.bde.org>
 <50511B40.3070009@missouri.edu> <20120913204808.T1964@besplex.bde.org>
 <5051F59C.6000603@missouri.edu> <20120914014208.I2862@besplex.bde.org>
 <50526050.2070303@missouri.edu> <20120914212403.H1983@besplex.bde.org>
 <50538E28.6050400@missouri.edu> <20120915231032.C2669@besplex.bde.org>
 <50548E15.3010405@missouri.edu> <5054C027.2040008@missouri.edu>
 <5054C200.7090307@missouri.edu> <20120916041132.D6344@besplex.bde.org>
 <50553424.2080902@missouri.edu> <20120916134730.Y957@besplex.bde.org>
 <5055ECA8.2080008@missouri.edu> <20120917022614.R2943@besplex.bde.org>
 <50562213.9020400@missouri.edu> <20120917060116.G3825@besplex.bde.org>
 <50563C57.60806@missouri.edu> <20120918012459.V5094@besplex.bde.org>
 <5057A932.3000603@missouri.edu>

On 09/17/2012 05:50 PM, Stephen Montgomery-Smith wrote:

>> cacos*() and casin*() should benefit even more from an up-front raising
>> of inexact, since do_hard_work() has 7 magic statements to raise inexact
>> where sum_squares has only 1.
>
> Where is the code that raises inexact up-front?

I don't see why having the code up front would make it much more
efficient. Of these 7 magic statements, at most two will ever be
executed. But I could put something like

    if ((x == 0 && y == 0) || (x == 0 && y == 1) || (int)(1+tiny) == 1) { ........

at the beginning of do_hard_work and catanh.

>> ... I now understand what the threshold should be. You have
>> filtered out ax == 1. This makes 1 - ax*ax at least ~2*EPSILON, so
>> ay*ay can be dropped if ay is less than sqrt(2*EPSILON*EPSILON) *
>> 2**-GUARD_DIGITS = EPSILON * 2**-5 say. SQRT_MIN is way smaller
>> than that, so FOUR_SQRT_MIN works too. We should use a larger
>> threshold for efficiency, or avoid the special case for ax == 1.
>> Testing shows that this analysis is off by a factor of about
>> sqrt(EPSILON), since a threshold of EPSILON * 2**7 is optimal.
>> The optimization made no difference to speed; it is just an
>> optimization for understanding. Maybe the special case for ax == 1
>> can be avoided, or folded together with the same special case for
>> evaluation of the real part. This special case is similar to the
>> one in clog(), but easier.

OK, I think I made changes more or less according to your suggestions.

In the case A < A_crossover, a threshold like
DBL_EPSILON*DBL_EPSILON/128 is required. I think the one you set is
too large. It is important that sqrt(x) + x/2 rounds to sqrt(x).
(Again, I don't think your tests would pick this up, because you need
to do a lot of tests where y is close to or equal to 1.)