FreeBSD Mail Archives

Date:      Mon, 17 Sep 2012 23:02:19 -0500
From:      Stephen Montgomery-Smith <stephen@missouri.edu>
To:        freebsd-numerics@freebsd.org
Subject:   Re: Complex arg-trig functions
Message-ID:  <5057F24B.7020605@missouri.edu>
In-Reply-To: <5057A932.3000603@missouri.edu>
References:  <5017111E.6060003@missouri.edu> <20120906221028.O1542@besplex.bde.org> <5048D00B.8010401@missouri.edu> <504D3CCD.2050006@missouri.edu> <504FF726.9060001@missouri.edu> <20120912191556.F1078@besplex.bde.org> <20120912225847.J1771@besplex.bde.org> <50511B40.3070009@missouri.edu> <20120913204808.T1964@besplex.bde.org> <5051F59C.6000603@missouri.edu> <20120914014208.I2862@besplex.bde.org> <50526050.2070303@missouri.edu> <20120914212403.H1983@besplex.bde.org> <50538E28.6050400@missouri.edu> <20120915231032.C2669@besplex.bde.org> <50548E15.3010405@missouri.edu> <5054C027.2040008@missouri.edu> <5054C200.7090307@missouri.edu> <20120916041132.D6344@besplex.bde.org> <50553424.2080902@missouri.edu> <20120916134730.Y957@besplex.bde.org> <5055ECA8.2080008@missouri.edu> <20120917022614.R2943@besplex.bde.org> <50562213.9020400@missouri.edu> <20120917060116.G3825@besplex.bde.org> <50563C57.60806@missouri.edu> <20120918012459.V5094@besplex.bde.org> <5057A932.3000603@missouri.edu>

On 09/17/2012 05:50 PM, Stephen Montgomery-Smith wrote:

>> cacos*() and casin*() should benefit even more from an up-front raising
>> of inexact, since do_hard_work() has 7 magic statements to raise inexact
>> where sum_squares has only 1.
>
> Where is the code that raises inexact up-front?

I don't see why raising inexact up front would make the code much more
efficient.  Of these 7 magic statements, at most two are executed on any
one call.

But I could put something like

if ((x == 0 && y == 0) || (x == 0 && y == 1) || (int)(1+tiny) == 1) {
........
at the beginning of do_hard_work and catanh.

>> ...  I now understand what the threshold should be.  You have
>> filtered out ax == 1.  This makes 1 - ax*ax at least ~2*EPSILON, so
>> ay*ay can be dropped if ay is less than sqrt(2*EPSILON*EPSILON) *
>> 2**-GUARD_DIGITS = EPSILON * 2**-5 say.  SQRT_MIN is way smaller
>> than that, so FOUR_SQRT_MIN works too.  We should use a larger
>> threshold for efficiency, or avoid the special case for ax == 1.
>> Testing shows that this analysis is off by a factor of about
>> sqrt(EPSILON), since a threshold of EPSILON * 2**7 is optimal.
>> The optimization made no difference to speed; it is just an
>> optimization for understanding.  Maybe the special case for ax == 1
>> can be avoided, or folded together with the same special case for
>> evaluation of the real part.  This special case is similar to the
>> one in clog(), but easier.

OK, I think I made changes more or less according to your suggestions.

In the case A < A_crossover, a threshold like
DBL_EPSILON*DBL_EPSILON/128 is required.  I think the one you set is too
large.  It is important that sqrt(x) + x/2 evaluates to exactly sqrt(x)
in floating point.  (Again, I don't think your tests would pick this up,
because you would need to run a lot of tests where y is close to or equal
to 1.)

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?5057F24B.7020605>
