Date: Sat, 9 Mar 2019 00:51:07 +1100 (EST) From: Bruce Evans <brde@optusnet.com.au> To: Steve Kargl <sgk@troutmask.apl.washington.edu> Cc: freebsd-numerics@freebsd.org Subject: cexpl() (was: Re: Update ENTERI() macro) Message-ID: <20190308225807.D6410@besplex.bde.org> In-Reply-To: <20190307195436.GA20856@troutmask.apl.washington.edu> References: <20190304212159.GA12587@troutmask.apl.washington.edu> <20190305153243.Y1349@besplex.bde.org> <20190306055201.GA40298@troutmask.apl.washington.edu> <20190306225811.P2731@besplex.bde.org> <20190306184829.GA44023@troutmask.apl.washington.edu> <20190307061214.R4911@besplex.bde.org> <20190306214233.GA23159@troutmask.apl.washington.edu> <20190307144315.N932@besplex.bde.org> <20190307044447.GA16298@troutmask.apl.washington.edu> <20190307163958.V1293@besplex.bde.org> <20190307195436.GA20856@troutmask.apl.washington.edu>
index | next in thread | previous in thread | raw e-mail
On Thu, 7 Mar 2019, Steve Kargl wrote:
> On Thu, Mar 07, 2019 at 05:36:22PM +1100, Bruce Evans wrote:
>> On Wed, 6 Mar 2019, Steve Kargl wrote:
>>
>>> On Thu, Mar 07, 2019 at 03:09:06PM +1100, Bruce Evans wrote:
>* ...
>>> /*
>>> * exp_ovfl = 11356.5234062941439497, 0x400c, 0xb17217f7d1cf79ac
>>> * cexp_ovfl = 22755.3287906024445633, 0x400d, 0xb1c6a8573de9768c
>>> */
I checked that these are correct. Any rounding should be down for
exp_ovfl and up for cexp_ovfl.
>>> if ((hx == 0x400c && lx > 0xb17217f7d1cf79acULL) ||
>>> (hx == 0x400d && lx < 0xb1c6a8573de9768cULL)) {
>>
>> Don't duplicate the hex constants in the comment. The limits can be fuzzy
>> so don't need so many decimal digits or nonzero nonzero bits or LD80C()
>> to round them (s_cexp.c uses only 1 decimal digit and doesn't test the
>> lower 32 bits in the mantissa; here we have more bits than we need in lx).
>
> Being fuzzy was the movitation for the new macro, I was only using
> the upper 32 bits of lx in the original cexpl() I posted.
IIRC, that was done by extracting into a 32-bit lx. This could be written
as (lx >> 32) > 0xb17217f7 (or a fuzzier value), and it might be better
to do that anyway.
>> Maybe not use bit-fiddling at all for this (in other precisions too).
>> The more important exp() functions check thresholds in FP. Here x can
>> be NaN so FP comparisons with it don't work right. That seems to be
>> the only reason to check in bits here.
>>
>>> /*
>>> * x is between exp_ovfl and cexp_ovfl, so we must scale to
>>> * avoid overflow in exp(x).
>>> */
>>> RETURNI(__ldexp_cexpl(z, 1));
>>
>> Did your tests cover this case, and does it use the fixed version of the
>> old __ldexp_cexpl() in k_expl.h?
>
> No. The test I reported was restricted to -11350 < x < 11350.
> Spot checked x outside that range. I had the conditional wrong
> for x < -exp_ovfl. Also, the comment is actaully a little optimistic.
> One cannot avoid overflow. For example, x = 11357 is just above
> exp_ovfl, and exp(x) = inf. __ldexp_cexpl() does scaling and exploits
> |cos(y)| <= 1 and |sin(y)| <= 1 to try to extend the range of cexpl().
I understand this better now. __ldexp_cexpl() doesn't work for all large
x, so callers must only use it for a range of values above cexp_ovl.
The correctness of this is unobvious. It depends on sin(y) and cos(y)
never underflowing to 0. Otherwise, this method would give Inf * 0 = NaN
for finite x and y, and the scaling method would give finite * 0 * 2**k = 0.
>>> } else {
>>> /*
>>> * Cases covered here:
>>> * - x < exp_ovfl and exp(x) won't overflow (common case)
>>> * - x > cexp_ovfl, so exp(x) * s overflows for all s > 0
>>> * - x = +-Inf (generated by exp())
>>> * - x = NaN (spurious inexact exception from y)
>>> */
>>> exp_x = expl(x);
>>> sincosl(y, &s, &c);
>>> RETURNI(CMPLXL(exp_x * c, exp_x * s));
>>> }
>>> }
>>
>> I think the last clause can be simplfied and even optimized by using the
>> kernel in all cases:
I was wrong.
>> - the kernel produces exp_x in the form kexp_x * 2**k where kexp_x is near
>> 1. Just multiply kexp_x by c and s, and later multiply by 2**k. This
>> cost an extra multiplication by 2**k (for the real and imaginary parts
>> separately), but avoids other work so may be faster.
>> - hopefully kexp_x is actually >= 1 (and < 2). Otherwise we have to
>> worry about spurious underflow
>> - the above method seems to avoid spurious underflow automatically, and
>> doesn't even mention this (multiplying by exp_x may underflow, but not
>> spuriously since c and s are <= 1).
>
> You might be right about using the kernel. I'll leave that
> to others on freebsd-numerics@; although you and I seem to be
> the only subscribers.
Forget about that. __ldexp_cexp*() is not really a kernel. It is used
for ccosh*() and csinh*() too. Those can only use the kernel for large
positive x. cexp*() may as well do the same. Also, the kernel currently
doesn't support very large positive x, so the second overflow threshold
is needed in callers.
Bruce
home |
help
Want to link to this message? Use this
URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20190308225807.D6410>
