From owner-freebsd-numerics@freebsd.org  Sat Dec 16 01:33:31 2017
Return-Path: <owner-freebsd-numerics@freebsd.org>
Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 96646E931DE
 for <freebsd-numerics@mailman.ysv.freebsd.org>;
 Sat, 16 Dec 2017 01:33:31 +0000 (UTC)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu
 [128.95.76.21])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client CN "troutmask", Issuer "troutmask" (not verified))
 by mx1.freebsd.org (Postfix) with ESMTPS id 7D58D6F2C1
 for <freebsd-numerics@freebsd.org>; Sat, 16 Dec 2017 01:33:31 +0000 (UTC)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: from troutmask.apl.washington.edu (localhost [127.0.0.1])
 by troutmask.apl.washington.edu (8.15.2/8.15.2) with ESMTPS id vBG1XPdB029874
 (version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO)
 for <freebsd-numerics@freebsd.org>; Fri, 15 Dec 2017 17:33:25 -0800 (PST)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: (from sgk@localhost)
 by troutmask.apl.washington.edu (8.15.2/8.15.2/Submit) id vBG1XPt4029873
 for freebsd-numerics@freebsd.org; Fri, 15 Dec 2017 17:33:25 -0800 (PST)
 (envelope-from sgk)
Date: Fri, 15 Dec 2017 17:33:25 -0800
From: Steve Kargl <sgk@troutmask.apl.washington.edu>
To: freebsd-numerics@freebsd.org
Subject: progress on powl.
Message-ID: <20171216013325.GA27344@troutmask.apl.washington.edu>
Reply-To: sgk@troutmask.apl.washington.edu
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.7.2 (2016-11-26)
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.25
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
 <freebsd-numerics.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics/>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 16 Dec 2017 01:33:31 -0000

All,

As expect no one elase is working on powl, I have
dusted off my work-in-progress that is now some
1+ year in the works. I've now got a sort of 
working version in very limited testing.

% ./testl -a 1.2345 3.124
libmu = LD80C(0xf7303d48d62e9df4, 0,  1.93115964943387024991e+00L),
mpfru = LD80C(0xf7303d48d62e9df5, 0,  1.93115964943387025002e+00L),
 ulp = 0.73609
 
I'm still slowing working out some thresholds and
few other kinks.  The biggest problem is that 
src/e_pow.c may be the worse commented code that 
I've had the pleasure to read.  There are 3 polynomial
approximations.  One is easy to work.  The other two
are gaint mysteries at moment.

I'll note that src/e_powf.c uses the same polynomials
and it appears that these use naively rounded coefficients.
I suspect that these polynomials can be reduced to
lower order, but have pursued that yet.

 

-- 
Steve

From owner-freebsd-numerics@freebsd.org  Sat Dec 16 04:09:02 2017
Return-Path: <owner-freebsd-numerics@freebsd.org>
Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 7F8CBE9B746
 for <freebsd-numerics@mailman.ysv.freebsd.org>;
 Sat, 16 Dec 2017 04:09:02 +0000 (UTC)
 (envelope-from brde@optusnet.com.au)
Received: from mail105.syd.optusnet.com.au (mail105.syd.optusnet.com.au
 [211.29.132.249])
 by mx1.freebsd.org (Postfix) with ESMTP id 316B8758D7
 for <freebsd-numerics@freebsd.org>; Sat, 16 Dec 2017 04:09:01 +0000 (UTC)
 (envelope-from brde@optusnet.com.au)
Received: from [192.168.0.102] (c110-21-101-228.carlnfd1.nsw.optusnet.com.au
 [110.21.101.228])
 by mail105.syd.optusnet.com.au (Postfix) with ESMTPS id 78395104356D;
 Sat, 16 Dec 2017 15:08:53 +1100 (AEDT)
Date: Sat, 16 Dec 2017 15:08:52 +1100 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Steve Kargl <sgk@troutmask.apl.washington.edu>
cc: freebsd-numerics@freebsd.org
Subject: Re: progress on powl.
In-Reply-To: <20171216013325.GA27344@troutmask.apl.washington.edu>
Message-ID: <20171216135205.D971@besplex.bde.org>
References: <20171216013325.GA27344@troutmask.apl.washington.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Optus-CM-Score: 0
X-Optus-CM-Analysis: v=2.2 cv=KeqiiUQD c=1 sm=1 tr=0
 a=PalzARQSbocsUSjMRkwAPg==:117 a=PalzARQSbocsUSjMRkwAPg==:17
 a=kj9zAlcOel0A:10 a=rCpuhL93WexzyIwt4mAA:9 a=CjuIK1q_8ugA:10
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.25
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
 <freebsd-numerics.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics/>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 16 Dec 2017 04:09:02 -0000

On Fri, 15 Dec 2017, Steve Kargl wrote:

> As expect no one elase is working on powl, I have
> dusted off my work-in-progress that is now some
> 1+ year in the works. I've now got a sort of
> working version in very limited testing.
> ...
> I'm still slowing working out some thresholds and
> few other kinks.  The biggest problem is that
> src/e_pow.c may be the worse commented code that
> I've had the pleasure to read.  There are 3 polynomial
> approximations.  One is easy to work.  The other two
> are gaint mysteries at moment.

It seems OK to me.  I only see 2 polys and a simple rational function.

One for log() is obviously needed, and the special one used is
documented.  P* for exp() is not documented in e_pow.c and is not
obvious, but is identical to P*() for exp() in e_exp.c.  exp()
converges too slowly so is reduced to a special rational function
related to (exp(x)+1)/(exp(x)-1) which is related to Bernoulli numbers
and thus to lots of interesting power series.  The special rational
function is basically just this rational function in exp(x) inverted.

It is suprising that P* is accurate enough for pow().  exp()
exponentiates errors.  It is apparently enough to calculate y*log2(x)
= n + y' accurately.  Since |y'| <= 0.5, the errors don't get
exponentiated.  But then we have to evaluate exp(y'*log2) accurately,
and I'm surprised if the naive method in the comment is enough.  We
should evaluate y' in extra precision and used this for calculating
y'*log2, and I think we do.  Then ignoring the error from this is
apparently OK.  It should be < 0.5 ulps, and then it is easy to avoid
adding more than 1 ulp of error for exp() but not so easy to avoid
adding more than 0.5 ulps of error.  However, if we just use exp(x) =
1 + x + 2*x/2 + ..., then this is easy enough.  We have x in extra
precision and should add it to 1 in extra precision, and the other
terms are too small to contribute much to the error.  Different care
is needed for the rational function.

My clog*(z) has similar problems for the real part.  At the end, it must
calculate log*(x) where x has extra precision.  Unfortunately, only
the internal kernels of logl() and of my uncommitted version of log()
and logf() support extra precision.  Since I don't want to bloat clog()
by open-coding these internals like pow() does for exp()'s internals,
I just call the log*() functions.  This blows out the error to up to ~1.8
ulps.

See pow.c in 4.4BSD libm for a more brute force approach that is probably
better.  Part are the same, down to misspelling "multi" as "muti".  Maybe
a watermark.  We still have extra-precision exp and log functions which
were used for this and which w use for tgamma.  Unfortunately, these are
only extra for double precision.  This prevents a simple translation of
tgamma() too.  The kernels exp__D.c and log__L.c are actually simple,
especially the latter, so this is not a good excuse for not translating
them.

BTW, you (er, I) have not got around to committing my double and float
versions of coshl(), sinhl() and tanhl().  But expl() already has your
kernel in a non-external place.  The extra-precision kernels should
work better than the open-coded ones in powl().  I now remember that
even the double and float internal ones have a couple of bits of extra
precision, while your expl() one has 8-10.  I was going to translate
the expl() kernel to lower precision, but found that the old internal
ones are good enough in both accuracy and efficency for the hyperbolic
functions.  So my hyperbolic functions in double and float precision
are almost identical with the ones in long double precision, except
for details hidden in the kernels.  I only cleaned up the double and
float kernels a little when reorganizing to separate them.  However,
I couldn't get the kernels to work as fast as open coding for exp*()
itself.  clang now optimizes the old exp*() _very_ well so it shouldn't
have any problems turning the kernels into the same code as the open-
coded versions.

The old exp() kernels only have a couple of bits of extra precision,
but this is enough.  i386 now uses the i387 for exp() but not for
expf().  This is slow since it has to switch precisions.  This is
not very accurate even with 11 extra bits of precision, since the
method is sloppy giving and loses about 8 of the extras.  This is not
significantly better than the 2 extra bits given by the kernels.
Except possibly for exp().  If you have an error of 0.76 ulps than
another 0.25 ulps takes the final error above 1.00.  The expl() kernel
shouldn't have this problem since its error is smaller than the exp()
kernel or any adjustments that pow() could reasonably make to its
open-coded version of this.

The float coeffs for P* should be easy to fix since we already did that
if necessary for their copies in expf().

Bruce

From owner-freebsd-numerics@freebsd.org  Sat Dec 16 06:15:12 2017
Return-Path: <owner-freebsd-numerics@freebsd.org>
Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 55E58E9EC34
 for <freebsd-numerics@mailman.ysv.freebsd.org>;
 Sat, 16 Dec 2017 06:15:12 +0000 (UTC)
 (envelope-from brde@optusnet.com.au)
Received: from mail109.syd.optusnet.com.au (mail109.syd.optusnet.com.au
 [211.29.132.80])
 by mx1.freebsd.org (Postfix) with ESMTP id CB9AB797A8
 for <freebsd-numerics@freebsd.org>; Sat, 16 Dec 2017 06:15:11 +0000 (UTC)
 (envelope-from brde@optusnet.com.au)
Received: from [192.168.0.102] (c110-21-101-228.carlnfd1.nsw.optusnet.com.au
 [110.21.101.228])
 by mail109.syd.optusnet.com.au (Postfix) with ESMTPS id 27535D6329A;
 Sat, 16 Dec 2017 17:15:02 +1100 (AEDT)
Date: Sat, 16 Dec 2017 17:15:01 +1100 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Bruce Evans <brde@optusnet.com.au>
cc: Steve Kargl <sgk@troutmask.apl.washington.edu>, 
 freebsd-numerics@freebsd.org
Subject: Re: progress on powl.
In-Reply-To: <20171216135205.D971@besplex.bde.org>
Message-ID: <20171216163632.W1413@besplex.bde.org>
References: <20171216013325.GA27344@troutmask.apl.washington.edu>
 <20171216135205.D971@besplex.bde.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Optus-CM-Score: 0
X-Optus-CM-Analysis: v=2.2 cv=bc8baKHB c=1 sm=1 tr=0
 a=PalzARQSbocsUSjMRkwAPg==:117 a=PalzARQSbocsUSjMRkwAPg==:17
 a=kj9zAlcOel0A:10 a=pWyhuXww-wSbjqJyOqQA:9 a=CjuIK1q_8ugA:10
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.25
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
 <freebsd-numerics.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics/>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 16 Dec 2017 06:15:12 -0000

On Sat, 16 Dec 2017, Bruce Evans wrote:

> ...
> It is suprising that P* is accurate enough for pow().  exp()
> exponentiates errors.  It is apparently enough to calculate y*log2(x)
> = n + y' accurately.  Since |y'| <= 0.5, the errors don't get
> ...
>
> My clog*(z) has similar problems for the real part.  At the end, it must
> calculate log*(x) where x has extra precision.  Unfortunately, only
> the internal kernels of logl() and of my uncommitted version of log()
> and logf() support extra precision.  Since I don't want to bloat clog()
> by open-coding these internals like pow() does for exp()'s internals,
> I just call the log*() functions.  This blows out the error to up to ~1.8
> ulps.

> [... using extra precision of exp*() kernels]

Also, the internal extra precision of log*() might be enough for
y*log2(x) = n+y'.  It is enough for plain log2*(x).  logl() has 8-10
extra bits internally, so log2l() has 6-8.  Except for my versions,
log() and logf() barely have enough bits to lose any for log2() and
log2f().  Probably not enough for pow*().  y' also needs to retain
extra precision, but that is routine using the method in log2l() (it
just discards extra precision at the end).  The 4.4BSD pow() is almost
trivial for non-exceptional args using its kernels:

X 	/* Return exp(y*log(x)), using simulated extended */
X 	/* precision for the log and the multiply.	  */
X 
X 	s = log__D(x);
X 	t.a = y;
X 	TRUNC(t.a);
X 	t.b = y - t.a;
X 	t.b = s.b*y + t.b*s.a;
X 	t.a *= s.a;
X 	s.a = t.a + t.b;
X 	s.b = (t.a - s.a) + t.b;
X 	return (exp__D(s.a, s.b));

while the log2l() non-kernel with debugging and i386 macros removed is:

X long double
X log2l(long double x)
X {
X 	struct ld r;
X 	long double hi, lo;
X 
X 	k_logl(x, &r);
X 	if (!r.lo_set)
X 		return (r.hi);
X 	_2sumF(r.hi, r.lo);
X 	hi = (float)r.hi;
X 	lo = r.lo + (r.hi - hi);
X 	return (invln2_hi * hi +
X 	    (invln2_lo + invln2_hi) * lo + invln2_lo * hi);
X }

You should see _2sumF() open-coded in this pow().  _2sumF() is delicate
and the open-coding is probably broken.  Compiler bugs break most
implementations of 2sum*() if there is extra precision.  _2sumF() has
complications to work around this.  The complications are mostly in the
instructions in the comments for using it.  For double precision, the
the args must be double_t, not double.  The 4.4BSD library doesn't know
anything about this, so its open coding is just broken on i386.  It is
unclear if or how fdlibm pow() avoids these bugs.  We fixed a couple
for exp().  math_private.h has debugging variants of _2sumF() and variants
with less restrictions on use.

pow() also has to do a multiplication for this.  We don't have any macros
like _2sumF() to help for this.  In general, for real values errors for
extra-precision multiplications are unimportant since there are no
cancelations.  Howver a + b*c might need extra precision since there
might be cancelations for the addition.  Not a problem for the final step
of pow*(), but a problem for the y' = y*log2(x) -n step.

Another problem with exp__D() and log__L() is that their API is asymmetric.
log__L() returns extra precision but doesn't take it.  exp__D() takes
extra precision but doesn't return it.  Both as needed for pow() and
perhaps *gamma*(), but too specialized.  General versions were too much
slower in C90, but now inline functions can take and return extra precision
and hope that the compiler optimizes it away when the caller supplies it
as arg.lo = 0 and the callee discards it by not using result.lo.

Bruce

From owner-freebsd-numerics@freebsd.org  Sat Dec 16 16:54:07 2017
Return-Path: <owner-freebsd-numerics@freebsd.org>
Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id D6DA3E87E49
 for <freebsd-numerics@mailman.ysv.freebsd.org>;
 Sat, 16 Dec 2017 16:54:07 +0000 (UTC)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu
 [128.95.76.21])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client CN "troutmask", Issuer "troutmask" (not verified))
 by mx1.freebsd.org (Postfix) with ESMTPS id BB3EB6A83F
 for <freebsd-numerics@freebsd.org>; Sat, 16 Dec 2017 16:54:07 +0000 (UTC)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: from troutmask.apl.washington.edu (localhost [127.0.0.1])
 by troutmask.apl.washington.edu (8.15.2/8.15.2) with ESMTPS id vBGGs6BW019378
 (version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO);
 Sat, 16 Dec 2017 08:54:06 -0800 (PST)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: (from sgk@localhost)
 by troutmask.apl.washington.edu (8.15.2/8.15.2/Submit) id vBGGs6RH019377;
 Sat, 16 Dec 2017 08:54:06 -0800 (PST) (envelope-from sgk)
Date: Sat, 16 Dec 2017 08:54:06 -0800
From: Steve Kargl <sgk@troutmask.apl.washington.edu>
To: Bruce Evans <brde@optusnet.com.au>
Cc: freebsd-numerics@freebsd.org
Subject: Re: progress on powl.
Message-ID: <20171216165406.GA18979@troutmask.apl.washington.edu>
Reply-To: sgk@troutmask.apl.washington.edu
References: <20171216013325.GA27344@troutmask.apl.washington.edu>
 <20171216135205.D971@besplex.bde.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20171216135205.D971@besplex.bde.org>
User-Agent: Mutt/1.7.2 (2016-11-26)
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.25
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
 <freebsd-numerics.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics/>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 16 Dec 2017 16:54:07 -0000

On Sat, Dec 16, 2017 at 03:08:52PM +1100, Bruce Evans wrote:
> On Fri, 15 Dec 2017, Steve Kargl wrote:
> 
> > As expect no one elase is working on powl, I have
> > dusted off my work-in-progress that is now some
> > 1+ year in the works. I've now got a sort of
> > working version in very limited testing.
> > ...
> > I'm still slowing working out some thresholds and
> > few other kinks.  The biggest problem is that
> > src/e_pow.c may be the worse commented code that
> > I've had the pleasure to read.  There are 3 polynomial
> > approximations.  One is easy to work.  The other two
> > are gaint mysteries at moment.
> 

I spoke too soon.  It seems that some of my bit twiddling
has gone off the rails. :(

-- 
Steve