Skip site navigation (1)Skip section navigation (2)
Date:      Sat, 20 Oct 2012 18:11:35 +1100 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Warner Losh <imp@freebsd.org>
Cc:        svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org
Subject:   Re: svn commit: r241756 - head/lib/msun/src
Message-ID:  <20121020164315.Q1095@besplex.bde.org>
In-Reply-To: <201210192247.q9JMljL4093098@svn.freebsd.org>
References:  <201210192247.q9JMljL4093098@svn.freebsd.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On Fri, 19 Oct 2012, Warner Losh wrote:

> Log:
>  Document the method used to compute expf.  Taken from exp, with
>  changes to reflect differences in computation between the two.

Please back this out, as for logf.

We are a bit further from replacing fdlibm exp* by better versions
than for log*().  We have a better expl() that is faster and much more
accuract than exp() and almost faster than expf(), but lower-precision
of our expl() implementation are incomplete.

> Modified: head/lib/msun/src/e_expf.c
> ==============================================================================
> --- head/lib/msun/src/e_expf.c	Fri Oct 19 22:46:48 2012	(r241755)
> +++ head/lib/msun/src/e_expf.c	Fri Oct 19 22:47:44 2012	(r241756)
> @@ -21,6 +21,68 @@ __FBSDID("$FreeBSD$");
> #include "math.h"
> #include "math_private.h"
>
> +/* __ieee754_expf

Here the comment matches the code, unlike for logf.

> + * Returns the exponential of x.

Banal comment which does less than echo the code.  The function name tells
us that it returns the exponential of x in float precision.

> ...
> + *   2. Approximation of exp(r) by a special rational function on
> + *      the interval [0,0.34658]:
> + *      Write
> + *          R(r**2) = r*(exp(r)+1)/(exp(r)-1) = 2 + r*r/6 - r**4/360 + ...
> + *      We use a special Remes algorithm on [0,0.34658] to generate
> + *      a polynomial of degree 2 to approximate R. The maximum error
> + *      of this polynomial approximation is bounded by 2**-27. In
> + *      other words,

The last 3 sentences are missing the spelling and punctuation errors
in log*.

The magic 2**-27 is better decomented as 2**-27.74 in my changes,
except possibly for the rounding.

> + *          R(z) ~ 2.0 + P1*z + P2*z*z
> + *      (where z=r*r, and the values of P1 and P2 are listed below)
> + *      and
> + *          |              2          |     -27
> + *          | 2.0+P1*z+P2*z   -  R(z) | <= 2
> + *          |                         |

-27 duplicated again.

This part of the approximation is routine and shouldn't be documented
using verbose pretty-printing.

> + *      The computation of expf(r) thus becomes
> + *                             2*r
> + *             expf(r) = 1 + -------
> + *                            R - r
> + *                                 r*R1(r)
> + *                     = 1 + r + ----------- (for better accuracy)
> + *                                2 - R1(r)
> + *      where
> + *                               2       4
> + *              R1(r) = r - (P1*r  + P2*r)

This part is not routine, so it is good to verbosely pretty-print it.  But
only once.

> + * Accuracy:
> + *      according to an error analysis, the error is always less than
> + *      0.5013 ulp (unit in the last place).

Just wrong.  An error of 2**-27 can't possibly get close to 0.5013.  It
gives a theoretical error of at least 0.5 + 2**(-27 - -24) = 0.625.  We
intentional don't try form more since the algorithm has even larger
fundamental errors elsewhere.  2**-27 wouldn't be good enough if we
couldn't do exhaustive testing to check that the other errors really
do dominate.  Exhaustive testing shows the following errors:

- i386 (old version with gcc -O): 0.5092
- i386 (cur version with clang): <0.5092
- i386 (cur version with gcc -O): 0.5807
- i386 (cur version with clang): <0.5807
- i386 (with gcc -O0 and/or -ffloat-store): larger, much like amd64
- amd64 (al versions):            0.9101

With amd64 or -O0, we just get the larger errors elsewhere.  With the
old version, the errors were smaller due to a bug that normally gave
extra precision accidentally, but which could just give wrong results
in theory, so I fixed it.  The fix unfortunately lost some of the
accidental extra precision.  The bug showed up as clang giving different
accidental extra precision.  It still gives some, but not as much as
before.  The sources could be modified to the more than the accidental
extra precision non-accidentally by using float_t in float precision
and double_t plus FP_PE rounding precision in double precision.  This
requires care with types mainly at the place where non-accidental
extra precision gives bugs unless handled carefully.  Otherwise, the
changes are simply s/float/float_t/ for local variables, etc. plus fixing
float_t for clang...


> + *
> + * Misc. info.
> + *      For IEEE float
> + *          if x >  8.8721679688e+01 then exp(x) overflow
> + *          if x < -1.0397208405e+02 then exp(x) underflow

These constants are better documented in the code that declares them
as o_threshold and u_threshold.  This gives the values, so the values
shouldn't be repeated here.  There are perhaps not quite enough comments
attached to the code.  In s_expl.c, there are larger comments giving the
expressions for calculating these constants.  But the trickiest details
for using these constants are not commented on at all.  For efficiency,
classification involves a mixture of fuzzy classification using bits and
precise classification using these constants

> + *
> + * Constants:
> + * The hexadecimal values are the intended ones for the following
> + * constants. The decimal values may be used, provided that the
> + * compiler will convert from decimal to binary accurately enough
> + * to produce the hexadecimal values shown.
> + */

Boilerplate FUD for pre-C90 duplicated ad nauseum more than before.

Bruce



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20121020164315.Q1095>