From owner-svn-src-all@FreeBSD.ORG  Sat Oct 20 07:18:34 2012
Return-Path: <owner-svn-src-all@FreeBSD.ORG>
Delivered-To: svn-src-all@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id C17926D4;
 Sat, 20 Oct 2012 07:18:34 +0000 (UTC)
 (envelope-from brde@optusnet.com.au)
Received: from fallbackmx07.syd.optusnet.com.au
 (fallbackmx07.syd.optusnet.com.au [211.29.132.9])
 by mx1.freebsd.org (Postfix) with ESMTP id DEE238FC12;
 Sat, 20 Oct 2012 07:18:33 +0000 (UTC)
Received: from mail03.syd.optusnet.com.au (mail03.syd.optusnet.com.au
 [211.29.132.184])
 by fallbackmx07.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id
 q9K7BjlW015608; Sat, 20 Oct 2012 18:11:45 +1100
Received: from c122-106-175-26.carlnfd1.nsw.optusnet.com.au
 (c122-106-175-26.carlnfd1.nsw.optusnet.com.au [122.106.175.26])
 by mail03.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id q9K7BZSx028185
 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
 Sat, 20 Oct 2012 18:11:36 +1100
Date: Sat, 20 Oct 2012 18:11:35 +1100 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Warner Losh <imp@freebsd.org>
Subject: Re: svn commit: r241756 - head/lib/msun/src
In-Reply-To: <201210192247.q9JMljL4093098@svn.freebsd.org>
Message-ID: <20121020164315.Q1095@besplex.bde.org>
References: <201210192247.q9JMljL4093098@svn.freebsd.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Optus-Cloudmark-Score: 0
X-Optus-Cloudmark-Analysis: v=2.0 cv=fLlhRume c=1 sm=1 a=hNMMERRCccEA:10
 a=kj9zAlcOel0A:10 a=PO7r1zJSAAAA:8 a=JzwRw_2MAAAA:8 a=Aet6fyW9sl8A:10
 a=cYv7yatJ9UkkMbrVvNAA:9 a=CjuIK1q_8ugA:10 a=bxQHXO5Py4tHmhUgaywp5w==:117
Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org,
 src-committers@freebsd.org
X-BeenThere: svn-src-all@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: "SVN commit messages for the entire src tree \(except for &quot;
 user&quot; and &quot; projects&quot; \)" <svn-src-all.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/svn-src-all>,
 <mailto:svn-src-all-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/svn-src-all>
List-Post: <mailto:svn-src-all@freebsd.org>
List-Help: <mailto:svn-src-all-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/svn-src-all>,
 <mailto:svn-src-all-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 20 Oct 2012 07:18:34 -0000

On Fri, 19 Oct 2012, Warner Losh wrote:

> Log:
>  Document the method used to compute expf.  Taken from exp, with
>  changes to reflect differences in computation between the two.

Please back this out, as for logf.

We are a bit further from replacing fdlibm exp* by better versions
than for log*().  We have a better expl() that is faster and much more
accurate than exp() and almost faster than expf(), but lower-precision
versions of our expl() implementation are incomplete.

> Modified: head/lib/msun/src/e_expf.c
> ==============================================================================
> --- head/lib/msun/src/e_expf.c	Fri Oct 19 22:46:48 2012	(r241755)
> +++ head/lib/msun/src/e_expf.c	Fri Oct 19 22:47:44 2012	(r241756)
> @@ -21,6 +21,68 @@ __FBSDID("$FreeBSD$");
> #include "math.h"
> #include "math_private.h"
>
> +/* __ieee754_expf

Here the comment matches the code, unlike for logf.

> + * Returns the exponential of x.

Banal comment which does less than echo the code.  The function name tells
us that it returns the exponential of x in float precision.

> ...
> + *   2. Approximation of exp(r) by a special rational function on
> + *      the interval [0,0.34658]:
> + *      Write
> + *          R(r**2) = r*(exp(r)+1)/(exp(r)-1) = 2 + r*r/6 - r**4/360 + ...
> + *      We use a special Remes algorithm on [0,0.34658] to generate
> + *      a polynomial of degree 2 to approximate R. The maximum error
> + *      of this polynomial approximation is bounded by 2**-27. In
> + *      other words,

The last 3 sentences are missing the spelling and punctuation errors
in log*.

The magic 2**-27 is better documented as 2**-27.74 in my changes,
except possibly for the rounding.

> + *          R(z) ~ 2.0 + P1*z + P2*z*z
> + *      (where z=r*r, and the values of P1 and P2 are listed below)
> + *      and
> + *          |              2          |     -27
> + *          | 2.0+P1*z+P2*z   -  R(z) | <= 2
> + *          |                         |

-27 duplicated again.

This part of the approximation is routine and shouldn't be documented
using verbose pretty-printing.

> + *      The computation of expf(r) thus becomes
> + *                             2*r
> + *             expf(r) = 1 + -------
> + *                            R - r
> + *                                 r*R1(r)
> + *                     = 1 + r + ----------- (for better accuracy)
> + *                                2 - R1(r)
> + *      where
> + *                               2       4
> + *              R1(r) = r - (P1*r  + P2*r)

This part is not routine, so it is good to verbosely pretty-print it.  But
only once.

> + * Accuracy:
> + *      according to an error analysis, the error is always less than
> + *      0.5013 ulp (unit in the last place).

Just wrong.  An error of 2**-27 can't possibly get close to 0.5013.  It
gives a theoretical error of at least 0.5 + 2**(-27 - -24) = 0.625.  We
intentionally don't try for more since the algorithm has even larger
fundamental errors elsewhere.  2**-27 wouldn't be good enough if we
couldn't do exhaustive testing to check that the other errors really
do dominate.  Exhaustive testing shows the following errors:

- i386 (old version with gcc -O): 0.5092
- i386 (old version with clang): <0.5092
- i386 (cur version with gcc -O): 0.5807
- i386 (cur version with clang): <0.5807
- i386 (with gcc -O0 and/or -ffloat-store): larger, much like amd64
- amd64 (all versions):           0.9101

With amd64 or -O0, we just get the larger errors elsewhere.  With the
old version, the errors were smaller due to a bug that normally gave
extra precision accidentally, but which could just give wrong results
in theory, so I fixed it.  The fix unfortunately lost some of the
accidental extra precision.  The bug showed up as clang giving different
accidental extra precision.  It still gives some, but not as much as
before.  The sources could be modified to get more than the accidental
extra precision non-accidentally by using float_t in float precision
and double_t plus FP_PE rounding precision in double precision.  This
requires care with types mainly at the place where non-accidental
extra precision gives bugs unless handled carefully.  Otherwise, the
changes are simply s/float/float_t/ for local variables, etc. plus fixing
float_t for clang...
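
A rough sketch of the s/float/float_t/ idea, not the actual change.  It
assumes C99 float_t from <math.h>, reuses the quoted formula with
Taylor stand-ins for P1 and P2 (not the fitted values), and the function
name is hypothetical.  The one place needing care here is the final
narrowing, which is done by an explicit assignment to a float:

```c
#include <math.h>

/* exp(r) on the reduced interval with locals declared as float_t, so
 * that any extra precision the platform provides is used on purpose. */
static float
expf_reduced_float_t(float r)
{
	/* Stand-ins for P1 and P2: Taylor coefficients, NOT the values
	 * in e_expf.c. */
	const float_t P1 = 1.0f / 6.0f;
	const float_t P2 = -1.0f / 360.0f;
	float_t z, R1, y;
	float result;

	z = (float_t)r * r;
	R1 = r - z * (P1 + z * P2);
	y = 1 + r + r * R1 / (2 - R1);

	/* Discard the extra precision exactly once, here. */
	result = (float)y;
	return (result);
}
```

Where float_t is just float (FLT_EVAL_METHOD 0), this degenerates to
the plain float computation.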


> + *
> + * Misc. info.
> + *      For IEEE float
> + *          if x >  8.8721679688e+01 then exp(x) overflow
> + *          if x < -1.0397208405e+02 then exp(x) underflow

These constants are better documented in the code that declares them
as o_threshold and u_threshold.  This gives the values, so the values
shouldn't be repeated here.  There are perhaps not quite enough comments
attached to the code.  In s_expl.c, there are larger comments giving the
expressions for calculating these constants.  But the trickiest details
for using these constants are not commented on at all.  For efficiency,
classification involves a mixture of fuzzy classification using bits and
precise classification using these constants.
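
To illustrate what such a mixture can look like, here is a sketch only.
It is not the code in e_expf.c: the bit cutoff 0x42b00000 (|x| < 88.0)
is an assumption chosen for the example, the enum and function names
are hypothetical, and the two thresholds are the values quoted above:

```c
#include <stdint.h>
#include <string.h>

enum arg_class { ARG_IN_RANGE, ARG_OVERFLOW, ARG_UNDERFLOW, ARG_INF_OR_NAN };

static const float o_threshold = 8.8721679688e+01f;
static const float u_threshold = -1.0397208405e+02f;

static enum arg_class
classify_expf_arg(float x)
{
	uint32_t bits, hx;

	memcpy(&bits, &x, sizeof(bits));
	hx = bits & 0x7fffffff;		/* bits of |x| */

	/* Fuzzy test on bits: one integer compare sends the common
	 * case (|x| < 88.0) straight to the main path. */
	if (hx < 0x42b00000)
		return (ARG_IN_RANGE);
	if (hx >= 0x7f800000)
		return (ARG_INF_OR_NAN);

	/* Precise tests, reached only for large |x|. */
	if (x > o_threshold)
		return (ARG_OVERFLOW);
	if (x < u_threshold)
		return (ARG_UNDERFLOW);
	return (ARG_IN_RANGE);
}
```

The point of the split is that the floating-point comparisons against
the exact thresholds run only for the rare large arguments.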

> + *
> + * Constants:
> + * The hexadecimal values are the intended ones for the following
> + * constants. The decimal values may be used, provided that the
> + * compiler will convert from decimal to binary accurately enough
> + * to produce the hexadecimal values shown.
> + */

Boilerplate FUD for pre-C90 duplicated ad nauseam more than before.

Bruce