Skip site navigation (1)Skip section navigation (2)
Date:      Sun, 21 Oct 2012 21:08:49 -0600
From:      Warner Losh <imp@bsdimp.com>
To:        Bruce Evans <brde@optusnet.com.au>
Cc:        svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org, Warner Losh <imp@freebsd.org>
Subject:   Re: svn commit: r241755 - head/lib/msun/src
Message-ID:  <18177777-6EE0-4103-98B0-272EFF98FE96@bsdimp.com>
In-Reply-To: <20121020150917.I1095@besplex.bde.org>
References:  <201210192246.q9JMkm4R092929@svn.freebsd.org> <20121020150917.I1095@besplex.bde.org>

next in thread | previous in thread | raw e-mail | index | archive | help
Feel free to fix them however.  I added the comments because the =
algorithms weren't quite the same...  If you have a better way, feel =
free to back my stuff out on the way to it.

Warner

On Oct 19, 2012, at 11:43 PM, Bruce Evans wrote:

> On Fri, 19 Oct 2012, Warner Losh wrote:
>=20
>> Log:
>> Document the methods used to compute logf. Taken and edited from the
>> double version, with adaptations for the differences between it and
>> the float version.
>=20
> Please back this out.
>=20
> This was intentionally left out.  It is painful to maintain large
> comments that should be identical for all precisions.  Double =
precision
> is primary, and only comments that depend on the precision should be
> given in other precisions, except for short comments to to right of
> coulde whose non-duplication would just give larger diffs.  If you
> want to duplicate them, then they would often have to be =
quadruplicated
> in 4 files for the 4 supported precisions:
> - double precision
> - float precision
> - long double with MANT_DIG =3D 64 and 16 other bits
> - long double with MANT_DIG =3D 113 and 15 other bits
> Only 4 now, but there could easily be variations for long doubles.
>=20
> log* will be replaced by my version soon.  One of the organizational
> things that I'm currently struggling with is how not to duplicate the
> comments in them.  Currently I duplicate then in 4 files, and don't
> quite duplicate then in a 5th file for optimized float precision.
> Maintaining them has been very painful.  They are about 3 times as =
large
> as the comments here, and have more and more-delicate =
precision-dependent
> details, so trimming them is harder than here.  They do cover all of
> log*(), log1p*(), log2*() and log10*() in the same file, so there is
> not as much duplication as for the different e_log*.c files.
>=20
> Another organizational problem is how to organize the functions.  I
> currently have 4 primary interfaces per file (should have a couple
> more for kernels), and 1 file for each precision, and currently use
> inlines and ifdefs to avoid these inlines, but the inlines are
> not properly optimized for all cases of interest, and break debugging
> in most cases, so I plan to switch to more macro hackery (like =
multiple
> #include __FILE__'s with ifdefs to select what is compiled for each
> #include) .  Even more macro hackery would let me put support for all
> precisions in the same files, so duplicated comments would be even =
less
> useful, but the precision-dependent ones would be even harder to write
> and read.
>=20
> We are developing inverse trig functions, and handle the comments and
> other things by generating code for all precisions from the double
> precision case.  Large comments are stripped as part of the =
generation.
> This is easier to write but not so good to read.  This is only =
feasible
> because the code is not very precision-dependent.
>=20
>> Modified: head/lib/msun/src/e_logf.c
>> =
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D
>> --- head/lib/msun/src/e_logf.c	Fri Oct 19 22:21:01 2012	=
(r241754)
>> +++ head/lib/msun/src/e_logf.c	Fri Oct 19 22:46:48 2012	=
(r241755)
>> @@ -19,6 +19,57 @@ __FBSDID("$FreeBSD$");
>> #include "math.h"
>> #include "math_private.h"
>>=20
>> +/* __ieee754_log(x)
>=20
> Here the comment doesn't even match the code.  This is =
__ieee754_logf(x),
> not __ieee754_log().  Technically, this is a comment that depends on =
the
> precision, so it should be given separately, but the differences for =
type
> suffixes are especially painful to maintain in comments and especially
> useless to document in comments.
>=20
>> + * Return the logrithm of x
>=20
> This duplicates e_log.c to a fault.  Fixing this spelling error would
> require touching multiple files.
>=20
>> + *
>> + * Method :
>> + *   1. Argument Reduction: find k and f such that
>> + *                      x =3D 2^k * (1+f),
>> + *         where  sqrt(2)/2 < 1+f < sqrt(2) .
>> + *
>> + *   2. Approximation of log(1+f).
>> + *      Let s =3D f/(2+f) ; based on log(1+f) =3D log(1+s) - =
log(1-s)
>> + *               =3D 2s + 2/3 s**3 + 2/5 s**5 + .....,
>> + *               =3D 2s + s*R
>=20
> Nothing new here.  I don't want to see it again.
>=20
>> + *      We use a special Reme algorithm on [0,0.1716] to generate
>> + *      a polynomial of degree 8 to approximate R The maximum error
>> + *      of this polynomial approximation is bounded by 2**-34.24. In
>> + *      other words,
>=20
> Lots more spelling and grammar errors.  I'm not sure if the correct =
spelling
> is Remez or Remes, but I'm sure it isn't Reme.  One sentence break is
> missing a "." and the others are consistently too short.
>=20
> The 2**-34.24 number is precision-dependent and is documented in more
> completely below using my automatically generated comment that gives a
> consistent format.  (I haven't merged this improvement back into =
e_log.c.)
> Now there is duplication even within the same file :-(.
>=20
>> + *                      2      4      6      8
>> + *          R(z) ~ Lg1*s +Lg2*s +Lg3*s +Lg4*s
>=20
> The number of coeffs is indeed precision-dependent, but uninteresting.
> My constent format for poly coeffs doesn't mention them in comments.
> It is enough to spell the constants for them in a consistent way
> (consistent across all sources, not just log*).
>=20
>> + *      (the values of Lg1 to Lg7 are listed in the program)
>=20
> Here the comment doesn't even match the code.  There is no Lg5 to
> Lg7 for float precision.
>=20
>> + *      and
>> + *          |      2          8           |     -34.24
>> + *          | Lg1*s +...+Lg4*s    -  R(z) | <=3D 2
>> + *          |                             |
>=20
> Another excessively verbose comment duplicated.  It has the magic =
number
> -34.24 again, so this number is now triplicated.
>=20
>> + *      Note that 2s =3D f - s*f =3D f - hfsq + s*hfsq, where hfsq =3D=
 f*f/2.
>> + *      In order to guarantee error in log below 1ulp, we compute =
log
>> + *      by
>> + *              log(1+f) =3D f - s*(f - R)        (if f is not too =
large)
>> + *              log(1+f) =3D f - (hfsq - s*(hfsq+R)).     (better =
accuracy)
>> + *
>> + *      3. Finally,  log(x) =3D k*ln2 + log(1+f).
>> + *                          =3D =
k*ln2_hi+(f-(hfsq-(s*(hfsq+R)+k*ln2_lo)))
>> + *         Here ln2 is split into two floating point number:
>> + *                      ln2_hi + ln2_lo,
>=20
> Nothing new here.  I don't want to see it again.
>=20
>> + *         where n*ln2_hi is always exact for |n| < 2000.
>=20
> Now just wrong.  2000 is for double precision.  2048 would be better =
there.
> It is what we arrange for, to go slightly above 1024.)  Here we need =
to
> go slightly above 128, and arrange for < 256.  This is one of the =
points
> documented better in my log*.  Its documentation belongs closer to the
> declaration of ln2_hi where you can see the bits being left out of it.
> But my comments for this grew too large.  They describe complications =
like
> using the same splitting for double precision as for long double =
precision
> (go slightly above 16384 instead of slightly abive 1024, at an =
insignificant
> cost of precision), and sharing ln2_hi/lo with a table value where =
even
> more bits sometimes have to be left out of ln2_hi, depending on =
complicated
> options and the precision.  They described the details of why the =
number
> of bits chosen is needed, and gave the acceptable ranges but have been
> reduced a bit closer to saying something like "ln2_hi is rounded to a
> number of bits that works [to satisfy constraints explained =
elsewhere".
>=20
>> + *
>> + * Special cases:
>> + *      log(x) is NaN with signal if x < 0 (including -INF) ;
>> + *      log(+INF) is +INF; log(0) is -INF with signal;
>> + *      log(NaN) is that NaN with no signal.
>=20
> Nothing new here.  I don't want to see it again.
>=20
>> + *
>> + * Accuracy:
>> + *      according to an error analysis, the error is always less =
than
>> + *      1 ulp (unit in the last place).
>=20
> Actually, expf() has been exhaustively tested, so its error is known
> much more precisely than the error analysis can give.  The details are
> too MD and unimportant to give here.
>=20
> e_log.c is not the place to describe what an ulp is.  Garbage should
> be removed before duplicating things.
>=20
>> + *
>> + * Constants:
>> + * The hexadecimal values are the intended ones for the following
>> + * constants. The decimal values may be used, provided that the
>> + * compiler will convert from decimal to binary accurately enough
>> + * to produce the hexadecimal values shown.
>> + */
>=20
> More non-useful boilerplate.  This comment is FUD about pre-C90 =
compilers,
> since compilers couldn't be trusted to round correctly long ago.  =
After
> C99, we could use hex constants to reduce rounding problems, but I =
don't
> want to churn the sources for this, and prefer decimal constants in =
most
> cases, with hex values in comments (mainly for debugging -- hex =
constants
> tend to be more usefull iff you are looking at hex values in a =
register).
>=20
>> +
>> static const float
>> ln2_hi =3D   6.9313812256e-01,	/* 0x3f317180 */
>> ln2_lo =3D   9.0580006145e-06,	/* 0x3717f7d1 */
>>=20
>=20
> This still uses fdlibm's old format in which the hex values are given =
as
> pseudo-literals in hex in comments.  The couldn't use C99 hex literals
> because C99 didn't exist.  They are more readable in this form, since
> C99 hex literals handle exponents poorly, so it is hard to see what =
they
> are in bits and you just can't express the special cases where hex is
> most needed (Infs and NaNs) as C99 hex literals.  But when I optimized
> and otherwise improved the LgN* contants, I had not settled on a =
format
> and didn't even translate to the above format like I did in some other
> places, so my LgN* declarations are of the form '<C99 literal in hex>;
> /* <decimal constant> */'.
>=20
> I was also sloppy with the -34.24 constant when I did that.  It seems
> to be rounded in the wrong direction.  I now get -34.22 (rounded to
> nearest) and round it in a safe direction to -34.2.  Note that this
> constant doesn't quite allow the accuracy in the poly to be read off
> as 34.2 bits.  It is scaled relative to s, but the accuracy in bits
> should be scaled relative to |log(1+s)-log(1-s)| rounded down to a
> power of 2.  This is ~2*s.  Normally for such estimates, scaling by
> s is off by a factor of at most 2 so we can read off an accuracy of
> 33.2 bits, but here the factor of 2 in 2*s means that we can read off
> an accuracy of 33.2 +- 1 bits.  Anyway, the -34.24 has lots of spare
> bits, so its minor inaccuracies don't matter and can be be found by
> exhaustive checking.  But it is strictly wrong when its value is given
> in constants.  e_log.c gives the related bound of 2**-58.45.  This is
> scaled by s, so it takes minor interpretation too.  It seems to be
> correctly rounded.
>=20
> Bruce




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?18177777-6EE0-4103-98B0-272EFF98FE96>