Date: Sun, 21 Oct 2012 21:08:49 -0600 From: Warner Losh <imp@bsdimp.com> To: Bruce Evans <brde@optusnet.com.au> Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org, Warner Losh <imp@freebsd.org> Subject: Re: svn commit: r241755 - head/lib/msun/src Message-ID: <18177777-6EE0-4103-98B0-272EFF98FE96@bsdimp.com> In-Reply-To: <20121020150917.I1095@besplex.bde.org> References: <201210192246.q9JMkm4R092929@svn.freebsd.org> <20121020150917.I1095@besplex.bde.org>
next in thread | previous in thread | raw e-mail | index | archive | help
Feel free to fix them however. I added the comments because the = algorithms weren't quite the same... If you have a better way, feel = free to back my stuff out on the way to it. Warner On Oct 19, 2012, at 11:43 PM, Bruce Evans wrote: > On Fri, 19 Oct 2012, Warner Losh wrote: >=20 >> Log: >> Document the methods used to compute logf. Taken and edited from the >> double version, with adaptations for the differences between it and >> the float version. >=20 > Please back this out. >=20 > This was intentionally left out. It is painful to maintain large > comments that should be identical for all precisions. Double = precision > is primary, and only comments that depend on the precision should be > given in other precisions, except for short comments to to right of > coulde whose non-duplication would just give larger diffs. If you > want to duplicate them, then they would often have to be = quadruplicated > in 4 files for the 4 supported precisions: > - double precision > - float precision > - long double with MANT_DIG =3D 64 and 16 other bits > - long double with MANT_DIG =3D 113 and 15 other bits > Only 4 now, but there could easily be variations for long doubles. >=20 > log* will be replaced by my version soon. One of the organizational > things that I'm currently struggling with is how not to duplicate the > comments in them. Currently I duplicate then in 4 files, and don't > quite duplicate then in a 5th file for optimized float precision. > Maintaining them has been very painful. They are about 3 times as = large > as the comments here, and have more and more-delicate = precision-dependent > details, so trimming them is harder than here. They do cover all of > log*(), log1p*(), log2*() and log10*() in the same file, so there is > not as much duplication as for the different e_log*.c files. >=20 > Another organizational problem is how to organize the functions. I > currently have 4 primary interfaces per file (should have a couple > more for kernels), and 1 file for each precision, and currently use > inlines and ifdefs to avoid these inlines, but the inlines are > not properly optimized for all cases of interest, and break debugging > in most cases, so I plan to switch to more macro hackery (like = multiple > #include __FILE__'s with ifdefs to select what is compiled for each > #include) . Even more macro hackery would let me put support for all > precisions in the same files, so duplicated comments would be even = less > useful, but the precision-dependent ones would be even harder to write > and read. >=20 > We are developing inverse trig functions, and handle the comments and > other things by generating code for all precisions from the double > precision case. Large comments are stripped as part of the = generation. > This is easier to write but not so good to read. This is only = feasible > because the code is not very precision-dependent. >=20 >> Modified: head/lib/msun/src/e_logf.c >> = =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D >> --- head/lib/msun/src/e_logf.c Fri Oct 19 22:21:01 2012 = (r241754) >> +++ head/lib/msun/src/e_logf.c Fri Oct 19 22:46:48 2012 = (r241755) >> @@ -19,6 +19,57 @@ __FBSDID("$FreeBSD$"); >> #include "math.h" >> #include "math_private.h" >>=20 >> +/* __ieee754_log(x) >=20 > Here the comment doesn't even match the code. This is = __ieee754_logf(x), > not __ieee754_log(). Technically, this is a comment that depends on = the > precision, so it should be given separately, but the differences for = type > suffixes are especially painful to maintain in comments and especially > useless to document in comments. >=20 >> + * Return the logrithm of x >=20 > This duplicates e_log.c to a fault. Fixing this spelling error would > require touching multiple files. >=20 >> + * >> + * Method : >> + * 1. Argument Reduction: find k and f such that >> + * x =3D 2^k * (1+f), >> + * where sqrt(2)/2 < 1+f < sqrt(2) . >> + * >> + * 2. Approximation of log(1+f). >> + * Let s =3D f/(2+f) ; based on log(1+f) =3D log(1+s) - = log(1-s) >> + * =3D 2s + 2/3 s**3 + 2/5 s**5 + ....., >> + * =3D 2s + s*R >=20 > Nothing new here. I don't want to see it again. >=20 >> + * We use a special Reme algorithm on [0,0.1716] to generate >> + * a polynomial of degree 8 to approximate R The maximum error >> + * of this polynomial approximation is bounded by 2**-34.24. In >> + * other words, >=20 > Lots more spelling and grammar errors. I'm not sure if the correct = spelling > is Remez or Remes, but I'm sure it isn't Reme. One sentence break is > missing a "." and the others are consistently too short. >=20 > The 2**-34.24 number is precision-dependent and is documented in more > completely below using my automatically generated comment that gives a > consistent format. (I haven't merged this improvement back into = e_log.c.) > Now there is duplication even within the same file :-(. >=20 >> + * 2 4 6 8 >> + * R(z) ~ Lg1*s +Lg2*s +Lg3*s +Lg4*s >=20 > The number of coeffs is indeed precision-dependent, but uninteresting. > My constent format for poly coeffs doesn't mention them in comments. > It is enough to spell the constants for them in a consistent way > (consistent across all sources, not just log*). >=20 >> + * (the values of Lg1 to Lg7 are listed in the program) >=20 > Here the comment doesn't even match the code. There is no Lg5 to > Lg7 for float precision. >=20 >> + * and >> + * | 2 8 | -34.24 >> + * | Lg1*s +...+Lg4*s - R(z) | <=3D 2 >> + * | | >=20 > Another excessively verbose comment duplicated. It has the magic = number > -34.24 again, so this number is now triplicated. >=20 >> + * Note that 2s =3D f - s*f =3D f - hfsq + s*hfsq, where hfsq =3D= f*f/2. >> + * In order to guarantee error in log below 1ulp, we compute = log >> + * by >> + * log(1+f) =3D f - s*(f - R) (if f is not too = large) >> + * log(1+f) =3D f - (hfsq - s*(hfsq+R)). (better = accuracy) >> + * >> + * 3. Finally, log(x) =3D k*ln2 + log(1+f). >> + * =3D = k*ln2_hi+(f-(hfsq-(s*(hfsq+R)+k*ln2_lo))) >> + * Here ln2 is split into two floating point number: >> + * ln2_hi + ln2_lo, >=20 > Nothing new here. I don't want to see it again. >=20 >> + * where n*ln2_hi is always exact for |n| < 2000. >=20 > Now just wrong. 2000 is for double precision. 2048 would be better = there. > It is what we arrange for, to go slightly above 1024.) Here we need = to > go slightly above 128, and arrange for < 256. This is one of the = points > documented better in my log*. Its documentation belongs closer to the > declaration of ln2_hi where you can see the bits being left out of it. > But my comments for this grew too large. They describe complications = like > using the same splitting for double precision as for long double = precision > (go slightly above 16384 instead of slightly abive 1024, at an = insignificant > cost of precision), and sharing ln2_hi/lo with a table value where = even > more bits sometimes have to be left out of ln2_hi, depending on = complicated > options and the precision. They described the details of why the = number > of bits chosen is needed, and gave the acceptable ranges but have been > reduced a bit closer to saying something like "ln2_hi is rounded to a > number of bits that works [to satisfy constraints explained = elsewhere". >=20 >> + * >> + * Special cases: >> + * log(x) is NaN with signal if x < 0 (including -INF) ; >> + * log(+INF) is +INF; log(0) is -INF with signal; >> + * log(NaN) is that NaN with no signal. >=20 > Nothing new here. I don't want to see it again. >=20 >> + * >> + * Accuracy: >> + * according to an error analysis, the error is always less = than >> + * 1 ulp (unit in the last place). >=20 > Actually, expf() has been exhaustively tested, so its error is known > much more precisely than the error analysis can give. The details are > too MD and unimportant to give here. >=20 > e_log.c is not the place to describe what an ulp is. Garbage should > be removed before duplicating things. >=20 >> + * >> + * Constants: >> + * The hexadecimal values are the intended ones for the following >> + * constants. The decimal values may be used, provided that the >> + * compiler will convert from decimal to binary accurately enough >> + * to produce the hexadecimal values shown. >> + */ >=20 > More non-useful boilerplate. This comment is FUD about pre-C90 = compilers, > since compilers couldn't be trusted to round correctly long ago. = After > C99, we could use hex constants to reduce rounding problems, but I = don't > want to churn the sources for this, and prefer decimal constants in = most > cases, with hex values in comments (mainly for debugging -- hex = constants > tend to be more usefull iff you are looking at hex values in a = register). >=20 >> + >> static const float >> ln2_hi =3D 6.9313812256e-01, /* 0x3f317180 */ >> ln2_lo =3D 9.0580006145e-06, /* 0x3717f7d1 */ >>=20 >=20 > This still uses fdlibm's old format in which the hex values are given = as > pseudo-literals in hex in comments. The couldn't use C99 hex literals > because C99 didn't exist. They are more readable in this form, since > C99 hex literals handle exponents poorly, so it is hard to see what = they > are in bits and you just can't express the special cases where hex is > most needed (Infs and NaNs) as C99 hex literals. But when I optimized > and otherwise improved the LgN* contants, I had not settled on a = format > and didn't even translate to the above format like I did in some other > places, so my LgN* declarations are of the form '<C99 literal in hex>; > /* <decimal constant> */'. >=20 > I was also sloppy with the -34.24 constant when I did that. It seems > to be rounded in the wrong direction. I now get -34.22 (rounded to > nearest) and round it in a safe direction to -34.2. Note that this > constant doesn't quite allow the accuracy in the poly to be read off > as 34.2 bits. It is scaled relative to s, but the accuracy in bits > should be scaled relative to |log(1+s)-log(1-s)| rounded down to a > power of 2. This is ~2*s. Normally for such estimates, scaling by > s is off by a factor of at most 2 so we can read off an accuracy of > 33.2 bits, but here the factor of 2 in 2*s means that we can read off > an accuracy of 33.2 +- 1 bits. Anyway, the -34.24 has lots of spare > bits, so its minor inaccuracies don't matter and can be be found by > exhaustive checking. But it is strictly wrong when its value is given > in constants. e_log.c gives the related bound of 2**-58.45. This is > scaled by s, so it takes minor interpretation too. It seems to be > correctly rounded. >=20 > Bruce
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?18177777-6EE0-4103-98B0-272EFF98FE96>