From owner-svn-src-head@FreeBSD.ORG Sat Oct 20 05:43:20 2012 Return-Path: Delivered-To: svn-src-head@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 7C14E58B; Sat, 20 Oct 2012 05:43:20 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail13.syd.optusnet.com.au (mail13.syd.optusnet.com.au [211.29.132.194]) by mx1.freebsd.org (Postfix) with ESMTP id 24F498FC08; Sat, 20 Oct 2012 05:43:18 +0000 (UTC) Received: from c122-106-175-26.carlnfd1.nsw.optusnet.com.au (c122-106-175-26.carlnfd1.nsw.optusnet.com.au [122.106.175.26]) by mail13.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id q9K5h6C9027918 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Sat, 20 Oct 2012 16:43:09 +1100 Date: Sat, 20 Oct 2012 16:43:06 +1100 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Warner Losh Subject: Re: svn commit: r241755 - head/lib/msun/src In-Reply-To: <201210192246.q9JMkm4R092929@svn.freebsd.org> Message-ID: <20121020150917.I1095@besplex.bde.org> References: <201210192246.q9JMkm4R092929@svn.freebsd.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-Cloudmark-Score: 0 X-Optus-Cloudmark-Analysis: v=2.0 cv=eLtLFwV1 c=1 sm=1 a=xI64QrV9ptIA:10 a=kj9zAlcOel0A:10 a=PO7r1zJSAAAA:8 a=JzwRw_2MAAAA:8 a=Aet6fyW9sl8A:10 a=cvw0RnjcknDQsnrTvV4A:9 a=CjuIK1q_8ugA:10 a=V4IsmBEFW7KwAuKY:21 a=eyMMwmjJ-5K38dzR:21 a=bxQHXO5Py4tHmhUgaywp5w==:117 Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org X-BeenThere: svn-src-head@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: SVN commit messages for the src tree for head/-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 20 Oct 2012 05:43:20 -0000 On Fri, 19 Oct 2012, Warner Losh wrote: > Log: > Document the methods used to compute logf. Taken and edited from the > double version, with adaptations for the differences between it and > the float version. Please back this out. This was intentionally left out. It is painful to maintain large comments that should be identical for all precisions. Double precision is primary, and only comments that depend on the precision should be given in other precisions, except for short comments to to right of coulde whose non-duplication would just give larger diffs. If you want to duplicate them, then they would often have to be quadruplicated in 4 files for the 4 supported precisions: - double precision - float precision - long double with MANT_DIG = 64 and 16 other bits - long double with MANT_DIG = 113 and 15 other bits Only 4 now, but there could easily be variations for long doubles. log* will be replaced by my version soon. One of the organizational things that I'm currently struggling with is how not to duplicate the comments in them. Currently I duplicate then in 4 files, and don't quite duplicate then in a 5th file for optimized float precision. Maintaining them has been very painful. They are about 3 times as large as the comments here, and have more and more-delicate precision-dependent details, so trimming them is harder than here. They do cover all of log*(), log1p*(), log2*() and log10*() in the same file, so there is not as much duplication as for the different e_log*.c files. Another organizational problem is how to organize the functions. I currently have 4 primary interfaces per file (should have a couple more for kernels), and 1 file for each precision, and currently use inlines and ifdefs to avoid these inlines, but the inlines are not properly optimized for all cases of interest, and break debugging in most cases, so I plan to switch to more macro hackery (like multiple #include __FILE__'s with ifdefs to select what is compiled for each #include) . Even more macro hackery would let me put support for all precisions in the same files, so duplicated comments would be even less useful, but the precision-dependent ones would be even harder to write and read. We are developing inverse trig functions, and handle the comments and other things by generating code for all precisions from the double precision case. Large comments are stripped as part of the generation. This is easier to write but not so good to read. This is only feasible because the code is not very precision-dependent. > Modified: head/lib/msun/src/e_logf.c > ============================================================================== > --- head/lib/msun/src/e_logf.c Fri Oct 19 22:21:01 2012 (r241754) > +++ head/lib/msun/src/e_logf.c Fri Oct 19 22:46:48 2012 (r241755) > @@ -19,6 +19,57 @@ __FBSDID("$FreeBSD$"); > #include "math.h" > #include "math_private.h" > > +/* __ieee754_log(x) Here the comment doesn't even match the code. This is __ieee754_logf(x), not __ieee754_log(). Technically, this is a comment that depends on the precision, so it should be given separately, but the differences for type suffixes are especially painful to maintain in comments and especially useless to document in comments. > + * Return the logrithm of x This duplicates e_log.c to a fault. Fixing this spelling error would require touching multiple files. > + * > + * Method : > + * 1. Argument Reduction: find k and f such that > + * x = 2^k * (1+f), > + * where sqrt(2)/2 < 1+f < sqrt(2) . > + * > + * 2. Approximation of log(1+f). > + * Let s = f/(2+f) ; based on log(1+f) = log(1+s) - log(1-s) > + * = 2s + 2/3 s**3 + 2/5 s**5 + ....., > + * = 2s + s*R Nothing new here. I don't want to see it again. > + * We use a special Reme algorithm on [0,0.1716] to generate > + * a polynomial of degree 8 to approximate R The maximum error > + * of this polynomial approximation is bounded by 2**-34.24. In > + * other words, Lots more spelling and grammar errors. I'm not sure if the correct spelling is Remez or Remes, but I'm sure it isn't Reme. One sentence break is missing a "." and the others are consistently too short. The 2**-34.24 number is precision-dependent and is documented in more completely below using my automatically generated comment that gives a consistent format. (I haven't merged this improvement back into e_log.c.) Now there is duplication even within the same file :-(. > + * 2 4 6 8 > + * R(z) ~ Lg1*s +Lg2*s +Lg3*s +Lg4*s The number of coeffs is indeed precision-dependent, but uninteresting. My constent format for poly coeffs doesn't mention them in comments. It is enough to spell the constants for them in a consistent way (consistent across all sources, not just log*). > + * (the values of Lg1 to Lg7 are listed in the program) Here the comment doesn't even match the code. There is no Lg5 to Lg7 for float precision. > + * and > + * | 2 8 | -34.24 > + * | Lg1*s +...+Lg4*s - R(z) | <= 2 > + * | | Another excessively verbose comment duplicated. It has the magic number -34.24 again, so this number is now triplicated. > + * Note that 2s = f - s*f = f - hfsq + s*hfsq, where hfsq = f*f/2. > + * In order to guarantee error in log below 1ulp, we compute log > + * by > + * log(1+f) = f - s*(f - R) (if f is not too large) > + * log(1+f) = f - (hfsq - s*(hfsq+R)). (better accuracy) > + * > + * 3. Finally, log(x) = k*ln2 + log(1+f). > + * = k*ln2_hi+(f-(hfsq-(s*(hfsq+R)+k*ln2_lo))) > + * Here ln2 is split into two floating point number: > + * ln2_hi + ln2_lo, Nothing new here. I don't want to see it again. > + * where n*ln2_hi is always exact for |n| < 2000. Now just wrong. 2000 is for double precision. 2048 would be better there. It is what we arrange for, to go slightly above 1024.) Here we need to go slightly above 128, and arrange for < 256. This is one of the points documented better in my log*. Its documentation belongs closer to the declaration of ln2_hi where you can see the bits being left out of it. But my comments for this grew too large. They describe complications like using the same splitting for double precision as for long double precision (go slightly above 16384 instead of slightly abive 1024, at an insignificant cost of precision), and sharing ln2_hi/lo with a table value where even more bits sometimes have to be left out of ln2_hi, depending on complicated options and the precision. They described the details of why the number of bits chosen is needed, and gave the acceptable ranges but have been reduced a bit closer to saying something like "ln2_hi is rounded to a number of bits that works [to satisfy constraints explained elsewhere". > + * > + * Special cases: > + * log(x) is NaN with signal if x < 0 (including -INF) ; > + * log(+INF) is +INF; log(0) is -INF with signal; > + * log(NaN) is that NaN with no signal. Nothing new here. I don't want to see it again. > + * > + * Accuracy: > + * according to an error analysis, the error is always less than > + * 1 ulp (unit in the last place). Actually, expf() has been exhaustively tested, so its error is known much more precisely than the error analysis can give. The details are too MD and unimportant to give here. e_log.c is not the place to describe what an ulp is. Garbage should be removed before duplicating things. > + * > + * Constants: > + * The hexadecimal values are the intended ones for the following > + * constants. The decimal values may be used, provided that the > + * compiler will convert from decimal to binary accurately enough > + * to produce the hexadecimal values shown. > + */ More non-useful boilerplate. This comment is FUD about pre-C90 compilers, since compilers couldn't be trusted to round correctly long ago. After C99, we could use hex constants to reduce rounding problems, but I don't want to churn the sources for this, and prefer decimal constants in most cases, with hex values in comments (mainly for debugging -- hex constants tend to be more usefull iff you are looking at hex values in a register). > + > static const float > ln2_hi = 6.9313812256e-01, /* 0x3f317180 */ > ln2_lo = 9.0580006145e-06, /* 0x3717f7d1 */ > This still uses fdlibm's old format in which the hex values are given as pseudo-literals in hex in comments. The couldn't use C99 hex literals because C99 didn't exist. They are more readable in this form, since C99 hex literals handle exponents poorly, so it is hard to see what they are in bits and you just can't express the special cases where hex is most needed (Infs and NaNs) as C99 hex literals. But when I optimized and otherwise improved the LgN* contants, I had not settled on a format and didn't even translate to the above format like I did in some other places, so my LgN* declarations are of the form '; /* */'. I was also sloppy with the -34.24 constant when I did that. It seems to be rounded in the wrong direction. I now get -34.22 (rounded to nearest) and round it in a safe direction to -34.2. Note that this constant doesn't quite allow the accuracy in the poly to be read off as 34.2 bits. It is scaled relative to s, but the accuracy in bits should be scaled relative to |log(1+s)-log(1-s)| rounded down to a power of 2. This is ~2*s. Normally for such estimates, scaling by s is off by a factor of at most 2 so we can read off an accuracy of 33.2 bits, but here the factor of 2 in 2*s means that we can read off an accuracy of 33.2 +- 1 bits. Anyway, the -34.24 has lots of spare bits, so its minor inaccuracies don't matter and can be be found by exhaustive checking. But it is strictly wrong when its value is given in constants. e_log.c gives the related bound of 2**-58.45. This is scaled by s, so it takes minor interpretation too. It seems to be correctly rounded. Bruce