From owner-freebsd-bugs@FreeBSD.ORG Tue Nov 23 23:50:10 2010 Return-Path: Delivered-To: freebsd-bugs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 25B56106566B for ; Tue, 23 Nov 2010 23:50:10 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id E30158FC0A for ; Tue, 23 Nov 2010 23:50:09 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id oANNo9HE073388 for ; Tue, 23 Nov 2010 23:50:09 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id oANNo9g4073387; Tue, 23 Nov 2010 23:50:09 GMT (envelope-from gnats) Date: Tue, 23 Nov 2010 23:50:09 GMT Message-Id: <201011232350.oANNo9g4073387@freefall.freebsd.org> To: freebsd-bugs@FreeBSD.org From: Alexander Best Cc: Subject: Re: kern/133583: [libm] fma(3) does not respect rounding mode using extended precision X-BeenThere: freebsd-bugs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Alexander Best List-Id: Bug reports List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 23 Nov 2010 23:50:10 -0000 The following reply was made to PR kern/133583; it has been noted by GNATS. From: Alexander Best To: bug-followup@freebsd.org Cc: Subject: Re: kern/133583: [libm] fma(3) does not respect rounding mode using extended precision Date: Tue, 23 Nov 2010 23:40:18 +0000 this is what bruce evans wrote concerning this issue. cheers. alex ----- Forwarded message from Bruce Evans ----- Date: Tue, 16 Nov 2010 05:44:03 +1100 (EST) From: Bruce Evans To: Alexander Best cc: Bruce Evans , Ulrich Spoerlein , das@freebsfd.org Subject: Re: svn commit: r215237 - head/lib/msun/src [Cc trimmed] On Mon, 15 Nov 2010, Alexander Best wrote: >if you are interested in solving two more msun mysteries, you might want to >have a look at #PR kern/133583 and standards/143358. Here is a quick fix for #133583. % Index: s_fma.c % =================================================================== % RCS file: /home/ncvs/src/lib/msun/src/s_fma.c,v % retrieving revision 1.5 % diff -u -2 -r1.5 s_fma.c % --- s_fma.c 3 Apr 2008 06:14:51 -0000 1.5 % +++ s_fma.c 15 Nov 2010 17:44:48 -0000 % @@ -170,5 +170,5 @@ % zs = ldexp(zs, -spread); % r = c + zs; % - s = r - c; % + *(volatile *)&s = r - c; % rr = (c - (r - s)) + (zs - s) + cc; % The basic problem is that s_fma.c uses Dekker's algorithm, which assumes working floating point, i.e., not the floating point given by gcc on i387. FreeBSD defaults to rounding to double precision so as to mostly avoid this problem for doubles, and msun tries to and mostly succeeds in avoiding it using direct methods for floats (using STRICT_ASSIGN() and sometimes a more direct volatile cast or variable), but setting extended precision as the test program does exposes all double precision functions to it. This is 1 reason why the regression tests shouldn't configure extended precision for double precision tests (unless they want to test this, and we're not really ready for this since many other double precision functions are known to be broken when extended precision is configured). The problem was obvious since it went away with -O0 and with -ffloat-store. The above patch is the result of sprinkling volatiles around until I found one that worked and then removing the ones that made no difference. Perhaps many more are needed. In particular, the splitting into high and low parts won't work as intended. The test in the PR somehow worked without fixing this, probably just because it uses special args that are already have enough low bits. Note that STRICT_ASSIGN() doesn't work for doubles (is just null), since it is intentionally optimized for the default FreeBSD rounding precision, and using it (when it is not null) is a large pessimization. The following functions in msun/src (and few or none elsewhere) use it: % e_rem_pio2.c: STRICT_ASSIGN(double,fn,x*invpio2+0x1.8p52); % e_rem_pio2f.c: STRICT_ASSIGN(double,fn,x*invpio2+0x1.8p52); % k_rem_pio2.c: STRICT_ASSIGN(double,fw,fw); % s_exp2.c: STRICT_ASSIGN(double, t, x + redux); % s_exp2f.c: STRICT_ASSIGN(float, t, x + redux); % s_log1p.c: STRICT_ASSIGN(double,u,1.0+x); % s_log1pf.c: STRICT_ASSIGN(float,u,(float)1.0+x); % s_rint.c: STRICT_ASSIGN(double,w,TWO52[sx]+x); % s_rint.c: STRICT_ASSIGN(double,w,TWO52[sx]+x); % s_rintf.c: STRICT_ASSIGN(float,w,TWO23[sx]+x); % s_rintf.c: STRICT_ASSIGN(float,w,TWO23[sx]+x); All the double-precision uses here are null and are just clones of the float precision code where the use avoids a problem that occurs in the usual case. Since the double precision uses are null, all the functions with such uses are broken if someone configures extended precision. This includes all trig functions (via *_rem_pio2.c). I have fixed this locally only for the trig functions. In k_rem_pio2_.c, I essentially just use a variant of STRICT_ASSIGN(), but a better way is to add and subtract a bias (something like 0x1.8p53 (52 DBL_MANT_DIG - 1) to push the unwanted bits to oblivion). I use such a bias to avoid the STRICT_ASSIGN() in in s_exp2f.c: % --- s_exp2f.c Fri Feb 22 14:46:43 2008 % +++ z22/s_exp2f.c Fri Feb 22 14:48:37 2008 % @@ -33,14 +33,50 @@ % #include "math_private.h" % % +/* To be moved nearer to . */ % +#if FLT_EVAL_METHOD == 0 % +#define FLT_T_MANT_DIG FLT_MANT_DIG % +#else % +/* % + * XXX this hack works for all supported arches. There is some doubt that % + * float_t actually is double on arm and powerpc as claimed in , % + * but it is certainly not long double, and the reduction only needs % + * float_t to be at least as large as the evaluation precision. % + */ % +#define FLT_T_MANT_DIG DBL_MANT_DIG % +#endif % + % #define TBLBITS 4 % #define TBLSIZE (1 << TBLBITS) % % +/* % + * Ensure that redux fits in a float. float_t would always work, but % + * float works on all supported arches and is more efficient on some. % + */ % +#if FLT_T_MANT_DIG >= FLT_MAX_EXP + TBLBITS % +#error "Unsupported type for float_t" % +#endif % + % static const float % huge = 0x1p100f, % - redux = 0x1.8p23f / TBLSIZE, % + redux = __CONCAT(0x1.8p, FLT_T_MANT_DIG) / 2 / TBLSIZE, % +/* % + * Domain [-0.03125, 0.03125], range ~[-6.015e-11, 5.960e-11] % + * |exp2(x) - p(x)| < 2**-33.95 % + */ % P1 = 0x1.62e430p-1f, % P2 = 0x1.ebfbe0p-3f, % P3 = 0x1.c6b348p-5f, % P4 = 0x1.3b2c9cp-7f; % +#if 0 % +/* % + * Domain [-0.03125, 0.03125], range ~[-2.4881e-12, 2.4881e-12]: % + * |exp2(x) - p(x)| < 2**-38.54 % + */ % +static const double % + P1 = 6.9314718016256749e-1, /* 0x162e42fec39c72.0p-53 */ % + P2 = 2.4022650689211292e-1, /* 0x1ebfbdff5df1dd.0p-55 */ % + P3 = 5.5505736316289703e-2, /* 0x1c6b3f74700eea.0p-57 */ % + P4 = 9.6183434022792860e-3; /* 0x13b2c832d5fd3b.0p-59 */ % +#endif % % static volatile float twom100 = 0x1p-100f; % @@ -69,4 +105,8 @@ % * % * Accuracy: Peak error < 0.501 ulp; location of peak: -0.030110927. % +#if 0 % + * Accuracy: Peak error < 0.50004 ulp; location of peak: -0.00918653142 % + * 5168 cases out of 2**32 are incorrectly rounded in round-to-nearest mode. % +#endif % * % * Method: (equally-spaced tables) % @@ -95,5 +135,5 @@ % { % double tv, twopk, u, z; % - float t; % + float_t t; % uint32_t hx, ix, i0; % int32_t k; % @@ -118,12 +158,18 @@ % % /* Reduce x, computing z, i0, and k. */ % - STRICT_ASSIGN(float, t, x + redux); % + t = x + redux; % +#if FLT_T_MANT_DIG == 24 % GET_FLOAT_WORD(i0, t); % +#elif FLT_T_MANT_DIG == 53 % + GET_LOW_WORD(i0, t); % +#else % +#error "Unsupported type for float_t" % +#endif % ... This also has complications to use float_t. float_t will normally be double or long double, and the bias (redux here) depends on the number of bits in the type used in the redux. The basic idea is to use the natural type for operations (float_t, not float here), so that adding and subtracting biases just works and the correct bias can be known at compile time. This is painful to configure (see above), is broken by anywone configuring extra precision (e.g., the 53 in the above depends on the precision remaining at double). My uncommitted logl() uses a modified subset of Dekker's algorithm without hitting the problems with extra precision or needing to use STRICT_ASSIGN() to avoid them (any extra precision just gives extra precision in the result). I think the additive part doesn't really care about extra precision. It The multicative part cares since it needs the hi*hi multiplications to be exact, so the hi parts must not have extra bits. Bruce ----- End forwarded message ----- -- a13x