Date: Tue, 10 Apr 2012 11:45:18 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
To: David Schultz <das@FreeBSD.org>
Cc: svn-src-head@FreeBSD.org, Tijl Coosemans <tijl@FreeBSD.org>, src-committers@FreeBSD.org, svn-src-all@FreeBSD.org
Subject: Re: svn commit: r232491 - in head/sys: amd64/include i386/include pc98/include x86/include
Message-ID: <20120410111448.I1081@besplex.bde.org>
In-Reply-To: <20120409165335.GA13177@zim.MIT.EDU>
References: <201203041400.q24E0WcS037398@svn.freebsd.org> <20120409165335.GA13177@zim.MIT.EDU>

On Mon, 9 Apr 2012, David Schultz wrote:

> On Sun, Mar 04, 2012, Tijl Coosemans wrote:
>> Log:
>>   Copy amd64 float.h to x86 and merge with i386 float.h. Replace
>>   amd64/i386/pc98 float.h with stubs.
> [...]
>> --- head/sys/amd64/include/float.h Sun Mar 4 12:52:48 2012 (r232490, copy source)
>> +++ head/sys/x86/include/float.h Sun Mar 4 14:00:32 2012 (r232491)
>> @@ -42,7 +42,11 @@ __END_DECLS
>>  #define FLT_RADIX 2 /* b */
>>  #define FLT_ROUNDS __flt_rounds()
>>  #if __ISO_C_VISIBLE >= 1999
>> +#ifdef _LP64
>>  #define FLT_EVAL_METHOD 0 /* no promotions */
>> +#else
>> +#define FLT_EVAL_METHOD (-1) /* i387 semantics are...interesting */
>> +#endif
>>  #define DECIMAL_DIG 21 /* max precision in decimal digits */
>>  #endif
>
> The implication of this code is that FLT_EVAL_METHOD depends on
> the size of a long, which it does not. Instead, it depends on
> whether SSE2 support is guaranteed to be present. If anything,
> the test should be something like #ifndef __i386__.

Actually, it depends on whether both SSE1 and SSE2 support are guaranteed
to be used. The i386 ifdef is wrong too (as is the old fixed value for
i386), since clang with SSE support breaks the abstract i386 machine by
actually using SSE; with gcc, this breakage is under control of the
option -mfpmath=unit, which defaults to unit=387. Also, float_t and
double_t must match FLT_EVAL_METHOD.

I use the following hack to work around the clang breakage in libm:

% Index: math.h
% ===================================================================
% RCS file: /home/ncvs/src/lib/msun/src/math.h,v
% retrieving revision 1.82
% diff -u -2 -r1.82 math.h
% --- math.h 12 Nov 2011 19:55:48 -0000 1.82
% +++ math.h 4 Jan 2012 05:09:51 -0000
% @@ -125,4 +130,10 @@
%       : __signbitl(x))
%
% +#ifdef __SSE_MATH__
% +#define __float_t float
% +#endif
% +#ifdef __SSE2_MATH__
% +#define __double_t double
% +#endif
%  typedef __double_t double_t;
%  typedef __float_t float_t;

I forgot to hack on FLT_EVAL_METHOD similarly.

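For illustration only, here is a minimal sketch of what a matching hack on
FLT_EVAL_METHOD might look like. It keys off the same __SSE_MATH__ and
__SSE2_MATH__ macros as the math.h hack and is meant for the x86 header
only. SKETCH_FLT_EVAL_METHOD and sketch_flt_eval_method() are hypothetical
names made up for this sketch; nothing like this was committed as far as
this message says.

```c
#include <assert.h>

/*
 * SKETCH_FLT_EVAL_METHOD is a hypothetical name used for illustration;
 * it is not a macro from FreeBSD's <float.h>.
 */
#if defined(__SSE_MATH__) && defined(__SSE2_MATH__)
/* float and double arithmetic both use SSE: no promotions. */
#define SKETCH_FLT_EVAL_METHOD 0
#else
/* Some arithmetic may still go through the i387: indeterminate. */
#define SKETCH_FLT_EVAL_METHOD (-1)
#endif

/* Return the sketched value so that a test program can inspect it. */
int
sketch_flt_eval_method(void)
{
    int method = SKETCH_FLT_EVAL_METHOD;

    /* By construction the sketch never yields any other value. */
    assert(method == 0 || method == -1);
    return (method);
}
```

Compiling this with different -march and -mfpmath settings and printing
the result of sketch_flt_eval_method() from a small main() is one way to
see which branch a given compiler configuration selects.
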
The fixed value of (-1) for i386 is sort of fail-safe, since it says that
the evaluation method is indeterminate, so the code must assume the worst.
The normal i386 types for float_t and double_t are also sort of fail-safe,
since they are larger than necessary. They just cause pessimal code. So
would FLT_EVAL_METHOD = -1, and I only hacked on the types since my tests
only cover the pessimizations for the types.

Note that the compiler builtin __FLT_EVAL_METHOD__ is unusable, since its
value is almost always wrong. With gcc, it is wrong by default (2) but
is changed correctly to 0 by -mfpmath=sse. With clang, it is wrong by
default (0), but becomes correct with SSE1 and SSE2.

With only SSE1, there are even more possibilities for the float evaluation
method, but doubles must be evaluated using the i387 so FLT_EVAL_METHOD
must remain as -1. Examples:
- clang -march=athlon-xp. Athlon-XP only has SSE1, and clang evaluates
  float expressions using SSE1 but double expressions using i387. This
  matches float_t = float and double_t = long double given by the above.
  FLT_EVAL_METHOD = -1 remains correct.
- similarly for gcc -march=athlon-xp -mfpmath=sse.
- clang -march=athlon64. Athlon64 has both SSE1 and SSE2, and clang
  evaluates both float and double expressions using SSE*. This matches
  float_t = float and double_t = double given by the above.
  FLT_EVAL_METHOD = -1 is now wrong.
- similarly for gcc -march=athlon64 -mfpmath=sse.

SSE* use can also be controlled by -msse[12] (instead of march), but
-mfpmath doesn't distinguish between SSE1 and SSE2, so there seems to be
no way to use SSE2 generally and SSE1 for FP without also using SSE2 for
FP.

Bruce
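
The four examples above can be restated as a small lookup, which may make
the pattern easier to check. This is only a sketch: the enum, the struct
and eval_config_for() are hypothetical names, and the two arguments stand
in for "__SSE_MATH__ is defined" and "__SSE2_MATH__ is defined". It
assumes, as the message implies, that the unhacked i386 types for float_t
and double_t are both long double.

```c
#include <assert.h>

/* Hypothetical names, used for this sketch only. */
enum eval_type { EVAL_FLOAT, EVAL_DOUBLE, EVAL_LONG_DOUBLE };

struct eval_config {
    enum eval_type float_type;   /* what float_t would be */
    enum eval_type double_type;  /* what double_t would be */
    int flt_eval_method;         /* value FLT_EVAL_METHOD should have */
};

/*
 * sse_math and sse2_math are nonzero when the compiler would define
 * __SSE_MATH__ and __SSE2_MATH__ respectively.
 */
struct eval_config
eval_config_for(int sse_math, int sse2_math)
{
    struct eval_config c;

    /* Assumption: SSE2 math without SSE1 math is not a case to handle. */
    assert(!sse2_math || sse_math);

    /* Same selection as the math.h hack; long double is the i386 default. */
    c.float_type = sse_math ? EVAL_FLOAT : EVAL_LONG_DOUBLE;
    c.double_type = sse2_math ? EVAL_DOUBLE : EVAL_LONG_DOUBLE;

    /* 0 ("no promotions") is right only when nothing is left on the i387. */
    c.flt_eval_method = (sse_math && sse2_math) ? 0 : -1;
    return (c);
}
```

Here eval_config_for(1, 0) corresponds to the athlon-xp examples and
eval_config_for(1, 1) to the athlon64 ones; eval_config_for(0, 0) is the
plain i387 case with the fixed value of -1.
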