Date:      Tue, 10 Apr 2012 11:45:18 +1000 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        David Schultz <das@FreeBSD.org>
Cc:        svn-src-head@FreeBSD.org, Tijl Coosemans <tijl@FreeBSD.org>, src-committers@FreeBSD.org, svn-src-all@FreeBSD.org
Subject:   Re: svn commit: r232491 - in head/sys: amd64/include i386/include pc98/include x86/include
Message-ID:  <20120410111448.I1081@besplex.bde.org>
In-Reply-To: <20120409165335.GA13177@zim.MIT.EDU>
References:  <201203041400.q24E0WcS037398@svn.freebsd.org> <20120409165335.GA13177@zim.MIT.EDU>

On Mon, 9 Apr 2012, David Schultz wrote:

> On Sun, Mar 04, 2012, Tijl Coosemans wrote:
>> Log:
>>   Copy amd64 float.h to x86 and merge with i386 float.h. Replace
>>   amd64/i386/pc98 float.h with stubs.
> [...]
>> --- head/sys/amd64/include/float.h	Sun Mar  4 12:52:48 2012	(r232490, copy source)
>> +++ head/sys/x86/include/float.h	Sun Mar  4 14:00:32 2012	(r232491)
>> @@ -42,7 +42,11 @@ __END_DECLS
>>  #define FLT_RADIX	2		/* b */
>>  #define FLT_ROUNDS	__flt_rounds()
>>  #if __ISO_C_VISIBLE >= 1999
>> +#ifdef _LP64
>>  #define	FLT_EVAL_METHOD	0		/* no promotions */
>> +#else
>> +#define	FLT_EVAL_METHOD	(-1)		/* i387 semantics are...interesting */
>> +#endif
>>  #define	DECIMAL_DIG	21		/* max precision in decimal digits */
>>  #endif
>
> The implication of this code is that FLT_EVAL_METHOD depends on
> the size of a long, which it does not.  Instead, it depends on
> whether SSE2 support is guaranteed to be present.  If anything,
> the test should be something like #ifndef __i386__.

Actually, it depends on whether both SSE1 and SSE2 support are
guaranteed to be used.  The i386 ifdef is wrong too (as is the old
fixed value for i386), since clang with SSE support breaks the abstract
i386 machine by actually using SSE; with gcc, this breakage is under
control of the option -mfpmath=unit, which defaults to unit=387 on i386.
Also, float_t and double_t must match FLT_EVAL_METHOD.

I use the following hack to work around the clang breakage in libm:

% Index: math.h
% ===================================================================
% RCS file: /home/ncvs/src/lib/msun/src/math.h,v
% retrieving revision 1.82
% diff -u -2 -r1.82 math.h
% --- math.h	12 Nov 2011 19:55:48 -0000	1.82
% +++ math.h	4 Jan 2012 05:09:51 -0000
% @@ -125,4 +130,10 @@
%      : __signbitl(x))
% 
% +#ifdef __SSE_MATH__
% +#define	__float_t	float
% +#endif
% +#ifdef __SSE2_MATH__
% +#define	__double_t	double
% +#endif
%  typedef	__double_t	double_t;
%  typedef	__float_t	float_t;

I forgot to hack on FLT_EVAL_METHOD similarly.  The fixed value of (-1)
for i386 is sort of fail-safe, since it says that the evaluation method
is indeterminate, so the code must assume the worst.  The normal i386
types for float_t and double_t are also sort of fail-safe, since they
are larger than necessary.  They just cause pessimal code.  So would
FLT_EVAL_METHOD = -1, and I only hacked on the types since my tests
only cover the pessimizations for the types.
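
A minimal sketch of what a similar hack for FLT_EVAL_METHOD might look like, mirroring the math.h change above. It is an illustration only, not the committed float.h: the macro name SKETCH_FLT_EVAL_METHOD and the helper sketch_flt_eval_method() are hypothetical, and the sketch assumes the compiler defines __SSE_MATH__ and __SSE2_MATH__ exactly when it evaluates float and double expressions using SSE.

```c
#include <assert.h>
#include <float.h>

/*
 * Hypothetical sketch: claim "no promotions" only when both float and
 * double arithmetic are known to use SSE; otherwise stay indeterminate.
 */
#if defined(_LP64) || defined(__LP64__)
#define SKETCH_FLT_EVAL_METHOD	0	/* 64-bit case, as in the diff above */
#elif defined(__SSE_MATH__) && defined(__SSE2_MATH__)
#define SKETCH_FLT_EVAL_METHOD	0	/* SSE1 and SSE2 both used for FP */
#else
#define SKETCH_FLT_EVAL_METHOD	(-1)	/* i387 or mixed: assume the worst */
#endif

/* The sketch only ever produces one of these two values. */
static_assert(SKETCH_FLT_EVAL_METHOD == 0 || SKETCH_FLT_EVAL_METHOD == -1,
    "unexpected evaluation method");

/* Hypothetical accessor so the chosen value can be inspected at run time. */
static int
sketch_flt_eval_method(void)
{
	return (SKETCH_FLT_EVAL_METHOD);
}
```

With only SSE1 (__SSE_MATH__ defined but not __SSE2_MATH__), the sketch keeps -1, which agrees with double_t staying long double in the math.h hack.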

Note that the compiler builtin __FLT_EVAL_METHOD__ is unusable, since its
value is almost always wrong.  With gcc, it is wrong by default (2) but
is changed correctly to 0 by -mfpmath=sse.  With clang, it is wrong
by default (0), but becomes correct with SSE1 and SSE2.  With only SSE1,
there are even more possibilities for the float evaluation method, but
doubles must be evaluated using the i387 so FLT_EVAL_METHOD must remain
as -1.
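
One way to see whether the headers agree with each other is to compare the sizes of float_t and double_t against what C99 requires for a given FLT_EVAL_METHOD. The following is a rough sketch; the function name types_consistent_with() is hypothetical, and comparing sizes is only an approximation of comparing types.

```c
#include <assert.h>
#include <float.h>
#include <math.h>

/* C99 guarantees these orderings regardless of the evaluation method. */
static_assert(sizeof(float_t) >= sizeof(float), "float_t too narrow");
static_assert(sizeof(double_t) >= sizeof(float_t), "double_t too narrow");

/*
 * Hypothetical helper: return 1 if the sizes of float_t and double_t
 * are what C99 requires for the given evaluation method, else 0.
 * Methods other than 0, 1 and 2 leave the types implementation-defined,
 * so anything is accepted for them.
 */
static int
types_consistent_with(int method)
{
	switch (method) {
	case 0:		/* float_t is float, double_t is double */
		return (sizeof(float_t) == sizeof(float) &&
		    sizeof(double_t) == sizeof(double));
	case 1:		/* both are double */
		return (sizeof(float_t) == sizeof(double) &&
		    sizeof(double_t) == sizeof(double));
	case 2:		/* both are long double */
		return (sizeof(float_t) == sizeof(long double) &&
		    sizeof(double_t) == sizeof(long double));
	default:	/* indeterminate or implementation-defined */
		return (1);
	}
}
```

Calling types_consistent_with(FLT_EVAL_METHOD) checks the header value; where the compiler defines it, types_consistent_with(__FLT_EVAL_METHOD__) checks the builtin in the same way.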

Examples:
- clang -march=athlon-xp.  Athlon-XP only has SSE1, and clang evaluates
   float expressions using SSE1 but double expressions using i387.  This
   matches float_t = float and double_t = long double given by the above.
   FLT_EVAL_METHOD = -1 remains correct.
- similarly for gcc -march=athlon-xp -mfpmath=sse.
- clang -march=athlon64.  Athlon64 has both SSE1 and SSE2, and clang
   evaluates both float and double expressions using SSE*.  This matches
   float_t = float and double_t = double given by the above.
   FLT_EVAL_METHOD = -1 is now wrong.
- similarly for gcc -march=athlon64 -mfpmath=sse.  SSE* use can also be
   controlled by -msse or -msse2 (instead of -march), but -mfpmath doesn't
   distinguish between SSE1 and SSE2, so there seems to be no way to
   use SSE2 generally and SSE1 for FP without also using SSE2 for FP.
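
The flag combinations above can be checked empirically with a small run-time probe. This is a sketch under assumptions: the function names are hypothetical, the rounding mode is assumed to be round-to-nearest, and the probe only detects extra mantissa precision. In particular, if the i387 precision control is set to 53 bits, the double probe reports no extra precision even though the i387 is in use.

```c
#include <assert.h>
#include <float.h>

static_assert(FLT_RADIX == 2, "probe assumes binary floating point");

/*
 * Hypothetical probe: return 1 if a float expression keeps more
 * precision than float.  1 + 2^-24 is a tie in float and rounds to 1
 * (round-to-nearest-even), but is exact in any wider format.
 * volatile keeps the compiler from folding the expression.
 */
static int
float_has_extra_precision(void)
{
	volatile float one = 1.0f;
	volatile float half_ulp = FLT_EPSILON / 2;	/* 2^-24 */

	return ((one + half_ulp) - one != 0);
}

/* Same idea for double: 1 + 2^-53 rounds to 1 in a 53-bit mantissa. */
static int
double_has_extra_precision(void)
{
	volatile double one = 1.0;
	volatile double half_ulp = DBL_EPSILON / 2;	/* 2^-53 */

	return ((one + half_ulp) - one != 0);
}
```

Under the assumptions stated, the -march=athlon-xp cases would be expected to report no extra precision for float only, and the -march=athlon64 cases none for either type.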

Bruce


