Date:      Tue, 13 May 2003 11:04:17 -0700
From:      David Schultz <das@FreeBSD.ORG>
To:        Mikhail Teterin <Mikhail.Teterin@murex.com>
Cc:        freebsd-bugs@FreeBSD.ORG
Subject:   Re: bin/43299: march=pentium4 miscompiles msun/src/e_pow.c
Message-ID:  <20030513180417.GA4917@HAL9000.homeunix.com>
In-Reply-To: <200305131610.h4DGA8vx007871@freefall.freebsd.org>
References:  <200305131610.h4DGA8vx007871@freefall.freebsd.org>

On Tue, May 13, 2003, Mikhail Teterin wrote:
> The following reply was made to PR bin/43299; it has been noted by GNATS.
> 
> From: Mikhail Teterin <Mikhail.Teterin@murex.com>
> To: freebsd-gnats-submit@FreeBSD.org
> Cc: bde@FreeBSD.org
> Subject: Re: bin/43299: march=pentium4 miscompiles msun/src/e_pow.c
> Date: Mon, 12 May 2003 14:26:35 -0400
> 
>  The problem with our libm (msun) vs. gcc-3 is reproducible on Linux:
>  
>  	(Note, FreeBSD's /usr is mounted as /misha on the Linux machine)
>  
>  mteterin@nylinux:lib/msun/src (982) cc -v
>  Reading specs from /usr/lib/gcc-lib/i386-redhat-linux/3.2/specs
>  Configured with: ../configure --prefix=/usr --mandir=/usr/share/man 
>  --infodir=/usr/share/info --enable-shared --enable-threads=posix 
>  --disable-checking --host=i386-redhat-linux --with-system-zlib 
>  --enable-__cxa_atexit
>  Thread model: posix
>  gcc version 3.2 20020903 (Red Hat Linux 8.0 3.2-7)
>  mteterin@nylinux:lib/msun/src (983) cc -O -march=pentium3 -I- -I. 
>  -I/misha/src/include -I/misha/src/sys -I/misha/obj/misha/src/i386/usr/include 
>  e_pow.c t.c e_sqrt.c -D__generic___ieee754_sqrt=__ieee754_sqrt
>  mteterin@nylinux:lib/msun/src (984) ./a.out                                     
>  2^2.1 is 4.28709
>  mteterin@nylinux:lib/msun/src (985) cc -O -march=pentium4 -I- -I. 
>  -I/misha/src/include -I/misha/src/sys -I/misha/obj/misha/src/i386/usr/include 
>  e_pow.c t.c e_sqrt.c -D__generic___ieee754_sqrt=__ieee754_sqrt
>  mteterin@nylinux:lib/msun/src (986) ./a.out                                     
>  2^2.1 is 0
>  
>  As can be seen above, using pentium3 produces the correct result, while
>  pentium4 produces the incorrect 0. We can, pretty much, rule out a kernel
>  problem in handling MMX/SSE. Which is it -- our __ieee754_pow or the gcc?

gcc is using SSE2 instructions (movd, movsd) in the Pentium 4 case:

[...]
-       fxch    %st(1)
-       fstpl   -56(%ebp)
-       movl    -56(%ebp), %ecx
-       movl    %ecx, -56(%ebp)
-       movl    %edi, -52(%ebp)
-       fldl    -56(%ebp)
+       movd    %edi, %xmm0
+       movsd   %xmm0, -64(%ebp)
+       fldl    -64(%ebp)
[...]

It's possible that gcc screwed something up wrt alignment or has
some sort of bug in the generation of SSE instructions.
__ieee754_pow() looks okay in that it doesn't seem to do anything
naughty with type punning, use uninitialized values, etc.  It
might be useful to construct a simpler test case so the specific
offending assembly can be identified.  Given that most of the
__ieee754_pow() code is special cases about NaNs and infinities,
it shouldn't be too hard to iteratively pare it down to the
line(s) that cause the discrepancy between the pentium3 and
pentium4 builds.


