Date: Tue, 13 May 2003 11:04:17 -0700
From: David Schultz <das@FreeBSD.ORG>
To: Mikhail Teterin <Mikhail.Teterin@murex.com>
Cc: freebsd-bugs@FreeBSD.ORG
Subject: Re: bin/43299: march=pentium4 miscompiles msun/src/e_pow.c
Message-ID: <20030513180417.GA4917@HAL9000.homeunix.com>
In-Reply-To: <200305131610.h4DGA8vx007871@freefall.freebsd.org>
On Tue, May 13, 2003, Mikhail Teterin wrote:
> The following reply was made to PR bin/43299; it has been noted by GNATS.
>
> From: Mikhail Teterin <Mikhail.Teterin@murex.com>
> To: freebsd-gnats-submit@FreeBSD.org
> Cc: bde@FreeBSD.org
> Subject: Re: bin/43299: march=pentium4 miscompiles msun/src/e_pow.c
> Date: Mon, 12 May 2003 14:26:35 -0400
>
>  The problem with our libm (msun) vs. gcc-3 is reproduceable on Linux:
>
>  (Note, FreeBSD's /usr is mounted as /misha on the Linux machine)
>
>  mteterin@nylinux:lib/msun/src (982) cc -v
>  Reading specs from /usr/lib/gcc-lib/i386-redhat-linux/3.2/specs
>  Configured with: ../configure --prefix=/usr --mandir=/usr/share/man
>  --infodir=/usr/share/info --enable-shared --enable-threads=posix
>  --disable-checking --host=i386-redhat-linux --with-system-zlib
>  --enable-__cxa_atexit
>  Thread model: posix
>  gcc version 3.2 20020903 (Red Hat Linux 8.0 3.2-7)
>  mteterin@nylinux:lib/msun/src (983) cc -O -march=pentium3 -I- -I.
>  -I/misha/src/include -I/misha/src/sys -I/misha/obj/misha/src/i386/usr/include
>  e_pow.c t.c e_sqrt.c -D__generic___ieee754_sqrt=__ieee754_sqrt
>  mteterin@nylinux:lib/msun/src (984) ./a.out
>  2^2.1 is 4.28709
>  mteterin@nylinux:lib/msun/src (985) cc -O -march=pentium4 -I- -I.
>  -I/misha/src/include -I/misha/src/sys -I/misha/obj/misha/src/i386/usr/include
>  e_pow.c t.c e_sqrt.c -D__generic___ieee754_sqrt=__ieee754_sqrt
>  mteterin@nylinux:lib/msun/src (986) ./a.out
>  2^2.1 is 0
>
>  As can be seen above, using pentium3 produces the correct result, while
>  pentium4 produces the incorrect 0. We can, pretty much, rule out a kernel
>  problem in handling MMX/SSE. Which is it -- our __ieee754_pow or the gcc?

gcc is using SSE instructions in the Pentium 4 case:

	[...]
	-	fxch	%st(1)
	-	fstpl	-56(%ebp)
	-	movl	-56(%ebp), %ecx
	-	movl	%ecx, -56(%ebp)
	-	movl	%edi, -52(%ebp)
	-	fldl	-56(%ebp)
	+	movd	%edi, %xmm0
	+	movsd	%xmm0, -64(%ebp)
	+	fldl	-64(%ebp)
	[...]
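For reference, the x87 sequence removed in the diff appears to store a double to the stack, rewrite its upper 32 bits from %edi, and reload it. A minimal C sketch of that operation follows; the helper name set_high_word is hypothetical (it is not the macro from msun), and the sketch only illustrates the intended effect, not what gcc actually emits.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: return d with its upper 32 bits replaced by hi,
 * keeping the lower 32 bits unchanged. memcpy is used instead of a
 * pointer cast so the code does not depend on type punning.
 */
double set_high_word(double d, uint32_t hi)
{
	uint64_t bits;

	memcpy(&bits, &d, sizeof bits);
	bits = ((uint64_t)hi << 32) | (bits & 0xffffffffu);
	memcpy(&d, &bits, sizeof d);
	return d;
}
```

For example, set_high_word(1.0, 0x40000000) yields 2.0, because only the exponent field in the upper word changes.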
It's possible that gcc screwed something up with respect to alignment, or has some sort of bug in the generation of SSE instructions. __ieee754_pow() looks okay in that it doesn't seem to do anything naughty with type punning, use uninitialized values, etc.

It might be useful to construct a simpler test case so the specific offending assembly can be identified. Given that most of the __ieee754_pow() code is special cases about NaNs and infinities, it shouldn't be too hard to iteratively pare it down to the line(s) that cause the discrepancy between the pentium3 and pentium4 builds.
