Date: Sun, 23 Mar 2003 19:24:05 +0100
From: Till Riedel <till@f111.hadiko.de>
To: freebsd-current@freebsd.org
Subject: Re: libm problem
Message-ID: <20030323182405.GA2135@f111.hadiko.de>
In-Reply-To: <200303231843.16545.michaelnottebrock@gmx.net>
On Sun, Mar 23, 2003 at 06:43:16PM +0100, Michael Nottebrock wrote:
> On Sunday 23 March 2003 18:02, Till Riedel wrote:
> > why not
> > +_CPUCFLAGS = -march=pentium4 -mno-sse2
> >
> > > choose, and in the case of pentium4 producing broken code the
> > > obvious fallback would be pentium3...
> >
> > above would be in fact the same because only the SSE2 code differs from
> > march=pentium3 which in turn only defines SSE additionally (which
> > probably generates the slower code compared to pentiumpro) as i see it.
> > code generation for all x86 uses the same rules (i386.md)
> > except that some rules only apply if TARGET_SSE2 is defined.
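I at least now know to some extent what makes -march=pentium4 slow: someone at
gcc hacked a stupid cost table for its operations. This makes pentium4
fast again (the diff runs from the patched i386.c to i386.c.orig, so the
PROCESSOR_PENTIUMPRO line is the new one):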
*** i386.c Sun Mar 23 17:32:38 2003
--- i386.c.orig Sun Mar 23 17:45:35 2003
***************
*** 893,895 ****
{"pentium3", PROCESSOR_PENTIUMPRO, PTA_MMX | PTA_SSE | PTA_PREFETCH_SSE},
! {"pentium4", PROCESSOR_PENTIUMPRO, PTA_SSE | PTA_SSE2 |
PTA_MMX | PTA_PREFETCH_SSE},
--- 893,895 ----
{"pentium3", PROCESSOR_PENTIUMPRO, PTA_MMX | PTA_SSE | PTA_PREFETCH_SSE},
! {"pentium4", PROCESSOR_PENTIUM4, PTA_SSE | PTA_SSE2 | PTA_MMX | PTA_PREFETCH_SSE},
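For readers who do not know that table, here is a minimal sketch of the mechanism as I read the diff: the -march name selects a processor type plus a set of feature flags, and the processor type in turn selects the cost table. The sketch models the patched state, where pentium4 keeps its SSE2 flag but borrows the pentiumpro costs. All identifiers (cpu_alias, find_alias, cost_for, the F_* flags) and all cost numbers are invented for illustration; they are not the real gcc definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified model; not the actual gcc data structures. */
enum processor { PROC_PENTIUMPRO, PROC_PENTIUM4, PROC_COUNT };

enum { F_MMX = 1, F_SSE = 2, F_SSE2 = 4 };

/* Per-processor instruction costs; the numbers are made up. */
struct costs { int add; int shift; int mul; };

static const struct costs cost_table[PROC_COUNT] = {
    [PROC_PENTIUMPRO] = { 1, 1, 4 },
    [PROC_PENTIUM4]   = { 1, 8, 30 },
};

/* One row per -march name: which cost table to use, which features to enable. */
struct cpu_alias { const char *name; enum processor proc; unsigned flags; };

static const struct cpu_alias alias_table[] = {
    { "pentiumpro", PROC_PENTIUMPRO, 0 },
    { "pentium3",   PROC_PENTIUMPRO, F_MMX | F_SSE },
    /* Patched row: SSE2 stays enabled, but the pentiumpro costs are used. */
    { "pentium4",   PROC_PENTIUMPRO, F_MMX | F_SSE | F_SSE2 },
};

/* Look up a -march name; returns NULL for an unknown name. */
static const struct cpu_alias *find_alias(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof alias_table / sizeof alias_table[0]; i++)
        if (strcmp(alias_table[i].name, name) == 0)
            return &alias_table[i];
    return NULL;
}

/* Cost table that code generation would consult for this -march name. */
static const struct costs *cost_for(const char *name)
{
    const struct cpu_alias *a = find_alias(name);
    return a ? &cost_table[a->proc] : NULL;
}
```

In this model cost_for("pentium4") and cost_for("pentium3") return the same table, and the two rows differ only in the F_SSE2 flag, which matches the claim that -march=pentium4 -mno-sse2 should behave like pentium3 after the patch.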
>
> Just out of curiosity, have you tried using -mfpmath=sse? I remember someone
> on this list claiming that the SSE fpa-code works much better than the i387
> code which is used by default (even with -march=pentium4).
It seems to be equally fast with the whetstone benchmark on pentium3,
but it makes pentium4 (sse2) slower, because most sse2 rules depend on i387 math.
Here are some results after the cost patch above:
-march=pentiumpro
whetstone took: 1.05 secs for 954 MFLOPS (w/ math lib)
whetstone took: 0.28 secs for 3555 MFLOPS (w/o math lib)
-march=pentium3
whetstone took: 1.05 secs for 954 MFLOPS (w/ math lib)
whetstone took: 0.28 secs for 3556 MFLOPS (w/o math lib)
-march=pentium3 -mfpmath=sse
whetstone took: 1.05 secs for 953 MFLOPS (w/ math lib)
whetstone took: 0.28 secs for 3555 MFLOPS (w/o math lib)
-march=pentium4
whetstone took: 1.06 secs for 942 MFLOPS (w/ math lib)
whetstone took: 0.29 secs for 3393 MFLOPS (w/o math lib)
-march=pentium4 -mno-sse2 (after the patch this should be the same as pentium3)
whetstone took: 1.05 secs for 954 MFLOPS (w/ math lib)
whetstone took: 0.28 secs for 3555 MFLOPS (w/o math lib)
-march=pentium4 -mfpmath=sse
whetstone took: 1.14 secs for 880 MFLOPS (w/ math lib)
whetstone took: 0.36 secs for 2768 MFLOPS (w/o math lib)
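
To check which instruction sets a given flag combination actually enables, a small probe like the following can help. It is a sketch that assumes the compiler predefines __SSE__ and __SSE2__ when the corresponding instructions are allowed, as gcc does; the function names are made up for this example.

```c
#include <assert.h>
#include <stdio.h>

/* 1 if the compiler was allowed to emit SSE instructions, else 0. */
static int have_sse(void)
{
#ifdef __SSE__
    return 1;
#else
    return 0;
#endif
}

/* 1 if the compiler was allowed to emit SSE2 instructions, else 0. */
static int have_sse2(void)
{
#ifdef __SSE2__
    return 1;
#else
    return 0;
#endif
}

/* 0 = no SSE, 1 = SSE only, 2 = SSE and SSE2. */
static int feature_level(void)
{
    int sse = have_sse(), sse2 = have_sse2();
    assert(!sse2 || sse);   /* with gcc, enabling SSE2 also enables SSE */
    return sse + sse2;
}

/* Print the feature set this translation unit was compiled for. */
static void print_features(void)
{
    printf("SSE=%d SSE2=%d level=%d\n", have_sse(), have_sse2(), feature_level());
}
```

Calling print_features() from main in a file built once with -march=pentium3 and once with -march=pentium4 -mno-sse2 should report the same level if the two settings really enable the same features.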
till
