Date: Sun, 23 Mar 2003 19:24:05 +0100
From: Till Riedel <till@f111.hadiko.de>
To: freebsd-current@freebsd.org
Subject: Re: libm problem
Message-ID: <20030323182405.GA2135@f111.hadiko.de>
In-Reply-To: <200303231843.16545.michaelnottebrock@gmx.net>

On Sun, Mar 23, 2003 at 06:43:16PM +0100, Michael Nottebrock wrote:
> On Sunday 23 March 2003 18:02, Till Riedel wrote:
> > why not
> > +_CPUCFLAGS = -march=pentium4 -mno-sse2
> >
> > > choose, and in the case of pentium4 producing broken code the
> > > obvious fallback would be pentium3...
> >
> > above would be in fact the same because only the SSE2 code differs from
> > march=pentium3 which in turn only defines SSE additionally (which
> > probably generates the slower code compared to pentiumpro) as i see it.
> > code generation for all x86 uses the same rules (i386.md)
> > except that some rules only apply if TARGET_SSE2 is defined.

I now know, at least to some extent, what makes -march=pentium4 slow:
someone at gcc hacked a stupid cost table for its operations.
This makes pentium4 fast again (the diff runs from the patched i386.c to
i386.c.orig, so the PROCESSOR_PENTIUMPRO line is the new one):

*** i386.c      Sun Mar 23 17:32:38 2003
--- i386.c.orig Sun Mar 23 17:45:35 2003
***************
*** 893,895 ****
        {"pentium3", PROCESSOR_PENTIUMPRO, PTA_MMX | PTA_SSE | PTA_PREFETCH_SSE},
!       {"pentium4", PROCESSOR_PENTIUMPRO, PTA_SSE | PTA_SSE2 |
                                         PTA_MMX | PTA_PREFETCH_SSE},
--- 893,895 ----
        {"pentium3", PROCESSOR_PENTIUMPRO, PTA_MMX | PTA_SSE | PTA_PREFETCH_SSE},
!       {"pentium4", PROCESSOR_PENTIUM4, PTA_SSE | PTA_SSE2 |
                                         PTA_MMX | PTA_PREFETCH_SSE},

> Just out of curiousity, have you tried using -mfpmath=sse? I remember someone
> on this list claiming that the SSE fpa-code works much better than the i387
> code which is used by default (even with -march=pentium4).

It seems to be equally fast with the whetstone benchmark, but it makes sse2
slower because most sse2 rules depend on i387 math.
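
To make the argument about the alias table concrete, here is a minimal sketch in C. It is not GCC's code: the types, the F_* bit values and the helper names (lookup_march, apply_mno_sse2, same_target) are all hypothetical stand-ins, and the only thing taken from the diff is which cost model and which feature flags each -march name gets after the patch. Under those assumptions, pentium4 with SSE2 switched off ends up with the same cost model and feature set as pentium3.

```c
#include <string.h>

/* Hypothetical, simplified stand-ins; NOT GCC's real definitions. */
enum cost_model { COST_PENTIUMPRO, COST_PENTIUM4 };

enum { F_MMX = 1, F_SSE = 2, F_SSE2 = 4, F_PREFETCH_SSE = 8 };

struct target {
    enum cost_model cost;   /* which cost table drives instruction choice */
    unsigned features;      /* bitmask of F_* flags */
};

struct alias {
    const char *name;       /* the value given to -march= */
    struct target t;
};

/* Mirrors the two table entries from the diff, in their patched form. */
static const struct alias patched_table[] = {
    {"pentium3", {COST_PENTIUMPRO, F_MMX | F_SSE | F_PREFETCH_SSE}},
    {"pentium4", {COST_PENTIUMPRO, F_SSE | F_SSE2 | F_MMX | F_PREFETCH_SSE}},
};

/* Look up -march=<name>; returns 0 on success, -1 if the name is unknown. */
static int lookup_march(const char *name, struct target *out)
{
    size_t i;
    for (i = 0; i < sizeof patched_table / sizeof patched_table[0]; i++) {
        if (strcmp(patched_table[i].name, name) == 0) {
            *out = patched_table[i].t;
            return 0;
        }
    }
    return -1;
}

/* Models -mno-sse2 by clearing the SSE2 feature bit. */
static void apply_mno_sse2(struct target *t)
{
    t->features &= ~(unsigned)F_SSE2;
}

/* In this model two targets are equivalent if cost model and features match. */
static int same_target(const struct target *a, const struct target *b)
{
    return a->cost == b->cost && a->features == b->features;
}
```

Looking up "pentium3" and "pentium4", then calling apply_mno_sse2 on the pentium4 target, makes same_target report a match. Before that call the two differ only in the SSE2 bit.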

here are some results after the cost patch above:

-march=pentiumpro
whetstone took: 1.05 secs for 954 MFLOPS (w/ math lib)
whetstone took: 0.28 secs for 3555 MFLOPS (w/o math lib)

-march=pentium3
whetstone took: 1.05 secs for 954 MFLOPS (w/ math lib)
whetstone took: 0.28 secs for 3556 MFLOPS (w/o math lib)

-march=pentium3 -mfpmath=sse
whetstone took: 1.05 secs for 953 MFLOPS (w/ math lib)
whetstone took: 0.28 secs for 3555 MFLOPS (w/o math lib)

-march=pentium4
whetstone took: 1.06 secs for 942 MFLOPS (w/ math lib)
whetstone took: 0.29 secs for 3393 MFLOPS (w/o math lib)

-march=pentium4 -mno-sse2 (after the patch this should be the same as pentium3)
whetstone took: 1.05 secs for 954 MFLOPS (w/ math lib)
whetstone took: 0.28 secs for 3555 MFLOPS (w/o math lib)

-march=pentium4 -mfpmath=sse
whetstone took: 1.14 secs for 880 MFLOPS (w/ math lib)
whetstone took: 0.36 secs for 2768 MFLOPS (w/o math lib)

till

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-current" in the body of the message
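
For a rough sense of scale, the MFLOPS figures above can be turned into relative slowdowns against the -march=pentiumpro run. The sketch below is only arithmetic on the numbers reported in this message. The struct layout and the helper names (slowdown_percent, slowest_run) are made up for illustration.

```c
#include <stddef.h>

/* One benchmark run as reported above; the field names are hypothetical. */
struct run {
    const char *flags;
    double mflops_libm;    /* "w/ math lib" figure */
    double mflops_nolibm;  /* "w/o math lib" figure */
};

/* Figures copied from the whetstone results in this message. */
static const struct run runs[] = {
    {"-march=pentiumpro",            954.0, 3555.0},
    {"-march=pentium3",              954.0, 3556.0},
    {"-march=pentium3 -mfpmath=sse", 953.0, 3555.0},
    {"-march=pentium4",              942.0, 3393.0},
    {"-march=pentium4 -mno-sse2",    954.0, 3555.0},
    {"-march=pentium4 -mfpmath=sse", 880.0, 2768.0},
};

#define NRUNS (sizeof runs / sizeof runs[0])

/* Slowdown of `mflops` relative to `baseline`, in percent.
   A positive result means slower than the baseline. */
static double slowdown_percent(double baseline, double mflops)
{
    return (1.0 - mflops / baseline) * 100.0;
}

/* Index of the run with the lowest "w/o math lib" rate. */
static size_t slowest_run(void)
{
    size_t i, worst = 0;
    for (i = 1; i < NRUNS; i++) {
        if (runs[i].mflops_nolibm < runs[worst].mflops_nolibm)
            worst = i;
    }
    return worst;
}
```

With runs[0] as the baseline, the w/o math lib figure for -march=pentium4 comes out roughly 4.6% slower and -march=pentium4 -mfpmath=sse roughly 22% slower. The other runs are within a fraction of a percent of the baseline.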