Date: Wed, 3 Jul 2002 20:41:03 +1000 (EST)
From: Bruce Evans
To: Garance A Drosihn
Cc: Matthew Dillon, "David O'Brien", FreeBSD current users
Subject: Re: -current results (was something funny with soft updates?)
Message-ID: <20020703201421.B15898-100000@gamplex.bde.org>

On Wed, 3 Jul 2002, Garance A Drosihn wrote:

> At 11:01 PM -0700 7/2/02, Matthew Dillon wrote:
> >     I get just about the same performance for GCC2 as I
> >     do for GCC3 in the tests I've run so far.  It makes
> >     me wonder what the hell GCC3 is burning all that
> >     cpu *on*.
>
> One of the guys here at RPI (dec, actually) claims he got
> buildworld under current to run at more reasonable speeds
> by explicitly setting the CPUTYPE.  I haven't had the time
> to run any experiments with that yet.

I got some improvements in generated code for a microbenchmark by
compiling with -march=.  gcc on i386's now likes to "optimize"
"andb $1,%al" and "testb $1,%al" as "andl $1,%eax" and
"testl $1,%eax", respectively.
This tends to give a large pessimization (50% for the above in a loop)
on at least PentiumPro's and PII's due to a partial register stall.
Compiling with -march=pentium2 regains the original speed on a
Celeron400 at least by zero-extending %eax before using it, but
double-crosses itself by going back to using %al and not actually
using %eax.  Manually changing the code back to use %eax gave a 5%
speedup for the loop relative to the old version.  The manual change
also gave a 5% speedup for an AthlonXP.  AthlonXP's don't have partial
register stalls and all versions generated by gcc gave the same
results (-march=athlon-xp generated the same code as -march=pentium2).

Summary: we can break even on all tested arches with gcc-3 for the
microbenchmark by setting CPUTYPE right.  We can beat gcc-2 by tweaking
the generated code to be what gcc-3 apparently intended.  But I don't
like setting CPUTYPE or using -march, since I want to run the same code
on different (i386-sub-)arches.  I have 2 different ones on active
machines and 3 more on inactive machines.  Releases need to run on even
more arches.

Bruce
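
The microbenchmark itself is not shown in this message. As a rough illustration only, a loop of the following shape is the kind of C source for which a compiler has to choose between a byte-wide test ("testb $1,%al") and a full-register test ("testl $1,%eax"). The function names and the buffer-scanning workload are hypothetical, and which instructions actually come out depends on the gcc version and flags.

```c
#include <stddef.h>

/*
 * Hypothetical sketch, not the benchmark from the message.
 * Counts bytes whose low bit is set.  The value is kept in a byte-sized
 * variable, so the compiler may test it either in a byte register
 * (e.g. %al) or in the containing 32-bit register (e.g. %eax).
 */
size_t count_odd_bytes(const unsigned char *buf, size_t len)
{
    size_t n = 0;

    for (size_t i = 0; i < len; i++) {
        unsigned char c = buf[i];
        if (c & 1)
            n++;
    }
    return n;
}

/*
 * Same result, but the byte is explicitly zero-extended into a 32-bit
 * variable before the test, which is roughly what the hand-edited
 * version described above does at the register level.
 */
size_t count_odd_widened(const unsigned char *buf, size_t len)
{
    size_t n = 0;

    for (size_t i = 0; i < len; i++) {
        unsigned int w = buf[i];    /* zero-extension of the byte */
        n += w & 1u;
    }
    return n;
}
```

To see what a given compiler does with either function, one option is to compile for an i386 target with "cc -O2 -S", once without and once with -march=pentium2, and compare the assembly around the test of the low bit. Timing the two builds still needs a separate driver loop, which is omitted here.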