Date:      Wed, 25 Feb 2004 11:25:56 +0200
From:      Petri Helenius <pete@he.iki.fi>
To:        Peter Jeremy <peter.jeremy@alcatel.com.au>
Cc:        freebsd-alpha@freebsd.org
Subject:   Re: Bad performance on alpha? (make buildworld)
Message-ID:  <403C6A24.80804@he.iki.fi>
In-Reply-To: <20040225025953.GH10121@gsmx07.alcatel.com.au>
References:  <20040223192103.59ad7b69.lehmann@ans-netz.de> <20040224202652.GA13675@diogenis.ceid.upatras.gr> <5410C982-6730-11D8-8D4C-003065ABFD92@mac.com> <20040225025953.GH10121@gsmx07.alcatel.com.au>

Peter Jeremy wrote:

>Recent iA32 implementations (basically anything more recent than a
>PII) are RISC cores which directly execute a subset of the iA32
>instruction set with the remainder handled by microcode.  You get
>quite respectable results by treating it as a load/store RISC
>architecture and relying on the L1 cache to handle the register spills
>  
>
This probably invites a question from people like me, who are interested
in getting the maximum performance out of whatever hardware our things
run on (maybe with the exception of the low-MHz embedded stuff :-): are
there any good tutorials or books on what kinds of things to avoid when
looking for optimal performance? Our tightest loops mostly do counter
rolling, comparisons and pattern matching, and we have gotten good
mileage out of minimizing writes to memory when there are other options,
such as doing the arithmetic on the fly.
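
To make the "minimize writes to memory" point concrete, here is a
minimal sketch of the kind of loop in question. Everything in it
(struct stats, count_in_memory, count_in_register) is a hypothetical
name made up for illustration, not code from any real project. Whether
the second variant is actually faster depends on the compiler and the
CPU, so it would have to be measured.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical counter block, for illustration only. */
struct stats {
    uint64_t matches;
};

/* Variant A: update the counter in memory on every hit.
 * Because buf is an unsigned char pointer, it may alias *st, so the
 * compiler generally cannot keep st->matches in a register here. */
void count_in_memory(const unsigned char *buf, size_t len,
                     unsigned char needle, struct stats *st)
{
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == needle)
            st->matches++;      /* read-modify-write through the pointer */
    }
}

/* Variant B: accumulate in a local variable and store once at the end.
 * The comparison result (0 or 1) is added directly, so the loop body
 * contains arithmetic only and no store. */
void count_in_register(const unsigned char *buf, size_t len,
                       unsigned char needle, struct stats *st)
{
    uint64_t n = 0;

    for (size_t i = 0; i < len; i++)
        n += (buf[i] == needle);

    st->matches += n;           /* single write to memory */
}
```

Both functions produce the same count; the only difference is how
often the counter is written back to memory.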

One specific question that also comes to mind: on more modern,
SSE-enabled code, is there a benefit to exercising floating point in
balance with 64-bit long long integers, or does that gain performance
only if the code is compiled without SSE?

Pete


