Date: Wed, 25 Feb 2004 11:25:56 +0200
From: Petri Helenius <pete@he.iki.fi>
To: Peter Jeremy <peter.jeremy@alcatel.com.au>
Cc: freebsd-alpha@freebsd.org
Subject: Re: Bad performance on alpha? (make buildworld)
Message-ID: <403C6A24.80804@he.iki.fi>
In-Reply-To: <20040225025953.GH10121@gsmx07.alcatel.com.au>
References: <20040223192103.59ad7b69.lehmann@ans-netz.de> <20040224202652.GA13675@diogenis.ceid.upatras.gr> <5410C982-6730-11D8-8D4C-003065ABFD92@mac.com> <20040225025953.GH10121@gsmx07.alcatel.com.au>
Peter Jeremy wrote:

>Recent iA32 implementations (basically anything more recent than a
>PII) are RISC cores which directly execute a subset of the iA32
>instruction set with the remainder handled by microcode.  You get
>quite respectable results by treating it as a load/store RISC
>architecture and relying on the L1 cache to handle the register spills
>

This probably invites the question: for people like me who are
interested in getting the maximum performance out of whatever hardware
our code runs on (maybe with the exception of the low-MHz embedded
stuff :-), are there any good tutorials or books on what kinds of
things to avoid when looking for optimal performance?

Our tightest loops mostly do counter rolling, comparisons and pattern
matching, and we have had good mileage getting performance gains by
minimizing writes to memory when there are other options, like doing
arithmetic on the fly.

One specific question that also comes to mind: in more modern,
SSE-enabled code, is there a benefit to exercising floating point in
balance with 64-bit long long integers, or does that gain performance
only if the code is compiled without SSE?

Pete
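To make the "minimize writes to memory, do arithmetic on the fly" remark concrete, here is a minimal sketch of one way such a counting loop might look. The function names and the byte-matching task are hypothetical illustrations, not code from this thread. The first version updates a counter through a pointer on every match; because an unsigned char pointer may alias any object, the compiler generally has to keep that counter in memory. The second keeps the count in a local variable and adds the result of the comparison directly, so nothing is written inside the loop.

```c
#include <stddef.h>

/* Hypothetical baseline: the counter lives in memory and is updated
 * through a pointer each time a byte matches. */
void count_matches_store(const unsigned char *buf, size_t len,
                         unsigned char needle, size_t *count)
{
    *count = 0;
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == needle)
            (*count)++;     /* read-modify-write on memory per match */
    }
}

/* Hypothetical alternative: the counter is a local that can stay in a
 * register, and the comparison result (0 or 1) is added directly, so
 * the loop body contains no store and no data-dependent branch. */
size_t count_matches_local(const unsigned char *buf, size_t len,
                           unsigned char needle)
{
    size_t count = 0;
    for (size_t i = 0; i < len; i++)
        count += (buf[i] == needle);
    return count;           /* caller stores the result once, if at all */
}
```

Whether the second form is actually faster depends on the compiler, the optimization flags and the CPU, so it is worth measuring on the target machine rather than assuming.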