From owner-freebsd-hackers@FreeBSD.ORG Tue Jun 26 20:25:29 2007
Date: Tue, 26 Jun 2007 15:50:31 -0400
From: Martin Cracauer
To: Martin Turgeon
Cc: freebsd-hackers@freebsd.org, freebsd-amd64@freebsd.org
Subject: Re: Which CPUTYPE for a dualcore Xeon on AMD64
Message-ID: <20070626195031.GA29545@cons.org>
References: <467EFF06.6020902@gmail.com>
In-Reply-To: <467EFF06.6020902@gmail.com>
List-Id: Technical Discussions relating to FreeBSD

Martin Turgeon wrote on Sun, Jun 24, 2007 at 07:32:22PM -0400:
> Hi,
>
> I recently installed AMD64 6.2 Release on 2 PowerEdge servers, both with
> dual core Xeon (3070 and 5110).

I extensively benchmarked different compiler options on Xeon 5160 (3.0 GHz Core2) with gcc-4.1.2 and gcc-4.2.
Apart from very minor differences, the best was plain "-O3 -finline-limit=xxx", where the best xxx depended on the code: some code ran faster with 400 and other code with 750 (both beating the default of 600). The inline limit made a bigger difference than most of the other options, and I actually ended up compiling parts of my code with a different inline limit than others.

The result was within a percent of all the highly tuned CPU-specific options like -march=k8 -msse3 -mfpmath=sse -ffast-math, and I went through most iterations. This means that locking your code to one x86_64 implementation, and locking out either AMD or Intel, is not worth the trouble.

Testing was done on gcc-4.1.2 and later partially verified with gcc-4.2. gcc-4.2 was a little slower overall, but the same options gave about the same relative results.

I also tested with Intel's icc 9.0, which didn't even come close to either gcc, even if you were willing to wait 10 times as long for compilation to finish (for inter-object-file optimizations). No inlining limit would bring icc's code size down close to what gcc had, and subsequently performance was bad.

gcc-3.4 was blown out of the water by gcc-4, too.

Martin
-- 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Martin Cracauer http://www.cons.org/cracauer/
FreeBSD - where you want to go, today. http://www.freebsd.org/
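
For anyone who wants to repeat the inline-limit comparison on their own code, the sweep described above could be scripted roughly as follows. This is only a sketch, not the script behind the numbers in this message: the source file name bench.c, the output names bench-400 etc. and all helper names are hypothetical, and only the flags "-O3 -finline-limit=N" and the limits 400, 600 and 750 come from the text.

```python
import subprocess
import time

# Inline limits mentioned in the message: 400, 600 (the default), 750.
INLINE_LIMITS = (400, 600, 750)


def build_command(source, limit, compiler="gcc", output=None):
    """Return the compiler argv for one -finline-limit setting."""
    if output is None:
        output = f"bench-{limit}"  # hypothetical naming scheme
    return [compiler, "-O3", f"-finline-limit={limit}", "-o", output, source]


def time_binary(path, runs=3):
    """Run a binary several times and return the best wall-clock seconds."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run([path], check=True)
        best = min(best, time.perf_counter() - start)
    return best


def sweep(source, limits=INLINE_LIMITS, compiler="gcc"):
    """Compile and time `source` once per limit; return {limit: seconds}."""
    results = {}
    for limit in limits:
        cmd = build_command(source, limit, compiler)
        subprocess.run(cmd, check=True)
        binary = "./" + cmd[cmd.index("-o") + 1]
        results[limit] = time_binary(binary)
    return results


if __name__ == "__main__":
    # Dry run: only print the compile commands. "bench.c" is a placeholder
    # for your own benchmark; call sweep("bench.c") to actually build and time.
    for limit in INLINE_LIMITS:
        print(" ".join(build_command("bench.c", limit)))
```

Taking the best of several runs is one common way to reduce timing noise; since the differences reported above are around a percent, a real comparison would need more runs and a quiet machine.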