From: "William H. Magill"
Date: Mon, 10 Jan 2005 11:07:52 -0500
To: David Gilbert
cc: freebsd-alpha@freebsd.org
Subject: Re: processor type.
List-Id: Porting FreeBSD to the Alpha

On 10 Jan, 2005, at 08:16, David Gilbert wrote:

> I see in the compiler lines crawling by that gcc is asked to optimize
> for 'EV5' while being compatible with 'EV4'. My Alpha is an EV4 ---
> I'm wondering if I would see better performance with a different flag
> there, but the gcc manual doesn't even acknowledge the existence of
> the options that are in use, let alone the available options.

I'm not a programmer type, but it was pretty well known that GCC
generated anywhere from pretty poor to downright bad code on the
Alphas, no matter which one, when compared to the DEC C compiler
(known as ccc on the Linux tools CD). However, I understand that GCC
picked up (some? many? all?) of the Alpha optimization enhancements
offered by Compaq shortly before the Intel/HP deal.

And I do know from long Tru64 experience that optimizing for a
particular EVx chip can make a big difference.

The man page for the DEC C compiler under 5.1A:

    Compaq C V6.3-028 on Compaq Tru64 UNIX V5.1 (Rev. 732)
    Compiler Driver V6.3-026 (sys) cc Driver

states:

(Note: There are two relevant options for ccc: -arch and -tune.)

-arch option
    Specifies which version of the Alpha architecture to generate
    instructions for. All Alpha processors implement a core set of
    instructions and, in some cases, the following extensions: BWX
    (byte/word-manipulation extension), MVI (multimedia extension),
    FIX (square root and floating-point convert extension), and CIX
    (count extension). (The Alpha Architecture Reference Manual
    describes the extensions in detail.)

    The option specified by the -arch option determines which
    instructions the compiler can generate:

    generic  Generate instructions that are appropriate for all Alpha
             processors. This option is the default.

    host     Generate instructions for the processor that the compiler
             is running on (for example, EV6 instructions on an EV6
             processor).

    ev4,ev5  Generate instructions for the EV4 processor (21064,
             21064A, 21066, and 21068 chips) and EV5 processor (some
             21164 chips). (Note that chip number 21164 is used for
             both EV5 and EV56 processors.)

             Applications compiled with this option will not incur any
             emulation overhead on any Alpha processor.

    ev56     Generate instructions for EV56 processors (some 21164
             chips).

             This option permits the compiler to generate any EV4
             instruction, plus any instructions contained in the BWX
             extension.
             Applications compiled with this option may incur emulation
             overhead on EV4 and EV5 processors.

    ev6      Generate instructions for EV6 processors (21264 chips).

             This option permits the compiler to generate any EV6
             instruction, plus any instructions contained in the
             following extensions: BWX, MVI, and FIX.

             Applications compiled with this option may incur emulation
             overhead on EV4, EV5, EV56, and PCA56 processors.

    ev67     Generate instructions for EV67 processors (21264A chips).

             This option is the same as the ev6 option except that it
             also permits the compiler to generate any instructions
             contained in the CIX extension. If your application uses
             CIX instructions, it may incur emulation overhead on all
             processors that are older than EV67.

    pca56    Generate instructions for PCA56 processors (21164PC
             chips).

             This option permits the compiler to generate any EV4
             instruction, plus any instructions contained in the BWX
             and MVI extensions.

             Applications compiled with this option may incur emulation
             overhead on EV4, EV5, and EV56 processors.

    A program compiled with any of the options will run on any Alpha
    processor. Beginning with DIGITAL UNIX V4.0 and continuing with
    subsequent versions, the operating system kernel includes an
    instruction emulator. This capability allows any Alpha chip to
    execute and produce correct results from Alpha instructions--even
    if some of the instructions are not implemented on the chip.
    Applications using emulated instructions will run correctly, but
    may incur significant emulation overhead at run time.

    The psrinfo -v command can be used to determine which type of
    processor is installed on any given Alpha system.

    Note the following differences between the -arch evx and -tune evx
    options (where x designates a specific processor):

    + -arch evx implies -tune evx, but -tune evx does not imply
      -arch evx.

    + -arch evx can generate unguarded evx-specific instructions.
      If you run that application on a pre-evx processor, those
      instructions may get emulated (and emulated instructions can be
      up to 1000 times slower than actual instructions).

    + -tune evx can generate evx-specific instructions, but those are
      always amask-guarded. That expands the code size but avoids
      instruction emulation.

    + If you want the best performance possible on an evx processor
      and are not concerned about performance on earlier processors,
      the best choice would be -arch evx (which implies -tune evx).

    + If you want good performance on an evx processor but also want
      the application to run reasonably fast on earlier processors,
      the best choice would probably be -tune evx.

===============

-tune option
    Instructs the optimizer to tune the application for a specific
    version of the Alpha hardware. This will not prevent the
    application from running correctly on other versions of Alpha but
    it may run more slowly than generically-tuned code on those
    versions.

    The option argument can be one of the following, which selects
    instruction tuning appropriate for the listed processor(s):

    generic   All Alpha processors. This is the default.

    host      The processor on which the code is compiled.

    ev4       The 21064, 21064A, and 21068 processors.

    ev5,ev56  The 21164 processor. (Both EV5 and EV56 are numbered
              21164.)

    ev6       The 21264 processor.

    ev67      The 21264A processor.

    See also the -arch option for an explanation of the differences
    between -tune and -arch.

T.T.F.N.
William H. Magill
# Beige G3 [Rev A motherboard - 300 MHz 768 Meg] OS X 10.2.8
# Flat-panel iMac (2.1) [800MHz - Super Drive - 768 Meg] OS X 10.3.7
# PWS433a [Alpha 21164 Rev 7.2 (EV56)- 64 Meg] Tru64 5.1a
# XP1000 [Alpha 21264-3 (EV6) - 256 meg] FreeBSD 5.3
# XP1000 [Alpha 21264-A (EV 6.7) - 384 meg] FreeBSD 5.3
magill@mcgillsociety.org
magill@acm.org
magill@mac.com
whmagill@gmail.com
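
For anyone who wants to check the emulation rules quoted from the man page above, here is a rough Python sketch. It is not part of any compiler or of Tru64. The table is transcribed from the -arch descriptions, the names ARCH_EXTENSIONS, emulated_extensions and may_incur_emulation are made up for illustration, and it assumes a processor implements exactly the extensions that its same-named -arch target is allowed to use. The generic and host options are left out.

```python
# Extensions each -arch target may use, transcribed from the man page
# excerpt above. Assumption: a processor implements exactly the extensions
# listed for its same-named -arch target.
ARCH_EXTENSIONS = {
    "ev4": frozenset(),
    "ev5": frozenset(),
    "ev56": frozenset({"BWX"}),
    "pca56": frozenset({"BWX", "MVI"}),
    "ev6": frozenset({"BWX", "MVI", "FIX"}),
    "ev67": frozenset({"BWX", "MVI", "FIX", "CIX"}),
}


def emulated_extensions(arch_option, host_cpu):
    """Extensions the compiler may have used that host_cpu lacks.

    Instructions from these extensions are the ones the kernel emulator
    would have to handle when the program runs on host_cpu.
    """
    return ARCH_EXTENSIONS[arch_option] - ARCH_EXTENSIONS[host_cpu]


def may_incur_emulation(arch_option, host_cpu):
    """True if code built with -arch arch_option may be emulated on host_cpu."""
    return bool(emulated_extensions(arch_option, host_cpu))


if __name__ == "__main__":
    # The EV4 case from the original question: which -arch choices are
    # risky on an EV4 machine?
    for arch in ("ev5", "ev56", "pca56", "ev6", "ev67"):
        missing = sorted(emulated_extensions(arch, "ev4"))
        print(arch, "on ev4 may emulate:", missing if missing else "nothing")
```

Under that assumption the set difference reproduces the lists in the man page: ev56 code is only at risk on EV4 and EV5, pca56 code on EV4, EV5 and EV56, and ev6 code on everything older including PCA56. It says nothing about -tune, whose evx-specific instructions are amask-guarded and so avoid emulation.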