Date: Sun, 3 Dec 1995 16:05:43 +1100
From: Bruce Evans
Message-Id: <199512030505.QAA09482@godzilla.zeta.org.au>
To: scrappy@hub.org, wollman@lcs.mit.edu
Subject: Re: -O6/-fstrength-reduce for kernel (Was: Re: changes in -current...)
Cc: current@FreeBSD.ORG

>> Also, someone mentioned using -fno-strength-reduce? If I
>> used that, with -O6, would I notice any benefits, or does using
>> -fno-strength-reduce just about take out any benefits to -O6?

>Your Mileage May Vary.

>In the testing I was doing last month on packet send and forwarding
>rates, I found that using any sort of optimization beyond the default
>`-O' resulted in a repeatable 1-3% /decrease/ in maximum packet rate.
>The same was true of `-m486'. I'll be interested to see the results
>if the Pentium enhancements are ever integrated into the main gcc
>line.

>I suspect there may be some funny cache effects going on, but it's
>next to impossible to tell.

The 1%-3% decrease for -O2 is quite likely to be due to bogus strength
reductions.  Strength reduction usually requires more pseudo-registers.
Sometimes (more often on i*86's because there aren't enough real
registers to begin with) the register allocator can't cope.  And because
the i*86 has a fancy scaled index address mode, one common strength
reduction (for array indices) would at best decrease the speed by a tiny
amount.

The decrease for -m486 is probably due to cache effects.
About half of the "optimizations" for -m486 are for alignment in the
text section.  Such alignment mainly wastes the cache for Pentiums.
(In the worst case (usually not in loops) it can introduce up to 15
nop's that get executed.)  gcc-2.7 has -malign-* options that allow you
to control the alignment.

I would be interested in seeing the results if i*86 enhancements are
ever written for the main gcc line :-).

Bruce
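
The array-index strength reduction discussed in the message can be
sketched in C roughly as follows.  This is an illustration written by
hand, not gcc's actual output; the function names fill_indexed and
fill_reduced are hypothetical.  The first function uses one induction
variable and leaves the scaling to the addressing mode; the second
shows what the loop looks like after each scaled index has been
replaced by its own running pointer, which is where the extra
register pressure comes from.

```c
#include <stddef.h>

/*
 * Indexed form.  The only induction variable is i.  On an i*86 the
 * accesses dst[i] and src[i] can each use a scaled-index address
 * (base + i*4 and base + i*2), so the multiply costs nothing extra.
 */
void fill_indexed(int *dst, const short *src, int n)
{
    for (int i = 0; i < n; i++)
        dst[i] = src[i] + i;
}

/*
 * Hand-written equivalent of the strength-reduced loop.  Each scaled
 * index has become a separate pointer that is bumped per iteration.
 * The loop now keeps three values live (i, d, s) instead of one,
 * on top of whatever else the surrounding code needs.
 */
void fill_reduced(int *dst, const short *src, int n)
{
    int *d = dst;
    const short *s = src;

    for (int i = 0; i < n; i++) {
        *d = *s + i;
        d++;
        s++;
    }
}
```

Both functions compute the same result.  Whether a given gcc version
actually transforms the first into something like the second depends
on the options and the surrounding code, so treat this only as a
picture of the trade-off, not as a measurement.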