From owner-freebsd-stable@FreeBSD.ORG  Thu Sep 14 16:45:46 2006
Return-Path: <owner-freebsd-stable@FreeBSD.ORG>
X-Original-To: freebsd-stable@freebsd.org
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 7B8EF16A403
	for <freebsd-stable@freebsd.org>; Thu, 14 Sep 2006 16:45:46 +0000 (UTC)
	(envelope-from cswiger@mac.com)
Received: from smtpout.mac.com (smtpout.mac.com [17.250.248.182])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 10F9F43D6B
	for <freebsd-stable@freebsd.org>; Thu, 14 Sep 2006 16:45:46 +0000 (GMT)
	(envelope-from cswiger@mac.com)
Received: from mac.com (smtpin08-en2 [10.13.10.153])
	by smtpout.mac.com (Xserve/8.12.11/smtpout12/MantshX 4.0) with ESMTP id
	k8EGjj2s017395; Thu, 14 Sep 2006 09:45:45 -0700 (PDT)
Received: from [17.214.13.96] (a17-214-13-96.apple.com [17.214.13.96])
	(authenticated bits=0)
	by mac.com (Xserve/smtpin08/MantshX 4.0) with ESMTP id k8EGjho9014519; 
	Thu, 14 Sep 2006 09:45:44 -0700 (PDT)
In-Reply-To: <20060914044241.GA92358@thought.org>
References: <200609130905.k8D95idk062789@lurza.secnetix.de>
	<4507CC9B.60704@sun-fish.com> <20060913234934.GA92067@thought.org>
	<0B8BF03E-8F4A-4279-850B-2EA7FF5E1B89@mac.com>
	<20060914044241.GA92358@thought.org>
Mime-Version: 1.0 (Apple Message framework v752.2)
Content-Type: text/plain; charset=US-ASCII; delsp=yes; format=flowed
Message-Id: <549CD0AC-2CED-4112-B708-5F4FB1DA69D2@mac.com>
Content-Transfer-Encoding: 7bit
From: Chuck Swiger <cswiger@mac.com>
Date: Thu, 14 Sep 2006 09:45:42 -0700
To: Gary Kline <kline@sage.thought.org>
X-Mailer: Apple Mail (2.752.2)
X-Brightmail-Tracker: AAAAAQAAA+k=
X-Language-Identified: TRUE
Cc: freebsd-stable <freebsd-stable@freebsd.org>
Subject: Re: optimization levels for 6-STABLE build{kernel,world}
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code <freebsd-stable.freebsd.org>
X-List-Received-Date: Thu, 14 Sep 2006 16:45:46 -0000

On Sep 13, 2006, at 9:42 PM, Gary Kline wrote:
>> -funroll-loops is as likely to decrease performance for a particular
>> program as it is to help.
>
> 	Isn't the compiler intelligent enough to have a reasonable
> 	limit, N, of the loops it will unroll to ensure a faster runtime?
> 	Something much less than 1000, say; possibly less than 100.

Of course; in fact, N is probably closer to 4 or 8 than it is to 100.
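
As a rough illustration of what unrolling by a small factor looks like,
here is a hand-written sketch in C.  The function names sum_rolled() and
sum_unrolled4() are made up for this example, and this is not what gcc
literally emits; it only shows the shape of the transformation and why
the unrolled version takes more code space:

```c
#include <assert.h>
#include <stddef.h>

/* Straightforward loop: one add and one loop test per element. */
long sum_rolled(const int *a, size_t n)
{
    assert(a != NULL || n == 0);
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += a[i];
    return total;
}

/*
 * Hand-unrolled by 4: one loop test per four elements, at the cost of
 * a larger loop body plus a cleanup loop for the leftover elements.
 */
long sum_unrolled4(const int *a, size_t n)
{
    assert(a != NULL || n == 0);
    long total = 0;
    size_t i = 0;

    /* Main body: runs while at least four elements remain. */
    for (; n - i >= 4; i += 4)
        total += (long)a[i] + a[i + 1] + a[i + 2] + a[i + 3];

    /* Cleanup: handles the remaining 0 to 3 elements. */
    for (; i < n; i++)
        total += a[i];

    return total;
}
```

Both functions return the same sum; whether the second one is actually
faster depends on the CPU, its caches, and how many iterations the loop
really runs, so measuring is the only reliable guide.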

> 	At least, if the initialization and end-loop code *plus* the
> 	loop code itself were too large for the cache, my thought is that
> 	gcc would back out.

Unless you've indicated that the compiler should target a specific  
CPU architecture, there is no way for it to know whether the size of  
the L1 cache on the machine doing the compile is the same as, or even  
similar to, that of the system where the code will run.

> 	I may be giving RMS too much credit; but
> 	if memory serves, the compiler was GNU's first project.  And
> 	Stallman was into GOFAI, &c, for better/worse.[1]  Anyway, for now
> 	I'll comment out the unroll-loops arg.

cd /usr/src/contrib/gcc && grep Stallman ChangeLog

...returns no results.  A tool I wrote suggests:

% histogram.py -F'  ' -f 2,3 -p @ -c 10 ChangeLog
61 Kazu Hirata <kazu@cs.umass.edu>
51 Eric Botcazou <ebotcazou@libertysurf.fr>
48 Jan Hubicka <jh@suse.cz>
39 Richard Sandiford <rsandifo@redhat.com>
37 Alan Modra <amodra@bigpond.net.au>
30 Richard Henderson <rth@redhat.com>
29 Joseph S. Myers <jsm@polyomino.org.uk>
27 Jakub Jelinek <jakub@redhat.com>
25 Zack Weinberg <zack@codesourcery.com>
22 Mark Mitchell <mark@codesourcery.com>
20 John David Anglin <dave.anglin@nrc-cnrc.gc.ca>
20 Ulrich Weigand <uweigand@de.ibm.com>
17 Rainer Orth <ro@TechFak.Uni-Bielefeld.DE>
16 Kelley Cook <kcook@gcc.gnu.org>
16 Roger Sayle <roger@eyesopen.com>
13 David Edelsohn <edelsohn@gnu.org>
12 Aldy Hernandez <aldyh@redhat.com>
11 Stephane Carrez <stcarrez@nerim.fr>
11 Ian Lance Taylor <ian@wasabisystems.com>
10 Andrew Pinski <pinskia@physics.uc.edu>
10 Kaz Kojima <kkojima@gcc.gnu.org>
10 James E Wilson <wilson@specifixinc.com>


>> A safe optimizer must assume that an arbitrary assignment via a
>> pointer dereference can change any value in memory, which means that
>> you have to spill and reload any data being cached in CPU registers
>> around the use of the pointer, except for const's, variables declared
>> as "register", and possibly function arguments being passed via
>> registers and not on the stack (cf "register windows" on the SPARC
>> hardware, or HP/PA's calling conventions).
> 	
> 	Well, I'd added the no-strict-aliasing flag to make.conf!
> 	Pointers give me indigestion ... even after all these years.
> 	Thanks for your insights.  And the URL.

You're welcome.
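
To make the aliasing point quoted above concrete, here is a minimal
sketch in C.  The names scale_reload() and scale_cached() are
hypothetical, chosen only for this example.  Because dst and factor are
both pointers to int, the compiler has to assume they may refer to the
same memory, so it cannot keep *factor in a register across the stores
in the first function:

```c
#include <assert.h>
#include <stddef.h>

/*
 * dst[i] and *factor may overlap, so *factor has to be re-read from
 * memory after every store through dst.
 */
void scale_reload(int *dst, const int *factor, size_t n)
{
    assert(dst != NULL && factor != NULL);
    for (size_t i = 0; i < n; i++)
        dst[i] *= *factor;
}

/*
 * The programmer copies the factor into a local first.  Nothing can
 * point at f, so it can stay in a register for the whole loop, but the
 * result differs from scale_reload() whenever factor points into dst.
 */
void scale_cached(int *dst, const int *factor, size_t n)
{
    assert(dst != NULL && factor != NULL);
    const int f = *factor;
    for (size_t i = 0; i < n; i++)
        dst[i] *= f;
}
```

With int v[] = {2, 3, 4} and factor pointing at v[0], scale_reload()
leaves {4, 12, 16} while scale_cached() leaves {4, 6, 8}, which is why
the compiler is not free to turn the first function into the second on
its own.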

> 	gary
>
> [1]. Seems to me that "good old-fashioned AI" techniques would work in
>      something like a compiler where you probably have a good idea of
>      most heuristics.   -gk

Of course.  With -O or -O2, the compiler enables those optimizations  
which are almost certain to improve performance and code size.  
Potential optimizations which are not helpful on average are left  
disabled by default until the compiler can identify, at compile time,  
the situations where they are known to be useful.

Using non-default optimization options isn't like discovering buried  
treasure that nobody else was aware of; the options aren't enabled by  
default for good reason(s), usually because the tradeoffs they make  
aren't helpful in general (yet), or because their usage has known  
bugs which result in faulty executables being produced.

-- 
-Chuck