FreeBSD Mail Archives

Date:      Tue, 22 Jul 2003 09:30:07 -0700
From:      David Schultz <das@FreeBSD.ORG>
To:        Poul-Henning Kamp <phk@phk.freebsd.dk>
Cc:        Bruce Evans <bde@zeta.org.au>
Subject:   Re: cvs commit: src/sys/dev/lnc if_lnc.c
Message-ID:  <20030722163007.GA6080@HAL9000.homeunix.com>
In-Reply-To: <11951.1058884091@critter.freebsd.dk>
References:  <20030722235600.X8165@gamplex.bde.org> <11951.1058884091@critter.freebsd.dk>

On Tue, Jul 22, 2003, Poul-Henning Kamp wrote:
> In message <20030722235600.X8165@gamplex.bde.org>, Bruce Evans writes:
> 
> >Several places, including if_lnc.c, used __inline to get cleaner code
> >at no cost in performance.  Removing __inline adds a tiny cost.
> 
> You know, I would have agreed with you on that, if we were talking
> about a CPU with no caches, no branch-prediction and no prefetching.
> 
> For modern CPUs however it would be hard to prove the above statement
> true or false with any sort of measurement setup we have access to,
> and it would be even harder if not downright impossible to prove
> that the result was generally applicable to a majority of the current
> hardware on the market.

gcc is a bad model to follow here.  It disables inlining to
salvage compile time where its register allocation algorithms
don't scale.  And since it's disabling the inlining all by itself,
there's no need to bend over backwards to appease it.

The cost of not inlining is insignificant on i386, but it can be
very significant on architectures with sliding register windows,
such as sparc64.  On sparc64, a window overflow or underflow trap
is taken every time the procedure call depth changes by more than
about 6 (the exact number depends on how many register windows
the CPU implements).  Inlining a single call can mean the
difference between super-efficient procedure calls and having to
slide the register window back and forth at great penalty.
This problem wouldn't be so bad if we had a production-quality
sparc64 compiler that did aggressive tail-call elimination, but for
the moment we're stuck with gcc.
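
To make the register window point concrete, here is a toy model in
C.  It is only a sketch: the function name count_window_traps() and
the "resident" parameter are made up for illustration, and it assumes
each trap slides the set of resident windows by exactly one frame.
Real trap handlers and window counts differ by CPU and kernel.

```c
#include <stddef.h>

/*
 * Toy model (assumption, not real hardware behaviour): "resident"
 * frames fit in the register file at once.  Calling past the deepest
 * resident frame or returning past the shallowest one costs one trap
 * and slides the resident range by one frame.
 */
size_t
count_window_traps(const int *depth_trace, size_t n, int resident)
{
	int lo = 0, hi = resident - 1;	/* resident depth range */
	int depth = 0;
	size_t traps = 0;

	for (size_t i = 0; i < n; i++) {
		int target = depth_trace[i];

		while (depth < target) {	/* procedure calls */
			depth++;
			if (depth > hi) {	/* overflow: spill a window */
				traps++;
				lo++;
				hi++;
			}
		}
		while (depth > target) {	/* procedure returns */
			depth--;
			if (depth < lo) {	/* underflow: fill a window */
				traps++;
				lo--;
				hi--;
			}
		}
	}
	return (traps);
}
```

With resident = 7, a trace that bounces between depth 0 and depth 7
takes a trap at each end of every swing, while the same trace with one
call inlined (depth 0 to 6) takes none in this model.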

Another argument for inlining is that there are some very large
functions where inlining the function allows the compiler to
optimize away most of it after constant propagation.  The savings
can be substantial if you can optimize a test in an inner loop.
For instance, vm_object_backing_scan() takes a mode flag that is a
compile-time constant.  (In this particular case, however, there
may be a significant code size difference that we don't want to
pay for, since the function is used more than once.)
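
A minimal sketch of the constant-propagation case, in C.  The names
here (scan_pages, SCAN_COUNT_ONLY, SCAN_CLEAR and the two wrappers)
are hypothetical and are not taken from vm_object_backing_scan(); the
only point is that each caller passes a compile-time constant mode, so
an inlining compiler can fold the mode test out of the loop.

```c
#include <stddef.h>

/* Hypothetical mode flags, for illustration only. */
#define SCAN_COUNT_ONLY	0x1
#define SCAN_CLEAR	0x2

/*
 * Generic worker.  When this is inlined into a caller that passes a
 * constant mode, the compiler is free to evaluate (mode & SCAN_CLEAR)
 * at compile time and drop the dead branch from the inner loop.
 */
static inline size_t
scan_pages(int *pages, size_t n, int mode)
{
	size_t hits = 0;

	for (size_t i = 0; i < n; i++) {
		if (pages[i] == 0)
			continue;
		hits++;
		if (mode & SCAN_CLEAR)
			pages[i] = 0;
	}
	return (hits);
}

/* Each wrapper fixes the mode, giving one specialized copy per use. */
size_t
count_resident(int *pages, size_t n)
{
	return (scan_pages(pages, n, SCAN_COUNT_ONLY));
}

size_t
clear_resident(int *pages, size_t n)
{
	return (scan_pages(pages, n, SCAN_CLEAR));
}
```

The trade-off is the one noted above: every wrapper carries its own
copy of the loop, so the savings in the inner loop are paid for in
code size.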

> I would _really_ love to nail up a policy note which says that "inline"
> should only be used if it is possible to show an effect, either because
> inlining reduces the code size (in this particular case fewer bytes to
> execute is generally an indication of better performance) or by
> showing actual performance improvements from the inlining.

There is reason for concern about cases where inline really is
misused, either because it massively increases code size or
because it is unimportant to performance and detracts from
debuggability.  But I would not like to see a policy that shifts
the burden of proof onto authors of new code.[1]  A policy about
gratuitous sweeps through other people's code, on the other
hand...

[1] In practice, just about any contentious case of inlining is
    going to be a wash anyway, and neither side of the argument
    is entirely without merit.  I'm mostly opposed to a new policy
    on the grounds that it's just another stupid rule, complete
    with technicolor bikesheds, to throw in the faces of people
    trying to do something useful.

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20030722163007.GA6080>
