Date: Tue, 22 Jul 2003 09:30:07 -0700
From: David Schultz <das@FreeBSD.ORG>
To: Poul-Henning Kamp <phk@phk.freebsd.dk>
Cc: Bruce Evans <bde@zeta.org.au>
Subject: Re: cvs commit: src/sys/dev/lnc if_lnc.c
Message-ID: <20030722163007.GA6080@HAL9000.homeunix.com>
In-Reply-To: <11951.1058884091@critter.freebsd.dk>
References: <20030722235600.X8165@gamplex.bde.org> <11951.1058884091@critter.freebsd.dk>

On Tue, Jul 22, 2003, Poul-Henning Kamp wrote:
> In message <20030722235600.X8165@gamplex.bde.org>, Bruce Evans writes:
>
> >Several places, including if_lnc.c, used __inline to get cleaner code
> >at no cost in performance.  Removing __inline adds a tiny cost.
>
> You know, I would have agreed with you on that, if we were talking
> about a CPU with no caches, no branch-prediction and no prefetching.
>
> For modern CPUs however it would be hard to prove the above statement
> true or false with any sort of measurement setup we have access to,
> and it would be even harder if not downright impossible to prove
> that the result was generally applicable to a majority of the current
> hardware on the market.

gcc is a bad model to follow here.  It disables inlining to salvage
compile time where its register allocation algorithms don't scale.
And since it's disabling the inlining all by itself, there's no need
to bend over backwards to appease it.

The cost of not inlining is insignificant on i386, but it can be very
significant on architectures with sliding register windows, such as
sparc64.  On sparc64, a trap is taken every time the procedure call
depth changes by more than 6.  An inlining can mean the difference
between super-efficient procedure calls and having to slide the
register window back and forth at great penalty.  This problem
wouldn't be so bad if we had a production-quality sparc64 compiler
that did aggressive tail-call elimination, but for the moment we're
stuck with gcc.

Another argument for inlining is that there are some very large
functions where inlining the function allows the compiler to optimize
away most of it after constant propagation.  The savings can be
substantial if you can optimize a test in an inner loop.  For
instance, vm_object_backing_scan() takes a mode flag that is a
compile-time constant.
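
To make the constant-propagation point concrete, here is a minimal
sketch.  It is not the real vm_object_backing_scan(); the names scan(),
SCAN_SUM, SCAN_COUNT, sum_all() and count_nonzero() are hypothetical and
exist only for illustration.  The assumption is that the compiler
actually inlines scan() into each caller, which it is free not to do.

```c
#include <stddef.h>

/* Hypothetical mode flags, invented for this example. */
#define SCAN_SUM	0x1
#define SCAN_COUNT	0x2

/*
 * The mode tests sit inside the loop.  If this function is inlined
 * into a caller that passes a constant mode, the compiler can fold
 * each test to true or false and delete the dead branch.
 */
static inline long
scan(const int *v, size_t n, int mode)
{
	long acc = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (mode & SCAN_SUM)
			acc += v[i];
		if (mode & SCAN_COUNT)
			acc += (v[i] != 0);
	}
	return (acc);
}

/* Each wrapper passes a compile-time constant mode. */
long
sum_all(const int *v, size_t n)
{
	return (scan(v, n, SCAN_SUM));
}

long
count_nonzero(const int *v, size_t n)
{
	return (scan(v, n, SCAN_COUNT));
}
```

When inlining happens, each wrapper can end up with a loop containing
only the branch it needs.  The price is one specialized copy of the
loop body per call site, so the win in the inner loop has to be weighed
against the growth in code size.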
(In the case of vm_object_backing_scan(), however, there may be a
significant code size difference that we don't want to pay for, since
the function is used more than once.)

> I would _really_ love to nail up a policy note which says that "inline"
> should only be used if it is possible to show an effect, either because
> inlining reduces the code size (in this particular case less bytes to
> execute is generally an indication of better performance) or by
> showing actual performance improvements from the inlining.

There is reason for concern about cases where inline really is
misused, either because it massively increases code size or because
it is unimportant to performance and detracts from debuggability.
But I would not like to see a policy that shifts the burden of proof
onto authors of new code.[1]  A policy about gratuitous sweeps through
other people's code, on the other hand...

[1] In practice, just about any contentious case of inlining is going
    to be a wash anyway, and neither side of the argument is entirely
    without merit.  I'm mostly opposed to a new policy on the grounds
    that it's just another stupid rule, complete with technicolor
    bikesheds, to throw in the faces of people trying to do something
    useful.