Date:      Mon, 15 Jul 1996 10:44:46 +1000
From:      Bruce Evans <bde@zeta.org.au>
To:        bde@zeta.org.au, toor@dyson.iquest.net
Cc:        freebsd-hackers@FreeBSD.ORG, joerg_wunsch@uriah.heep.sax.de, jonny@gaia.coppe.ufrj.br, pjchilds@imforei.apana.org.au, sos@FreeBSD.ORG
Subject:   Re: Preach it (was Some recent changes to GENERIC)
Message-ID:  <199607150044.KAA12373@godzilla.zeta.org.au>

>> >If I remove all __inlines in pmap.c, I can save about 3K.  Maybe we
>> >should/shouldn't have a "SMALL_KERNEL" option?
>> 
>> It would probably be faster too.  Large inlines are often slower because
>> they bust caches.  3K is too much for the 8K combined I&D L1 cache on
>> ...
>I usually make guesses as to the applicability of __inline, and then
>benchmark to check performance.  Sometimes, I look at the generated

Remember, it's very hard to benchmark this.  Benchmarks tend to test the
unloaded case, and caches work better in the unloaded case :-).

>However, the performance improvement of careful inlining is about 5%
>at best (in pmap) using lmbench lat_proc.  But, with Linux'ers looking
>at that kind of difference to distinguish the OSes, I believe that we
>should be careful to squeeze where we can.  Feel free to suggest
>otherwise...  I really don't mind guidelines, as long as they are
>well thought out.

A simple guideline: don't inline anything that makes a function call,
perhaps even to an inline function.  pmap.c more or less follows this
rule except it calls nested inline functions a lot.  Most of the 3K
seems to be caused by nesting.  Perhaps more functions should be
split up like pmap_allocpte() (to have a small usual case and call
a non-inline function for special cases).
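
The split suggested above (a small inline usual case that calls a
non-inline function for special cases) might look roughly like the
sketch below.  This is not code from pmap.c: struct table,
table_lookup() and table_alloc_slow() are hypothetical names invented
for illustration, and the noinline attribute assumes a GCC-compatible
compiler.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical table of lazily allocated entries (not a pmap structure). */
#define NSLOTS 64

struct table {
	int *slot[NSLOTS];
	unsigned long slow_calls;	/* how often the special case ran */
};

/*
 * Special case: allocate a missing entry.  Kept out of line so that its
 * code exists once instead of being copied into every caller.
 */
#if defined(__GNUC__)
__attribute__((noinline))
#endif
static int *
table_alloc_slow(struct table *t, size_t i)
{
	t->slow_calls++;
	t->slot[i] = calloc(1, sizeof(int));	/* may be NULL on failure */
	return (t->slot[i]);
}

/*
 * Usual case: one load and one test, small enough to inline cheaply.
 * The caller must pass i < NSLOTS.
 */
static inline int *
table_lookup(struct table *t, size_t i)
{
	int *p = t->slot[i];

	if (p != NULL)
		return (p);
	return (table_alloc_slow(t, i));
}
```

Each call site then grows by only the load, the test and one call
instruction, while the allocation code stays in a single place.
Whether this is a net win would still have to be measured.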

Bruce


