Date:      Sun, 14 Jul 1996 18:20:43 -0500 (EST)
From:      "John S. Dyson" <toor@dyson.iquest.net>
To:        bde@zeta.org.au (Bruce Evans)
Cc:        sos@FreeBSD.ORG, freebsd-hackers@FreeBSD.ORG, joerg_wunsch@uriah.heep.sax.de, jonny@gaia.coppe.ufrj.br, pjchilds@imforei.apana.org.au
Subject:   Re: Preach it (was Some recent changes to GENERIC)
Message-ID:  <199607142320.SAA04064@dyson.iquest.net>
In-Reply-To: <199607142231.IAA08296@godzilla.zeta.org.au> from "Bruce Evans" at Jul 15, 96 08:31:00 am

> 
> >If I remove all __inlines in pmap.c, I can save about 3K.  Maybe we
> >should/shouldn't have a "SMALL_KERNEL" option?
> 
> It would probably be faster too.  Large inlines are often slower because
> they bust caches.  3K is too much for the 8K combined I&D L1 cache on
> 486's - if the 3K is all executed often then it busts the cache, and if
> it isn't all executed often then inlining (all of) it just wastes space
> when it isn't executed and depletes the caches when it is executed (if a
> function version of it would be in a cache).
> 
I usually make guesses as to the applicability of __inline, and then
benchmark to check performance.  Sometimes, I look at the generated
code to make sure it doesn't produce gross or hugely expanded code.
My benchmarks are done on a P5-166, so of course my results are not
directly applicable to a 486.  For highest performance, 486 isn't the
answer anymore anyway.

However, the performance improvement of careful inlining is about 5%
at best (in pmap) using lmbench lat_proc.  But, with Linux'ers looking
at that kind of difference to distinguish the OSes, I believe that we
should be careful to squeeze where we can.  Feel free to suggest
otherwise...  I really don't mind guidelines, as long as they are
well thought out.
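
For illustration only, here is a minimal sketch of how a "SMALL_KERNEL"
style option might switch inlining on and off in one place.  The names
PMAP_INLINE, page_index and pages_spanned are hypothetical and are not
taken from pmap.c; only __inline and the SMALL_KERNEL idea come from
this thread.

```c
#include <stdint.h>

/*
 * Hypothetical build option: compile with -DSMALL_KERNEL to prefer
 * smaller code over the (modest) speed gain from inlining.
 */
#ifdef SMALL_KERNEL
#define PMAP_INLINE             /* ordinary static function, one copy */
#else
#define PMAP_INLINE __inline    /* GCC keyword, expanded at call sites */
#endif

#define PAGE_SHIFT 12           /* assumes 4K pages, as on the i386 */

/* Illustrative helper only; not the real pmap.c code. */
static PMAP_INLINE uint32_t
page_index(uint32_t va)
{
	return (va >> PAGE_SHIFT);
}

/* Number of pages touched by the half-open range [start, end). */
uint32_t
pages_spanned(uint32_t start, uint32_t end)
{
	return (page_index(end - 1) - page_index(start) + 1);
}
```

Building the same file with and without -DSMALL_KERNEL and comparing
the object sizes and benchmark numbers would be one way to check
whether a given inline is worth its space.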

John


