From: "John S. Dyson"
Message-Id: <199607142320.SAA04064@dyson.iquest.net>
Subject: Re: Preach it (was Some recent changes to GENERIC)
To: bde@zeta.org.au (Bruce Evans)
Date: Sun, 14 Jul 1996 18:20:43 -0500 (EST)
Cc: sos@FreeBSD.ORG, freebsd-hackers@FreeBSD.ORG, joerg_wunsch@uriah.heep.sax.de, jonny@gaia.coppe.ufrj.br, pjchilds@imforei.apana.org.au
In-Reply-To: <199607142231.IAA08296@godzilla.zeta.org.au> from "Bruce Evans" at Jul 15, 96 08:31:00 am

>
> >If I remove all __inlines in pmap.c, I can save about 3K.  Maybe we
> >should/shouldn't have a "SMALL_KERNEL" option?
>
> It would probably be faster too.  Large inlines are often slower because
> they bust caches.  3K is too much for the 8K combined I&D L1 cache on
> 486's - if the 3K is all executed often then it busts the cache, and if
> it isn't all executed often then inlining (all of) it just wastes space
> when it isn't executed and depletes the caches when it is executed (if a
> function version of it would be in a cache).
>

I usually guess where __inline is applicable, and then benchmark to check
the performance.  Sometimes I also look at the generated code to make sure
it isn't gross or hugely expanded.

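For illustration only (this is not code from pmap.c): a minimal sketch of how a "SMALL_KERNEL"-style option could turn inlining on or off in one place. The PMAP_INLINE macro and the sketch_* names are hypothetical, and the 4K page size is an assumption made just for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed page size for the sketch; a real kernel takes this from its headers. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Hypothetical switch: with -DSMALL_KERNEL the helper below is compiled as
 * an ordinary static function (one shared copy); otherwise the compiler is
 * asked to inline it at each call site (__inline is a GCC keyword).
 */
#ifdef SMALL_KERNEL
#define PMAP_INLINE
#else
#define PMAP_INLINE __inline
#endif

/* Round an address down to the start of its page. */
static PMAP_INLINE uintptr_t
sketch_trunc_page(uintptr_t va)
{
	return va & ~(uintptr_t)(SKETCH_PAGE_SIZE - 1);
}

/* Count the pages touched by the half-open range [start, end). */
size_t
sketch_count_pages(uintptr_t start, uintptr_t end)
{
	uintptr_t first, last;

	assert(start <= end);
	if (start == end)
		return 0;
	first = sketch_trunc_page(start);
	last = sketch_trunc_page(end - 1);
	return (size_t)((last - first) / SKETCH_PAGE_SIZE) + 1;
}
```

Building the same file with and without -DSMALL_KERNEL, comparing the text size reported by size(1), and then rerunning the benchmark is the kind of check described above.
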
My benchmarks are done on a P5-166, so of course my results are not
directly applicable to a 486.  For highest performance, the 486 isn't the
answer anymore anyway.  However, the performance improvement from careful
inlining is about 5% at best (in pmap), measured with lmbench lat_proc.
But with Linux'ers looking at that kind of difference to distinguish the
OSes, I believe that we should be careful to squeeze where we can.

Feel free to suggest otherwise...  I really don't mind guidelines, as long
as they are well thought out.

John