From owner-freebsd-smp Thu May 30 13:28:50 1996
Return-Path: owner-smp
Received: (from root@localhost) by freefall.freebsd.org (8.7.5/8.7.3) id NAA20535 for smp-outgoing; Thu, 30 May 1996 13:28:50 -0700 (PDT)
Received: from tfs.com (tfs.com [140.145.250.1]) by freefall.freebsd.org (8.7.5/8.7.3) with SMTP id NAA20521 for ; Thu, 30 May 1996 13:28:45 -0700 (PDT)
Received: from critter.tfs.com by tfs.com (smail3.1.28.1) with SMTP id m0uPEKq-0003wnC; Thu, 30 May 96 13:28 PDT
Received: from critter.tfs.com (localhost [127.0.0.1]) by critter.tfs.com (8.7.5/8.6.12) with ESMTP id UAA00779; Thu, 30 May 1996 20:28:42 GMT
To: erich@uruk.org
cc: freebsd-smp@freebsd.org
Subject: Re: How do you get the SMP code
In-reply-to: Your message of "Thu, 30 May 1996 12:05:31 MST." <199605301905.MAA07776@uruk.org>
Date: Thu, 30 May 1996 20:28:41 +0000
Message-ID: <777.833488121@critter.tfs.com>
From: Poul-Henning Kamp
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

> You have to do at least one level of pointer indirection for indexing
> which CPU you're using anyway, so this is mostly a redesign of how the
> mapping works.

I mostly agree with this; I just want to make sure that we don't
overengineer it.

> > Nasty question time: Can part of the local apic registers be used
> > as "per-cpu registers" without too much performance penalty ?
>
> It is particular to each processor. On the Pentium and Pentium Pro,
> accesses are part of the external bus and L2 DCU path respectively.
> I.e. I think it is slower than the L1.

but faster than doing too much arithmetic on the apic_id.

My point being that any attempt to find the per-cpu data starts out
with trying to read the APIC_ID, so we might as well cache a pointer
in the APIC and read that instead...

I have thought quite a bit about the fact that we don't use %[gf]s in
the kernel: one could make a segment per CPU and have the CPUs differ
only in the contents of %gs.
That way we just need to set %gs on entry to the kernel (in
trap/syscall/irq &c) and everything is (moderately) downhill from
there, with the footnote that we have no way of explaining to CC that
it should use the "gs:" prefix, so a lot of ugly inline assembler is
needed for it.

> > I would really love to have one or two 32bit registers local per CPU
> > to speed up all this stuff...
>
> Yes, yes. We've heard that quite a bit.

Oh, and while you're at it: add a nano-second clock. It doesn't have
to have nano-sec increments, just units of nano-secs.

And if you have more space on your silicon, we have more ideas as
well :-)

> FWIW: GCC has a lot of room for improvement... the "Pentium GCC" work
> showed that just taking advantage of some x86-isms can get you
> BOTH 10-30% denser code that's also 10-30% faster.

Oh, sure! But performance by design is even better :-)

If I can design it so that I only have one level of indirection, then
any version of any compiler will do better :-)

--
Poul-Henning Kamp           | phk@FreeBSD.ORG       FreeBSD Core-team.
http://www.freebsd.org/~phk | phk@login.dknet.dk    Private mailbox.
whois: [PHK]                | phk@ref.tfs.com       TRW Financial Systems, Inc.
Future will arrive by its own means, progress not so.
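
[Illustration, not part of the original message.] The two lookup schemes
discussed above can be compared in a small user-space sketch. Everything
here is an assumption made for illustration: `struct pcpu`, `pcpu_table`,
`MAXCPU`, the made-up APIC ID values, and the `sim_*` variables that stand
in for hardware state (the APIC ID register, and a per-CPU location such as
a spare APIC register or a %gs-based segment). It is not the FreeBSD
implementation and touches no real hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAXCPU 4                 /* hypothetical limit for this sketch */

/* Hypothetical per-CPU data block; field names are illustrative only. */
struct pcpu {
    uint32_t cpu_index;          /* dense index 0..MAXCPU-1 */
    uint32_t apic_id;            /* hardware ID, not necessarily dense */
    void    *curproc;            /* placeholder for per-CPU state */
};

static struct pcpu pcpu_table[MAXCPU];

/* Stand-ins for hardware state of the CPU we pretend to be running on. */
static uint32_t     sim_apic_id;      /* what reading the APIC ID returns */
static struct pcpu *sim_cpu_private;  /* pointer cached in a per-CPU slot */

/* Fill the table with made-up, sparse APIC IDs (1, 5, 9, 13). */
static void pcpu_init(void)
{
    for (size_t i = 0; i < MAXCPU; i++) {
        pcpu_table[i].cpu_index = (uint32_t)i;
        pcpu_table[i].apic_id   = (uint32_t)(i * 4 + 1);
        pcpu_table[i].curproc   = NULL;
    }
}

/* Simulate kernel entry on CPU idx: this is where %gs (or the cached
 * pointer) would be set up once, in trap/syscall/irq entry. */
static void sim_enter_on_cpu(size_t idx)
{
    sim_apic_id     = pcpu_table[idx].apic_id;
    sim_cpu_private = &pcpu_table[idx];
}

/* Scheme A: read the APIC ID, then translate it into a table slot. */
static struct pcpu *pcpu_by_apic_id(void)
{
    uint32_t id = sim_apic_id;
    for (size_t i = 0; i < MAXCPU; i++)
        if (pcpu_table[i].apic_id == id)
            return &pcpu_table[i];
    return NULL;                 /* unknown APIC ID */
}

/* Scheme B: a single load of the pointer cached per CPU. */
static struct pcpu *pcpu_by_cached_pointer(void)
{
    return sim_cpu_private;
}
```

Both functions return the same block after `pcpu_init()` and
`sim_enter_on_cpu(n)`; the difference is that scheme B needs one level of
indirection and no arithmetic or search on the ID, which is the design
point argued for above.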