From owner-freebsd-smp Thu May 30 16:27:02 1996
Return-Path: owner-smp
Received: (from root@localhost) by freefall.freebsd.org (8.7.5/8.7.3) id QAA14693 for smp-outgoing; Thu, 30 May 1996 16:27:02 -0700 (PDT)
Received: from uruk.org (uruk.org [198.145.95.253]) by freefall.freebsd.org (8.7.5/8.7.3) with ESMTP id QAA14681 for ; Thu, 30 May 1996 16:26:52 -0700 (PDT)
From: erich@uruk.org
Received: from loopback (loopback [127.0.0.1]) by uruk.org (8.7.4/8.7.3) with SMTP id QAA08265; Thu, 30 May 1996 16:27:24 -0700 (PDT)
Message-Id: <199605302327.QAA08265@uruk.org>
X-Authentication-Warning: uruk.org: Host loopback [127.0.0.1] didn't use HELO protocol
To: Poul-Henning Kamp
cc: freebsd-smp@freebsd.org
Subject: Re: How do you get the SMP code
In-reply-to: Your message of "Thu, 30 May 1996 20:28:41 -0000." <777.833488121@critter.tfs.com>
Date: Thu, 30 May 1996 16:27:24 -0700
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

Poul-Henning Kamp writes:

> > You have to do at least one level of pointer indirection for indexing
> > which CPU you're using anyway, so this is mostly a redesign of how the
> > mapping works.
> I agree mostly to this, I just want to make sure that we don't
> overengineer it.

You're correct that overengineering is generally bad... but at this
point in the game, making it flexible and robust (i.e. working :-) is
much more important than performance.  If it is a problem, we could
rip it out later easily enough.

> > > Nasty question time: Can part of the local apic registers be used
> > > as "per-cpu registers" without too much performance penalty ?
> >
> > It is particular to each processor.  On the Pentium and Pentium Pro,
> > accesses are part of the external bus and L2 DCU path respectively.
> > I.e. I think it is slower than the L1.
> but faster than doing too much arithmetic on the apic_id.
> My point being
> that any attempt to find the per-cpu data starts out with trying to read
> the APIC_ID, so we might as well cache a pointer in the APIC and read
> that instead...

You generally do NOT have extra registers in the local APIC.  If you're
going to use it a lot, load the APIC id once in a routine (or for that
matter, index a "logical CPU number" and keep it around instead).

> I thought about the fact that we don't use %[gf]s in the kernel quite a
> bit, one could make a segment per cpu and have the CPU's differ only in
> %gs's contents.  That way we just need to set %gs on entry to the kernel
> (in trap/syscall/irq &c) and everything is (moderate) downhill from there,
> with the footnote that we have no way of explaining to CC that it should
> use the "gs:" prefix, so a lot of ugly inline assembler is needed for it.

Possibly.  Complexity might make it painful.  One problem is that in
general, prefixing with a segment slows things down somewhat.  Though,
if you don't change it much (or at all), it might be OK.

> > > I would really love to have one or two 32bit registers local per CPU
> > > to speed up all this stuff...
> >
> > Yes, yes.  We've heard that quite a bit.
> Oh, and while you're at it: add a nano-second clock, it doesn't have to
> have nano-sec increments, just units of nano-secs.  And if you have
> more space on your silicon, we have more ideas as well :-)

On both Pentium and Pentium Pro, you have the time-stamp counter.  The
only problem with this is during periods when your clock input is slowed
down or halted, such as for power-down... but any other clock would
probably suffer the same fate.

> > FWIW: GCC has a lot of room for improvement... the "Pentium GCC" work
> > showed that just taking advantage of some x86-isms can get you
> > BOTH 10-30% denser code that's also 10-30% faster.
>
> Oh, sure!
>
> But performance by design is even better :-)
>
> If I can design it so that I only have one level of indirection, then
> any version of any compiler will do better :-)

The x86 line has always been compatibility first, performance second.
Some of them are still pretty fast... but for the next several years,
a hot-shot compiler is the best way to go.  Someday they might add more
registers/64-bit mode, etc. to the x86, but I'm not sure I'd hold my
breath.

--
  Erich Stefan Boleyn                 \_ E-mail (preferred):
Mad Genius wanna-be, CyberMuffin        \__      (finger me for other stats)
Web:  http://www.uruk.org/~erich/     Motto: "I'll live forever or die trying"
  This is my home system, so I'm speaking only for myself, not for Intel.
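
To illustrate the suggestion in the message above (load the APIC id once, or map it to a "logical CPU number" and keep that around), here is a minimal sketch in C.  It is not code from the FreeBSD SMP tree.  The names struct percpu, cpu_register(), curcpu() and read_apic_id(), and the table sizes, are all hypothetical, and the APIC read is replaced by a settable stub so the sketch can run in user space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_APIC_ID 16  /* assumption: small APIC id space, for illustration */
#define MAX_CPUS     4  /* assumption: upper bound on CPUs in this sketch */

/* Hypothetical per-CPU record; the fields are illustrative only. */
struct percpu {
    int      logical_id;  /* dense index 0..ncpus-1 */
    uint32_t ticks;       /* example of CPU-private state */
};

static struct percpu  cpu_data[MAX_CPUS];
static struct percpu *apic_to_cpu[MAX_APIC_ID];  /* sparse APIC id -> record */
static int            ncpus;

/*
 * Stand-in for reading the local APIC id register.  A real kernel would
 * do a memory-mapped read here; this stub just returns a settable value.
 */
static unsigned fake_apic_id;

static unsigned read_apic_id(void)
{
    return fake_apic_id;
}

/* Called once per CPU at startup: hand out the next logical CPU number. */
static struct percpu *cpu_register(unsigned apic_id)
{
    struct percpu *pc;

    if (apic_id >= MAX_APIC_ID)
        return NULL;
    if (apic_to_cpu[apic_id] != NULL)   /* already registered */
        return apic_to_cpu[apic_id];
    if (ncpus >= MAX_CPUS)
        return NULL;

    pc = &cpu_data[ncpus];
    pc->logical_id = ncpus++;
    pc->ticks = 0;
    apic_to_cpu[apic_id] = pc;
    return pc;
}

/*
 * Read the APIC id once and turn it into a pointer.  A caller is expected
 * to keep the returned pointer around instead of calling this repeatedly.
 */
static struct percpu *curcpu(void)
{
    unsigned id = read_apic_id();

    return id < MAX_APIC_ID ? apic_to_cpu[id] : NULL;
}
```

A routine would call curcpu() once on entry and then work through the returned pointer, so the cost after the initial APIC read is the single level of indirection discussed in the thread.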