Date: Wed, 03 Dec 1997 15:15:01 -0800
From: Joe Eykholt <jre@Ipsilon.COM>
To: Steve Passe <smp@csn.net>
Cc: smp@freebsd.org
Subject: Re: SMP
Message-ID: <3485E7F5.15FB7483@ipsilon.com>
References: <199712031657.JAA08702@Ilsa.StevesCafe.com>

Steve,

You wrote:
> 3: begin the design of "the real thing". My current lock-pushdown attempts
> to co-exist with the splxxx paradigm. I'm pretty much convinced at this
> point that the work I've done in this area is only useful to prove it
> ain't gonna' cut it. We need to design IN DETAIL a shift to a mutex based
> kernel. One obvious question is whether we move both UP and SMP that
> direction, or just SMP. There are many, many other questions to be
> answered.
> It would be nice if we could progress on this issue in a serious manner,
> getting a design wrapped up b4 I have to go back to my real job...

I've been playing with the SMP code a bit. I have some design suggestions
(and some code snippets) that might or might not be useful. Some you've
probably already got lots of ideas about. For what they're worth:

1. I agree that the mutex_lock/unlock based approach would be nice, and
more explicit, and possibly better even in the UP case. Hopefully a
persuasive design document can convince people it's worth the pain.
It should be possible to have spin-type mutexes directly correspond to
the various splxxx routines at first, and then break them up into
finer-grained locks. I'd use the pthread_mutex_lock() interfaces (except
perhaps leave the pthread_ prefix off of the names). The locks would be
initialized with information about which interrupts are automatically
blocked while the lock is held, etc. BTW, I don't think locks should be
allowed to be recursively grabbed.

2. Whether a first-level interrupt handler automatically gets a lock
blocking all similarly-registered (same imask) interrupts or not is an
area to consider. This is analogous to raising CPL before re-enabling.
I'd like to investigate very different interrupt models, where interrupts
are scheduled as separate threads, similar to but not exactly like
Solaris. I think this is possible without massive driver changes.

3. I'd change the CPU-private variables to be inside a structure
(struct cpu).
There should be two structures, actually, one which is portable and one
which is machine-dependent. The portable one can be inside the
machine-dependent one, or vice-versa. This way, per-CPU variables are
easier to identify (e.g. cpl and ipending are per-CPU, and seeing
CPU->cpu_cpl makes this clearer).

4. The per-CPU mapping causes problems with rfork and with multi-threading
in general, and may also hurt context-switching. I prefer the approach of
reserving a segment and a segment register (%gs) to select the per-CPU
structure throughout the kernel. Unfortunately this does require loading
%gs using the local APIC ID on every kernel entry. It does make some
accesses somewhat more expensive ... I'm not sure how much. Solaris does
this (which is what made me think of it). On non-i386 architectures,
there's usually some other register available which the compiler doesn't
use (%g7 on SPARC, %r2 on PowerPC) that can point to the per-CPU or
per-thread data.

Actually, pointing at the per-thread (or per-process until there are
kernel threads) data rather than per-CPU data is better in the long term.
This is because then preemption isn't a problem ... you can be preempted
and moved to another CPU and your per-thread data is still in the same
place. You can find the current CPU pointer through your per-thread data.
Inlining references through the %gs register can actually reduce code
size from the current curproc method.

5. The APIC vectors need to be re-arranged into a priority order so that
interrupts don't need to access the I/O APIC to mask off the interrupt
during handling. (Or maybe there's a better way to do this.) I noticed
that level-sensitive interrupts are getting taken twice, because the
first interrupt only masks with CPL, so after the EOI and sti, the
interrupt is still pending and is taken again. Only the second time does
it get masked in the I/O APIC.
Because the I/O APIC is a central resource, locking it on every interrupt
is unacceptable, so restructuring the interrupts in a priority order and
deferring the EOI until the end seems necessary. I don't completely like
this, so I'm hoping there's a better way.

If you'd like to discuss any of these areas further, I'd be happy to.
I have some code developed around issues #3 and #4, which you could have.

Thanks,
Joe Eykholt
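
To make items 3 and 4 above concrete, here is a minimal sketch in C. It is
not the code mentioned in the message; every name in it (struct md_cpu,
struct kthread, cur_thread, curcpu(), thread_migrate(), NCPU) is
hypothetical, and the %gs-relative lookup is stood in for by an ordinary
pointer so the sketch compiles in user space. It assumes the portable
structure is embedded in the machine-dependent one and that the current
CPU is found through the per-thread data.

```c
#include <assert.h>
#include <stddef.h>

#define NCPU 2                      /* hypothetical CPU count for the sketch */

/* Portable per-CPU state, accessed in the CPU->cpu_cpl style. */
struct cpu {
    int          cpu_id;
    unsigned int cpu_cpl;           /* current priority level mask */
    unsigned int cpu_ipending;      /* pending interrupt mask */
};

/* Machine-dependent wrapper that embeds the portable part (assumption). */
struct md_cpu {
    struct cpu   mc_cpu;
    unsigned int mc_apic_id;        /* local APIC ID on i386 */
};

/* Per-thread data; in a kernel a reserved segment (%gs) would select it. */
struct kthread {
    struct cpu *kt_cpu;             /* CPU this thread is running on */
};

static struct md_cpu cpus[NCPU];

/* Stand-in for the %gs-relative lookup: a plain pointer in this sketch. */
static struct kthread *cur_thread;

/* Give every per-CPU structure its identity. */
static void
cpus_init(void)
{
    for (int i = 0; i < NCPU; i++) {
        cpus[i].mc_cpu.cpu_id = i;
        cpus[i].mc_apic_id = (unsigned int)i;
    }
}

/* Reach the current CPU through the per-thread data. */
static struct cpu *
curcpu(void)
{
    return cur_thread->kt_cpu;
}

/* Moving a thread rewrites one pointer; the thread data stays in place. */
static void
thread_migrate(struct kthread *kt, int cpuid)
{
    assert(cpuid >= 0 && cpuid < NCPU);
    kt->kt_cpu = &cpus[cpuid].mc_cpu;
}
```

In this sketch a caller would run cpus_init(), point cur_thread at a
struct kthread, and call thread_migrate() to place it on a CPU. After a
later migration, curcpu() returns the new CPU's structure while the
address held in cur_thread is unchanged, which is the property that makes
preemption harmless in the per-thread scheme.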