From owner-freebsd-smp Sat Dec 14 09:27:43 1996
Return-Path:
Received: (from root@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id JAA17548 for smp-outgoing; Sat, 14 Dec 1996 09:27:43 -0800 (PST)
Received: from uruk.org (root@faustus.dev.com [198.145.95.253]) by freefall.freebsd.org (8.8.4/8.8.4) with SMTP id JAA17536 for ; Sat, 14 Dec 1996 09:27:38 -0800 (PST)
Received: from uruk.org [127.0.0.1] (erich) by uruk.org with esmtp (Exim 0.53 #1) id E0vYypx-0004tV-00; Sat, 14 Dec 1996 10:29:25 -0800
To: Peter Wemm
cc: smp@freebsd.org, haertel@ichips.intel.com
Subject: Re: some questions concerning TLB shootdowns in FreeBSD
In-reply-to: Your message of "Sat, 14 Dec 1996 23:03:51 +0800." <199612141503.XAA17454@spinner.DIALix.COM>
Date: Sat, 14 Dec 1996 10:29:25 -0800
From: Erich Boleyn
Message-Id:
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

Peter Wemm writes:

> 1: It's async. it does not syncronise the remote processors as it must do,
> or they can get out of sync, slave processors can do updates on stale
> data, etc.

As mentioned in another message, this is bad.

> 2: It does too much work. There are a lot of cases where a global flush
> is done for the local user process on the local cpu. I am not 100% sure
> whether this is needed or not. I can imagine that APTD accesses might
> present a problem if we try to avoid global flushes here.

This is perfectly OK from a functional point of view. Personally, I think
efficiency is less important than getting it to work at this point.

> There was the query about the possibility of speculative execution
> on the PPro being the problem that is breaking the kernel. The
> scenario sounds plausable, but my initial reaction to that was that
> we are doing this from an _interrupt handler_, and I would be very
> suprised if speculative execution from the original code thread
> isn't wound up before going into the interrupt... If not, do we
> need some strategic nop's?

No!
Speculative execution which broke interrupt handlers would be very bad,
in a lot of systems. Perhaps Mike Haertel can comment more clearly, but
my memory claims these kinds of actions were serialized.

There are actually some cases which can break, but as far as I know these
are all bus-propagation issues to external devices.

However, I think an IPI being considered delivered doesn't guarantee that
the target CPU has actually been interrupted (what about interrupts being
masked, for example?). I think it just says the interrupt was accepted by
the queue on the other APIC.

> I'm still digesting it, I am almost worried that we might (shudder!)
> be forced into doing an IPI to stop all the cpu's *before* the
> current cpu changes the page tables, then letting them do the tlb
> flush and letting them proceed. If this actually is a real problem
> this means a much bigger code impact.

I don't think so, but to allay your fears, note that if some page
permissions are changed:

  1) Increasing permission is OK, because that should simply cause
     a false page-fault.

  2) Decreasing permissions can cause the situation where thread/process
     A (perhaps a kernel thread) can be trying to deallocate a page in
     thread/process B which is in the process of accessing the data in
     that page (or might be).

#2 might be considered a race condition, but it also looks like a natural
timing problem that you can't get around anyway. As long as there is
some real rendezvous mechanism (such as mentioned in my last message) for
TLB shootdown IPIs to be acknowledged before the sending CPU continues,
you're guaranteeing that the original thread can't continue until the
other CPU's TLBs are really cleared, which is all that seems important.

--
  Erich Stefan Boleyn                 \_ E-mail (preferred):
Mad Genius wanna-be, CyberMuffin        \__ (finger me for other stats)
Web: http://www.uruk.org/~erich/      Motto: "I'll live forever or die trying"
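
The acknowledged rendezvous described above can be sketched as a small
single-threaded C simulation. Everything in it is hypothetical and is not
FreeBSD code: the names sim_cpu, shootdown, shootdown_start, cpu_step and
shootdown_done, and the three flags, are assumptions made for illustration.
It models only one point from the message: an IPI accepted by the remote
APIC is not the same as the remote CPU having flushed its TLB, so the
sender has to wait for explicit acknowledgements.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one CPU; none of these names come from FreeBSD. */
struct sim_cpu {
    bool tlb_stale;    /* TLB may still cache the old mapping */
    bool ipi_pending;  /* IPI accepted by the local APIC, not yet handled */
    bool intr_masked;  /* CPU is running with interrupts disabled */
};

/* Rendezvous state owned by the sending CPU. */
struct shootdown {
    size_t acks_needed;
    size_t acks_seen;
};

/*
 * Sender side, called after the page tables were changed: flush the
 * local TLB and post an IPI to every other CPU. Posting only means the
 * remote APIC accepted the interrupt, so remote TLBs stay stale for now.
 */
void shootdown_start(struct sim_cpu cpus[], size_t n, size_t self,
                     struct shootdown *sd)
{
    assert(self < n);
    sd->acks_needed = 0;
    sd->acks_seen = 0;
    for (size_t i = 0; i < n; i++) {
        if (i == self) {
            cpus[i].tlb_stale = false;   /* local flush */
            continue;
        }
        cpus[i].tlb_stale = true;
        cpus[i].ipi_pending = true;      /* "delivered", not yet handled */
        sd->acks_needed++;
    }
}

/*
 * Target side, one simulated step: the handler runs only if an IPI is
 * pending and interrupts are unmasked. It flushes, then acknowledges.
 * Returns true if the flush and acknowledgement happened in this step.
 */
bool cpu_step(struct sim_cpu *cpu, struct shootdown *sd)
{
    if (!cpu->ipi_pending || cpu->intr_masked)
        return false;
    cpu->tlb_stale = false;
    cpu->ipi_pending = false;
    sd->acks_seen++;
    return true;
}

/* The sender may continue only once every target has acknowledged. */
bool shootdown_done(const struct shootdown *sd)
{
    return sd->acks_seen == sd->acks_needed;
}
```

In this model a sender would call shootdown_start() and then keep waiting
until shootdown_done() returns true. A CPU with intr_masked set keeps its
stale TLB entry and holds the sender back until it unmasks and runs
cpu_step(). A real kernel would presumably spin on an atomic counter
instead of stepping CPUs by hand.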