From owner-freebsd-smp Sat Dec 14 09:26:12 1996
Return-Path:
Received: (from root@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id JAA17392 for smp-outgoing; Sat, 14 Dec 1996 09:26:12 -0800 (PST)
Received: from ormail.intel.com (ormail.intel.com [134.134.248.3]) by freefall.freebsd.org (8.8.4/8.8.4) with ESMTP id JAA17384 for ; Sat, 14 Dec 1996 09:26:08 -0800 (PST)
From: haertel@ichips.intel.com
Received: from ichips.intel.com (ichips.intel.com [134.134.50.200]) by ormail.intel.com (8.8.4/8.7.3) with ESMTP id JAA12513; Sat, 14 Dec 1996 09:25:48 -0800 (PST)
Received: from pdxcs078.intel.com by ichips.intel.com (8.7.4/jIII) id JAA28180; Sat, 14 Dec 1996 09:23:01 -0800 (PST)
Received: by pdxcs078.intel.com (AIX 3.2/UCB 5.64/SW1.11) id AA57406; Sat, 14 Dec 1996 09:25:51 -0800
Date: Sat, 14 Dec 1996 09:25:51 -0800
Message-Id: <9612141725.AA57406@pdxcs078.intel.com>
To: peter@spinner.dialix.com
Subject: Re: some questions concerning TLB shootdowns in FreeBSD
Cc: dg@root.com, smp@freebsd.org
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

>I'm still digesting it, I am almost worried that we might (shudder!)
>be forced into doing an IPI to stop all the cpu's *before* the
>current cpu changes the page tables, then letting them do the tlb
>flush and letting them proceed. If this actually is a real problem
>this means a much bigger code impact.

You must do precisely this. The x86 architecture includes some complex
instructions that reference the same memory locations more than
once--read-modify-write sequences are the most obvious example. For
various reasons, there is no guarantee that the TLB entries associated
with those memory locations are locked in the TLB, and so they might be
thrashed out due to other activity while those complex instructions are
executing.
If, in the meantime, some other processor has manipulated the
associated PTE in any way that lowers privilege or changes the mapping,
this processor could get a page fault in a *non-restartable* way, since
it would see the mapping and/or privilege changing underfoot, but have
already committed to finishing the instruction (since the privilege
checks are normally only done at the beginning of the instruction).

As for your other question: speculative execution does not continue
past an interrupt. An interrupt is a totally serializing event.
However, once you're in the interrupt handler, speculative execution
could go down a different path than the one the handler actually ends
up taking. Basically every time the processor fetches something from
the Icache that it thinks *might* contain a branch, it is an
opportunity for the processor to go off into la-la land, since it will
simply ask the branch predictor what it thinks and go that way. The
effect of this is speculative pollution of the non-renamed state of the
processor, such as the caches and the TLB entries.

So, for example, in the uniprocessor case, doing this:

        1. flush TLB
        2. manipulate PTE

is not safe, since after (1), the processor may waltz speculatively off
to some code that actually references the PTE before you manipulate it,
and so reload the old translation into the TLB. Instead you must
always:

        1. manipulate PTE
        2. flush TLB

On multiprocessors, there is the additional concern of corrupting state
which must remain invariant during instruction execution on other
processors. So then you need the fully bulletproof code:

        1. IPI to everyone sharing these specific PTEs
        2. wait at barrier until everyone arrives
        3. manipulate PTE
        4. release barrier
        5. everyone (including us) flushes TLBs

Bleah, I know.
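
The five multiprocessor steps above can be sketched as a user-space toy
model. This is only an illustration of the ordering, not kernel code:
threads stand in for CPUs, starting a thread stands in for the IPI,
dictionaries stand in for the page table and the per-CPU TLBs, and the
names `Machine`, `load_tlb` and `shootdown` are all made up for this
example. For simplicity it interrupts every other CPU rather than only
those sharing the PTE.

```python
import threading


class Machine:
    """Toy model: one shared page table, one private TLB (a dict) per CPU."""

    def __init__(self, ncpus):
        self.ncpus = ncpus
        self.page_table = {}                       # vpage -> (frame, writable)
        self.tlbs = [dict() for _ in range(ncpus)]
        self.log = []                              # ordered protocol events
        self.log_lock = threading.Lock()

    def record(self, event):
        with self.log_lock:
            self.log.append(event)

    def load_tlb(self, cpu, vpage):
        """Model a TLB fill: copy the current PTE into this CPU's TLB."""
        self.tlbs[cpu][vpage] = self.page_table[vpage]


def shootdown(machine, initiator, vpage, new_pte):
    """Change one PTE using the five-step ordering from the list above."""
    arrive = threading.Barrier(machine.ncpus)      # everyone has stopped
    release = threading.Barrier(machine.ncpus)     # PTE has been changed

    def ipi_handler(cpu):
        machine.record(("arrived", cpu))
        arrive.wait()                              # step 2: check in, then park
        release.wait()                             # parked until step 4
        machine.tlbs[cpu].pop(vpage, None)         # step 5: flush own TLB entry
        machine.record(("flushed", cpu))

    # Step 1: "send the IPI" by starting a handler for every other CPU.
    remotes = [threading.Thread(target=ipi_handler, args=(cpu,))
               for cpu in range(machine.ncpus) if cpu != initiator]
    for t in remotes:
        t.start()

    arrive.wait()                                  # step 2: wait for everyone
    machine.page_table[vpage] = new_pte            # step 3: manipulate PTE
    machine.record(("pte_changed", initiator))
    release.wait()                                 # step 4: release barrier
    machine.tlbs[initiator].pop(vpage, None)       # step 5: flush our TLB too
    machine.record(("flushed", initiator))

    for t in remotes:
        t.join()
```

In this model no simulated CPU runs anything between checking in and
being released, so the PTE change cannot land in the middle of an
"instruction" on another CPU, and every flush happens only after the
new PTE is in place, matching the manipulate-then-flush rule from the
uniprocessor case.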