Date:      Sat, 14 Dec 1996 10:29:25 -0800
From:      Erich Boleyn <erich@uruk.org>
To:        Peter Wemm <peter@spinner.dialix.com>
Cc:        smp@freebsd.org, haertel@ichips.intel.com
Subject:   Re: some questions concerning TLB shootdowns in FreeBSD 
Message-ID:  <E0vYypx-0004tV-00@uruk.org>
In-Reply-To: Your message of "Sat, 14 Dec 1996 23:03:51 +0800." <199612141503.XAA17454@spinner.DIALix.COM>

Peter Wemm <peter@spinner.dialix.com> writes:

> 1: It's async.  It does not synchronise the remote processors as it must,
> so they can get out of sync, slave processors can do updates on stale
> data, etc.

As mentioned in another message, this is bad.

> 2: It does too much work.  There are a lot of cases where a global flush
> is done for the local user process on the local cpu.   I am not 100% sure
> whether this is needed or not.  I can imagine that APTD accesses might
> present a problem if we try to avoid global flushes here.

This is perfectly OK from a functional point of view.  Personally, I
think efficiency is less important than getting it to work at this point.

> There was the query about the possibility of speculative execution
> on the PPro being the problem that is breaking the kernel. The
> scenario sounds plausible, but my initial reaction to that was that
> we are doing this from an _interrupt handler_, and I would be very
> surprised if speculative execution from the original code thread
> isn't wound up before going into the interrupt...  If not, do we
> need some strategic nops?

No!  Speculative execution that broke interrupt handlers would be very bad
in a lot of systems.  Perhaps Mike Haertel can comment more clearly, but my
memory claims these kinds of actions are serialized.

There are actually some cases which can break, but as far as I know
these are all bus-propagation issues to external devices.

However, I think that an IPI being considered delivered doesn't
guarantee that the target CPU has actually been interrupted (what if
interrupts are masked, for example?).  I think it only says that the
interrupt was accepted into the queue on the other APIC.

> I'm still digesting it.  I am almost worried that we might (shudder!)
> be forced into doing an IPI to stop all the cpus *before* the
> current cpu changes the page tables, then letting them do the tlb
> flush and letting them proceed.  If this actually is a real problem
> this means a much bigger code impact.

I don't think so, but to allay your fears, note that if some page
permissions are changed:

  1)  Increasing permissions is OK, because a stale TLB entry should
	simply cause a false page-fault.
  2)  Decreasing permissions can cause the situation where thread/process
	A (perhaps a kernel thread) is trying to deallocate a page
	belonging to thread/process B, which is (or might be) in the
	middle of accessing the data in that page.

#2 might be considered a race condition, but it also looks like a
natural timing problem that you can't get around anyway.
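
To make the two cases above concrete, here is a rough sketch (not code
from the FreeBSD tree) of how a page-table update could be classified.
The function name and the PERM_MASK macro are made up for illustration;
the three bit values are the i386 PTE present, writable and user bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* i386 page-table-entry bits; the macro names here are illustrative. */
#define PG_V   0x001u   /* present */
#define PG_RW  0x002u   /* writable */
#define PG_U   0x004u   /* user-accessible */
#define PERM_MASK (PG_V | PG_RW | PG_U)

/*
 * Hypothetical helper: returns true when the change from old_pte to
 * new_pte takes away a permission, i.e. case 2 above, where a stale
 * remote TLB entry would let another CPU keep using access it should
 * no longer have.  A pure increase (case 1) returns false, since a
 * stale entry can then only produce a false page-fault.
 */
bool pte_change_removes_permission(uint32_t old_pte, uint32_t new_pte)
{
    uint32_t old_perm = old_pte & PERM_MASK;
    uint32_t new_perm = new_pte & PERM_MASK;

    /* Any bit set before and clear now is a removed permission. */
    return (old_perm & ~new_perm) != 0;
}
```

This only looks at the permission bits; it assumes the physical frame
itself is unchanged, which a real pmap would also have to check.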

As long as there is some real rendezvous mechanism (such as mentioned in
my last message) for TLB shootdown IPIs to be acknowledged before the
sending CPU continues, you're guaranteeing that the original thread can't
continue until the other CPUs' TLBs are really cleared, which is all
that seems important.
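
As a rough illustration of that kind of rendezvous, here is a small
single-threaded simulation in C.  It is only a sketch under stated
assumptions: every name in it (sim_cpu, send_shootdown_ipi, cpu_step,
tlb_shootdown_sim, stale_tlb_count) is invented for the example, the
"CPUs" are stepped round-robin in a loop rather than running in
parallel, and a real kernel would need proper APIC programming and
memory barriers.  The point it tries to show is that queueing the IPI is
not the acknowledgement: the sender only proceeds once every remote
handler has actually run, even if one CPU has interrupts masked for a
while.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MAX_CPUS 8

/* Hypothetical per-CPU state for the simulation. */
struct sim_cpu {
    bool ipi_pending;   /* IPI accepted by this CPU's APIC queue */
    bool intr_masked;   /* CPU currently has interrupts disabled */
    bool tlb_stale;     /* TLB may still hold the old mapping */
};

static struct sim_cpu cpus[MAX_CPUS];
static atomic_int shootdown_acks;

/* "Send" the IPI: this only queues it; no handler has run yet. */
static void send_shootdown_ipi(int self, int ncpus)
{
    atomic_store(&shootdown_acks, 0);
    for (int i = 0; i < ncpus; i++)
        if (i != self)
            cpus[i].ipi_pending = true;
}

/* One simulated step of a remote CPU: take the IPI only if unmasked. */
static void cpu_step(int id)
{
    struct sim_cpu *c = &cpus[id];

    if (c->ipi_pending && !c->intr_masked) {
        c->tlb_stale = false;   /* stands in for the real TLB flush */
        c->ipi_pending = false;
        atomic_fetch_add(&shootdown_acks, 1);   /* the acknowledgement */
    }
}

/* Number of CPUs whose (simulated) TLB still holds the old mapping. */
int stale_tlb_count(int ncpus)
{
    int n = 0;

    for (int i = 0; i < ncpus; i++)
        if (cpus[i].tlb_stale)
            n++;
    return n;
}

/*
 * Sender side.  masked_cpu (or -1 for none) starts with interrupts
 * masked and unmasks them at polling round unmask_after, which must be
 * >= 0 in that case or the loop never ends.  Returns the number of
 * polling rounds the sender had to wait before every remote CPU had
 * acknowledged.
 */
int tlb_shootdown_sim(int self, int ncpus, int masked_cpu, int unmask_after)
{
    int rounds = 0;

    assert(ncpus >= 1 && ncpus <= MAX_CPUS);
    for (int i = 0; i < ncpus; i++)
        cpus[i] = (struct sim_cpu){ .tlb_stale = true };
    if (masked_cpu >= 0)
        cpus[masked_cpu].intr_masked = true;

    cpus[self].tlb_stale = false;       /* local flush */
    send_shootdown_ipi(self, ncpus);

    /* Rendezvous: do not continue until all remote CPUs have acked. */
    while (atomic_load(&shootdown_acks) < ncpus - 1) {
        if (masked_cpu >= 0 && rounds == unmask_after)
            cpus[masked_cpu].intr_masked = false;
        for (int i = 0; i < ncpus; i++)
            if (i != self)
                cpu_step(i);
        rounds++;
    }
    return rounds;
}
```

For example, tlb_shootdown_sim(0, 4, -1, 0) finishes after one round,
while tlb_shootdown_sim(0, 4, 2, 3) keeps the sender waiting until CPU 2
unmasks interrupts; in both cases stale_tlb_count(4) is 0 by the time
the sender is allowed to continue.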

--
  Erich Stefan Boleyn                 \_ E-mail (preferred):  <erich@uruk.org>
Mad Genius wanna-be, CyberMuffin        \__      (finger me for other stats)
Web:  http://www.uruk.org/~erich/     Motto: "I'll live forever or die trying"


