Date: Sat, 14 Dec 1996 10:07:15 -0800
From: Erich Boleyn <erich@uruk.org>
To: Steve Passe <smp@csn.net>
Cc: peter@spinner.dialix.com, haertel@ichips.intel.com, smp@freebsd.org
Subject: Re: TLB shootdown problems? (was -> Re: Tried SMP kernel from early morning CVS tree )
Message-ID: <E0vYyUV-0004qa-00@uruk.org>
In-Reply-To: Your message of "Sat, 14 Dec 1996 03:44:43 MST." <199612141044.DAA14527@clem.systemsix.com>

Steve Passe <smp@csn.net> writes:

> Hi,
>
> > Here's a question (I'm going to look this up myself, but thought it'd
> > be worthwhile to see if you'd shed light on it before I get to it on
> > my copious spare time ;-) ...
> >
> > How exactly are TLB shootdown IPIs implemented? (or are they any
> > different from any other IPIs?)
> >
> > From what I could see, it looks like the IPI is considered "finished"
> > (and the function returns) when the APIC status is "delivered". This
> > could be a problem, because the interrupt doesn't necessarily happen
> > on the other CPU at that point (and it certainly isn't completed at
> > that point). You really need some other mechanism to tell you that
> > the operation has completed before you can continue.
>
> this is an accurate picture of the current situation. we just send it and
> "assume" that things are now 'OK'. We know this isn't correct, it's just
> step one on the way there. It made remarkable improvement on the P5
> machines. So I guess the next step is a rendezvous mechanism to control
> this. If anyone could suggest an effective algorithm for it I could take
> a whack at programming it.

Yes, that was what I thought.

The easiest (and maybe best performing) thing to do is have the sender
spin waiting on bits being twiddled in global memory, then have the
target CPUs' IPI handlers do such twiddling.

The real question at this point is: can only one TLB shootdown be in
progress at any one time?

If so, a good example to look at is Linux-SMP:

Linux-SMP has a bitwise (since SMP-capable x86es have bitwise test and
test-and-set operators) mask "smp_invalidate_needed". There is one bit
for each CPU. When an invalidate is needed on a particular CPU, the
corresponding bit is set atomically. Whenever a TLB invalidate is made
on a particular CPU, the corresponding bit is cleared atomically.
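
To make the per-CPU bit idea concrete, here is a minimal user-space sketch
in C11. It is not the Linux-SMP code: the helper names, the NCPU value and
the use of <stdatomic.h> in place of the x86 bit instructions are all
assumptions made for illustration. Only the mask name comes from the
description above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NCPU 4  /* assumption: small fixed CPU count for illustration */

/* One bit per CPU; a set bit means that CPU still owes a TLB invalidate. */
static atomic_uint smp_invalidate_needed;

/* Requester side (hypothetical helper): atomically set the bit for `cpu`. */
static void request_invalidate(unsigned cpu)
{
    assert(cpu < NCPU);
    atomic_fetch_or(&smp_invalidate_needed, 1u << cpu);
}

/* Query (hypothetical helper): does `cpu` still owe an invalidate? */
static bool invalidate_pending(unsigned cpu)
{
    assert(cpu < NCPU);
    return (atomic_load(&smp_invalidate_needed) & (1u << cpu)) != 0;
}

/*
 * Target side (hypothetical helper): call after `cpu` has flushed its TLB.
 * Atomically clears the bit and reports whether it had been set.
 */
static bool ack_invalidate(unsigned cpu)
{
    assert(cpu < NCPU);
    unsigned old = atomic_fetch_and(&smp_invalidate_needed, ~(1u << cpu));
    return (old & (1u << cpu)) != 0;
}
```

Because the set and the clear are each a single atomic read-modify-write,
two CPUs touching different bits of the same word never lose each other's
update.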

There are ways to play with that so not all CPUs need be sent messages
all the time, plus Linux-SMP does TLB invalidates in its global
spinlock, etc. It also doesn't necessarily need to try to send the
"smp_invalidate" message right after the pmap change, just when it
expects to need to see it locally or globally... this allows time in
which other CPUs could do invalidates.

This kind of thing would provide a moderate base on which to make it
more fine-grained over time.

A simple version which could use the same mechanism would be to have
the IPI handler do the right thing, but just have the "smp_invalidate"
message set all the "smp_invalidate_needed" bits (except our own!) for
now, to get everything working. Avoiding setting bits for CPUs which
don't need invalidates could be done later this way without changing
the reception mechanism at all.

For a kernel architecture which is multi-threaded/re-entrant, things
get more complicated. I still have an algorithm in mind, but it's just
a bit long to put here right now (essentially, you have to be able to
guarantee, if there are multiple TLB invalidates flying around, that
both the right things happen, and they both terminate reasonably).

--
  Erich Stefan Boleyn                 \_ E-mail (preferred):  <erich@uruk.org>
Mad Genius wanna-be, CyberMuffin        \__      (finger me for other stats)
Web:  http://www.uruk.org/~erich/     Motto: "I'll live forever or die trying"
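
The "simple version" described in the message (set every bit except our
own, then spin until the targets have cleared theirs) might look roughly
like the sketch below. It assumes only one shootdown is in flight at a
time. All function names, NCPU and the flushes[] counter are hypothetical,
and the IPI itself is not sent: in this user-space sketch the handler is
simply called by hand where a real kernel would run it on the target CPU.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NCPU 4  /* assumption: fixed CPU count for the sketch */

static atomic_uint smp_invalidate_needed;  /* bit n set: CPU n owes a flush */
static unsigned flushes[NCPU];             /* stand-in for the real TLB flush */

/* Sender: mark every CPU except ourselves. A real kernel would send the
 * IPI right after this; the sketch only returns the target mask. */
static unsigned shootdown_begin(unsigned self)
{
    assert(self < NCPU);
    unsigned targets = ((1u << NCPU) - 1u) & ~(1u << self);
    atomic_fetch_or(&smp_invalidate_needed, targets);
    return targets;
}

/* Target: body of the (hypothetical) IPI handler running on `cpu`.
 * Flush first, then clear the bit, so the sender never sees "done" early. */
static void invalidate_ipi_handler(unsigned cpu)
{
    assert(cpu < NCPU);
    flushes[cpu]++;  /* placeholder for the actual TLB invalidate */
    atomic_fetch_and(&smp_invalidate_needed, ~(1u << cpu));
}

/* Sender: have all targets cleared their bits yet? */
static bool shootdown_done(unsigned targets)
{
    return (atomic_load(&smp_invalidate_needed) & targets) == 0;
}

/* Sender: spin until every target has acknowledged. */
static void shootdown_wait(unsigned targets)
{
    while (!shootdown_done(targets))
        ;  /* busy-wait; a real kernel would add a pause/relax hint here */
}
```

Narrowing the target mask later to only the CPUs that actually need the
invalidate changes shootdown_begin() alone; the handler and the wait loop
stay as they are.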