From: Seigo Tanimura
To: John Baldwin
Cc: Seigo Tanimura, current@FreeBSD.org
Date: Tue, 28 May 2002 17:10:42 +0900
Subject: Re: preemption across processors
Message-Id: <200205280810.g4S8Ah3i071756@rina.r.dl.itc.u-tokyo.ac.jp>
References: <200205151106.g4FB6Z3i059559@rina.r.dl.itc.u-tokyo.ac.jp>
Organization: Digital Library Research Division, Information Technology Centre, The University of Tokyo

On Wed, 15 May 2002 08:21:46 -0400 (EDT),
  John Baldwin said:

jhb> On 15-May-2002 Seigo Tanimura wrote:
>> Currently, a new runnable thread cannot preempt the thread on any
>> processor other than the one on which mi_switch() was called.
>> For instance, we do something like the following in
>> _mtx_unlock_sleep():
>>
>> --- v --- _mtx_unlock_sleep() --- v ---
>> setrunqueue(th_waken_up);
>> if (curthread->preemptable &&
>>     th_waken_up->priority < curthread->priority) {
>>         setrunqueue(curthread);
>>         mi_switch();
>> }
>> --- ^ --- _mtx_unlock_sleep() --- ^ ---
>>
>> If the priority of curthread is higher than that of th_waken_up, we
>> cannot run th_waken_up immediately even if there is another processor
>> running a thread with a priority lower than th_waken_up. th_waken_up
>> should preempt that processor, or we would end up with a priority
>> inversion.
>>
>> Maybe we have to dispatch a runnable thread to the processor running
>> a thread with the lowest priority. Solaris seems to take the
>> following steps to do that:
>>
>> 1. If a new thread has slept for longer than 3/100 seconds (this
>>    should be tunable), linearly search for the processor running a
>>    thread with the lowest priority. Otherwise, choose the processor
>>    that ran the new thread most recently.
>>
>> 2. Send an inter-processor interrupt to the processor chosen in 1.
>>
>> 3. The chosen processor puts its current thread back on the dispatch
>>    queue and performs a context switch to run the new thread.
>>
>> The above is only a rough sketch. We have to watch out for a race
>> between an inter-processor interrupt and a processor entering a
>> critical section.
>>
>> If no one is working on preemption across processors, I would like to
>> see if I can do that.

jhb> I actually think that the little gain this brings isn't worth the extra
jhb> effort involved personally. We don't have to get things perfect, getting
jhb> them reasonably close is good enough for some things. However, that is
jhb> only my opinion. If the code to support this is relatively clean and
jhb> simple with low impact in the normal case then I would support it. However,
jhb> there are several tricky race conditions here so I'm not sure it can be
jhb> done simply.
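
Step 1 of the quoted Solaris outline can be sketched in C as below. This is only a minimal user-space illustration, not the Solaris or FreeBSD code: the names cpu_state, new_thread, choose_cpu(), lowest_priority_cpu() and SLEEP_THRESHOLD_TICKS are hypothetical, the 100 Hz tick is an assumption, and a numerically smaller value is taken to mean a more urgent priority, as in the quoted comparison.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical sketch of the CPU selection in step 1.
 * Convention assumed here: smaller number = more urgent priority.
 */

#define SLEEP_THRESHOLD_TICKS 3  /* 3/100 s, assuming a 100 Hz clock tick */

struct cpu_state {
    int cur_pri;        /* priority of the thread now running on this CPU */
};

struct new_thread {
    int last_cpu;       /* CPU that ran this thread most recently */
    int slept_ticks;    /* how long the thread has been asleep, in ticks */
};

/* Linear search for the CPU running the least urgent thread. */
static int
lowest_priority_cpu(const struct cpu_state *cpus, int ncpu)
{
    int victim = 0;

    assert(cpus != NULL && ncpu > 0);
    for (int i = 1; i < ncpu; i++)
        if (cpus[i].cur_pri > cpus[victim].cur_pri)
            victim = i;     /* ties keep the lowest-numbered CPU */
    return victim;
}

/* Long sleepers go to the least urgent CPU; others return to their last CPU. */
static int
choose_cpu(const struct cpu_state *cpus, int ncpu, const struct new_thread *td)
{
    assert(td != NULL && td->last_cpu >= 0 && td->last_cpu < ncpu);
    if (td->slept_ticks > SLEEP_THRESHOLD_TICKS)
        return lowest_priority_cpu(cpus, ncpu);
    return td->last_cpu;
}
```

With four CPUs running threads of priority 10, 40, 25 and 40, a thread that slept for 5 ticks would be sent to CPU 1, while a thread that slept for 1 tick and last ran on CPU 2 would go back to CPU 2. Steps 2 and 3, and the race with critical sections, are not modelled here.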

The prototype patch is at:

http://people.FreeBSD.org/~tanimura/patches/ippreempt.diff.gz

and in the p4 depot:

//depot/user/tanimura/ippreempt/...

The patch is for i386 only at the moment. The following is a brief
description of the patch:

--- v --- Description --- v ---
Overview:

For a newly runnable thread, setrunqueue() calls chooseprocessor() to
find the processor running the thread with the lowest priority.
setrunqueue() then marks the priority of the new thread on the
processor chosen for preemption. If the chosen processor is not the
current processor, setrunqueue() notifies it by sending a preemption
IPI, and the IPI handler there calls dispatchthread(). If the current
processor is chosen for preemption, setrunqueue() calls
dispatchthread() directly.

dispatchthread() grabs the thread with the highest priority from the
run queue. If the current thread is running and has a higher priority
than the thread grabbed, dispatchthread() returns. Otherwise,
dispatchthread() puts the current thread back on the run queue (if it
is not an idle thread) and switches to the thread grabbed. If the
current thread is going to sleep (i.e. its state is SSLEEP, SSTOP,
etc.), we always switch to the thread grabbed.

Implementation:

Call dispatchthread() instead of mi_switch() in msleep(), cv_*wait*(),
etc. in order to give up the current processor.

Since setrunqueue() now handles preemption itself, maybe_resched() in
wakeup() and the preemption check in _mtx_unlock_sleep() are no longer
required.

If it is not appropriate to preempt the current processor, call
setrunqueue() in a critical section. Note that setrunqueue() may
dispatch the thread passed to a processor other than the current one.

Miscellaneous stuff:

If a thread spins on an adaptive mutex, propagate its priority to the
owner thread of the mutex. This prevents preemption of the owner
thread by a thread whose priority lies between those of the owner
thread and the spinning thread.

To make room in the IPI priority space for the preemption IPI, raise
the IPI priority of Xcpustop and Xinvltlb by one.

An idle processor no longer has to check whether or not there is a
runnable thread. Halt an idle processor in an SMP kernel as in a UP
kernel.
--- ^ --- Description --- ^ ---

The time taken to configure, make depend, compile and link a GENERIC
kernel was measured with time(1) on the vanilla kernel and on the
patched one. Both kernels omit INVARIANTS, INVARIANT_SUPPORT and
WITNESS*. The spec of the test machine is:

CPU: dual Pentium II 450MHz
RAM: 256MB
HDD: one IDE 2GB

Tests were done in single-user mode immediately after reboot. Make(1)
was run with -j16 for compilation and linking by the kernel-depend
target. The following results are the averages of five tests, in
seconds:

                    Real      User      Sys
Vanilla kernel:   552.10    872.34    84.81
Patched kernel:   553.38    873.96    85.16

The results on the vanilla kernel varied by about one second around
the average, and the patched kernel is 1.28 seconds (about 0.2%)
slower in real time, so we can say that the patched kernel achieves
almost the same performance as the vanilla one.

-- 
Seigo Tanimura
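
The decision flow in the Overview of the description above can be sketched roughly as follows. This is a user-space model written for illustration only and is not taken from the patch: setrunqueue_sketch(), choose_processor(), should_switch() and the struct and enum names are hypothetical, sending the IPI and switching contexts are reduced to return values, and a numerically smaller value is assumed to mean a higher priority.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for thread states (running vs. SSLEEP, SSTOP, ...). */
enum td_state { TD_RUNNING, TD_GOING_TO_SLEEP };

/* What setrunqueue() does once a processor has been chosen. */
enum action { ACT_LOCAL_DISPATCH, ACT_SEND_IPI };

struct cpu {
    int cur_pri;        /* priority of the running thread; smaller = higher */
    int marked_pri;     /* priority of the new thread marked for preemption */
};

/* Model of chooseprocessor(): CPU running the lowest-priority thread. */
static int
choose_processor(const struct cpu *cpus, int ncpu)
{
    int chosen = 0;

    assert(cpus != NULL && ncpu > 0);
    for (int i = 1; i < ncpu; i++)
        if (cpus[i].cur_pri > cpus[chosen].cur_pri)
            chosen = i;
    return chosen;
}

/* Model of setrunqueue(): choose, mark, then dispatch locally or via IPI. */
static enum action
setrunqueue_sketch(struct cpu *cpus, int ncpu, int self, int new_pri,
    int *chosen_out)
{
    int chosen = choose_processor(cpus, ncpu);

    cpus[chosen].marked_pri = new_pri;
    *chosen_out = chosen;
    return chosen == self ? ACT_LOCAL_DISPATCH : ACT_SEND_IPI;
}

/* Model of the test in dispatchthread(): true means switch threads. */
static bool
should_switch(enum td_state cur_state, int cur_pri, int grabbed_pri)
{
    if (cur_state != TD_RUNNING)
        return true;                /* going to sleep: always switch */
    return !(cur_pri < grabbed_pri); /* stay only if strictly higher priority */
}
```

For example, with two CPUs running threads of priority 20 and 60, a call made on CPU 0 chooses CPU 1 and reports that an IPI is needed. The sketch leaves out locking, the run queue itself, and the race between the IPI and a critical section.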