Date: Thu, 12 Sep 2002 05:18:34 +1000 (EST)
From: Bruce Evans <bde@zeta.org.au>
To: Luigi Rizzo <rizzo@icir.org>
Cc: smp@FreeBSD.ORG
Subject: Re: wakeup handling on SMP boxes
Message-ID: <20020912041647.U3346-100000@gamplex.bde.org>
In-Reply-To: <20020911083854.A88921@iguana.icir.org>

On Wed, 11 Sep 2002, Luigi Rizzo wrote:

> I have a question about the handling of wakeup on SMP machine.
> ...
> My understanding of the behaviour is that:
>
> + the processor handling the wakeup will suspend the
>   curproc and, eventually, invoke need_resched();

The suspension is only logical. Physically, the current process just
runs wakeup() to completion. Any context switching before wakeup()
returns is accidental in -current and a bug in RELENG_4.

> + on this same processor, the priority of the newly awaken process
>   is compared with the one of the suspended process;
>
> + if the comparison succeeds, the suspended process is preempted
>   and the new one runs; otherwise, the new process will have a
>   chance at the next voluntary descheduling or roundrobin();

If the newly awakened process has a higher priority than the current
process, then the current process's rescheduling flag is set and the
current process is switched away from some time later, often (typically,
I think) not until it returns to user mode. In -current, a switch may
also occur as a side effect of locking or in response to an interrupt.
In RELENG_4, it is fairly fundamental that processes in the kernel
don't get preempted.

> Am I correct ?
> This seems to suggest that the priority ordering might be violated
> for as much as kern.quantum, after which the roundrobin() and
> forward_roundrobin() will do the right thing.

Only if the process stays in the kernel that long. The quantum is not
very relevant here, since it doesn't affect processes in the kernel
(until they leave the kernel). The "hogticks" hack that I added to
uiomove() works around this in some cases. It is possible for a
process to spend a long time in uiomove() (e.g., 1 second for reading
10-20MB from /dev/zero on a slow 486).

Things are better in -current.
Some interrupts cause context switching (fairly soon if not
immediately), and after the interrupt is handled the kernel just runs
the highest priority running process in a round robin fashion (instead
of the previous one; it really should run the previous process again if
it has equally highest priority and its quantum has not expired, since
not doing so bogotifies the quantum). Blocking on locks also causes
context switching.

> The only reason why this more or less works in practice is that the
> sleeping process likely has raised its priority in the tsleep()
> call, so it will preempt the process running on the processor
> handling the wakeup(). On the other hand, there is no guarantee
> that this process is the one with the lowest priority among those
> currently running.

Processes are supposed to sleep at a high (or suitable) priority so
that they run soon (enough) after they wake up. This works OK except
for the above problem and the bugfeature that processes retain their
high sleep priority for a long time after they wake up (typically all
the way back to user mode!).

> I guess to fix this one would need to determine if one of the
> processes needs to be kicked out and replaced with the new one,
> by invoking an Xcpuast IPI on the specific processor.
>
> Any reason why this is not done ? Is the call too expensive so
> one prefers to tolerate the temporary inconsistency ?

I think it would be too expensive to switch contexts on every
significant priority change even under !SMP. Rescheduling now works
much the same under SMP as under !SMP: a flag is set, but it doesn't
cause the current process to give up control until it reaches the user
boundary. But there is a difference if the process is in user mode
when the wakeup occurs. This can only happen in the SMP case. An
Xcpuast IPI would work like any other interrupt for kicking the process
into kernel mode so that it checks the flag on the way back.

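
To make the flag-then-switch-later behaviour concrete, here is a toy
user-space model in C. It is only a sketch under stated assumptions and
is not kernel code: every name in it (toy_proc, toy_cpu, toy_wakeup,
toy_userret, toy_interrupt_from_user) is hypothetical, the run queue is
reduced to a single slot, and lower pri values are taken to mean higher
priority.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy user-space model; none of these names are FreeBSD kernel symbols. */
struct toy_proc {
    int  pri;          /* assumption: lower value means higher priority */
    bool need_resched; /* set by toy_wakeup(), examined at the user boundary */
};

struct toy_cpu {
    struct toy_proc *curproc; /* process currently on this CPU */
    struct toy_proc *waiting; /* one-slot stand-in for the run queue */
};

/* Make p runnable.  This never switches; at most it flags curproc. */
void toy_wakeup(struct toy_cpu *cpu, struct toy_proc *p)
{
    assert(cpu->waiting == NULL); /* the sketch holds a single waiter */
    cpu->waiting = p;
    if (p->pri < cpu->curproc->pri)
        cpu->curproc->need_resched = true;
}

/*
 * Return to user mode: the only place this model switches.
 * Returns true if a switch happened.
 */
bool toy_userret(struct toy_cpu *cpu)
{
    struct toy_proc *old = cpu->curproc;

    if (!old->need_resched || cpu->waiting == NULL)
        return false;
    old->need_resched = false;
    cpu->curproc = cpu->waiting;
    cpu->waiting = old;
    return true;
}

/*
 * An interrupt taken in user mode (for example a hypothetical AST IPI)
 * enters the kernel and leaves again through the user boundary, so the
 * flag gets checked on the way back.
 */
bool toy_interrupt_from_user(struct toy_cpu *cpu)
{
    return toy_userret(cpu);
}
```

In this model, calling toy_wakeup() for a higher-priority process leaves
curproc unchanged and only sets need_resched; the swap happens at the
next toy_userret() or toy_interrupt_from_user() call, however long the
current process takes to get there.
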
Bruce

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-smp" in the body of the message