Date: Fri, 28 Apr 2000 21:08:06 +0000 (GMT)
From: Terry Lambert <tlambert@primenet.com>
To: dillon@apollo.backplane.com (Matthew Dillon)
Cc: jgowdy@home.com (Jeremiah Gowdy), smp@csn.net (Steve Passe), jim@thehousleys.net (James Housley), freebsd-smp@FreeBSD.ORG
Subject: Re: hlt instructions and temperature issues
Message-ID: <200004282108.OAA01313@usr08.primenet.com>
In-Reply-To: <200004280142.SAA07744@apollo.backplane.com> from "Matthew Dillon" at Apr 27, 2000 06:42:48 PM

> :In this piece of code:
> :------------------------------------
> :ENTRY(default_halt)
> :sti
> :#ifndef SMP
> :hlt /* XXX: until a wakeup IPI */
> :#ifdef SMP
> :#ifdef CHEAP_TPR
> :movl $0, lapic_tpr
> :#else
> :andl $~APIC_TPR_PRIO, lapic_tpr
> :#endif /** CHEAP_TPR */
> :#endif
> :hlt
> :ret
>
> Umm... where'd you get the above code?  This is not the current
> halt code for 3.x, 4.x, or 5.x.

This was Loqui's patch; in it he suggested replacing (in swtch.s):

    ENTRY(default_halt)
        sti
    #ifndef SMP
        hlt                     /* XXX: until a wakeup IPI */
    #endif
        ret

With:

    ENTRY(default_halt)
        sti
    #ifdef SMP
    #ifdef CHEAP_TPR
        movl    $0, lapic_tpr
    #else
        andl    $~APIC_TPR_PRIO, lapic_tpr
    #endif /** CHEAP_TPR */
    #endif
        hlt
        ret

Some people have (correctly) pointed out that this would slow down
SMP operations, since it reduces halted CPUs to "wake on int".
This is correct.

Some people have also pointed out that the TPR is already 0 when
the "hlt" would have been executed.  I'm not positive about this
in the "just finished handling a fastintr" case.

Others have complained about the "air gap" between the "sti" and
the "hlt".  I think that this is not really an issue, but it would
be very easy to rectify if it were.

It's clearly not an issue if the TPR claims are correct, and the
new code merely removes the "#ifndef SMP/#endif" directives.

The comment "until a wakeup IPI" applies to the case:

    when releasing the BGL while leaving the scheduler, with a
    process still on the ready-to-run queue, and a CPU that
    could take it having been halted

...at least in the scheduling code as it currently sits.

So it's pretty trivial to fix the "slows to a crawl" problem, and
for the person with the 8 processor system to verify that it is
fixed for us (having seen the "slows to a crawl" problem, in
person).

The comment about the TPR level for the lock holder vs. the
"hlt"'ed processor is a valid point.
I think that there is, however, on an NCPU > 2 machine, a new
"thundering herd" problem, if all halted CPUs have a TPR of 0,
and the IPI is a broadcast IPI that wakes them all.

I would be very tempted to have broadcast IPIs of a high level,
with the lock holder at a higher level (2 * NCPU + 1), and an
unblocked processor at yet a higher level, and then an entry
count for the TPR for processors, as they "get in line" for
the "hlt".

Then you could IPI with the min of the number of processes
waiting in the ready-to-run state plus the number of processors.
Each CPU would subtract the IPI level from its TPR, and, if
zero or less, "go live" on one of the ready-to-run processes
after setting the highest ("running") TPR.  Otherwise, the CPU
would decrement its TPR by the remainder, and go back to sleep.

This would provide a generic "wakeone/waken/wakeup" mechanism,
which should be the most efficient for a single, system-wide
ready-to-run queue.

We don't care that we wake up the CPUs with no work to do and
send them right back to sleep, since they weren't doing anything
valuable anyway.

This scheme would not provide completely optimal "hlt"-ness, but
it would provide the largest amount of "hlt"-ness which would
not unduly slow the system relative to no "hlt" at all.

It seems to me the best trade-off between running temperature
and the optimal amount of work you can squeeze out of the system.

                    Terry Lambert
                    terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.
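
The "get in line" wake-up scheme proposed in the message can be sketched as a small user-space C model. This is only an illustration under stated assumptions, not kernel code and not anything from the FreeBSD tree: the names `NCPU`, `struct cpu`, `struct machine`, `cpu_halt` and `broadcast_wakeup` are all hypothetical, the `line_pos` field merely stands in for the TPR entry count, and the IPI level is read here as the smaller of the ready-to-run count and the CPU count, which is one possible reading of the text. The separate lock-holder and "running" TPR levels are left out.

```c
#include <assert.h>

#define NCPU 4  /* hypothetical CPU count for this model */

struct cpu {
    int halted;    /* nonzero while the CPU is parked in "hlt" */
    int line_pos;  /* stand-in for the TPR entry count: place in the "hlt" line */
};

struct machine {
    struct cpu cpus[NCPU];
    int waiting;   /* number of CPUs currently in line */
};

/* A CPU with nothing to run "gets in line" and halts. */
static void cpu_halt(struct machine *m, int id)
{
    assert(id >= 0 && id < NCPU && !m->cpus[id].halted);
    m->cpus[id].halted = 1;
    m->cpus[id].line_pos = ++m->waiting;  /* first in line gets 1 */
}

/*
 * Model of one broadcast IPI.  Every halted CPU wakes briefly and
 * subtracts the IPI level from its place in line.  At zero or less it
 * "goes live"; otherwise it keeps the remainder and halts again.
 * Returns the number of CPUs that went live.
 */
static int broadcast_wakeup(struct machine *m, int nready)
{
    /* Assumed reading: level = min(ready-to-run processes, NCPU). */
    int level = nready < NCPU ? nready : NCPU;
    int woken = 0;

    if (level <= 0)
        return 0;  /* nothing to run, no IPI sent */

    for (int i = 0; i < NCPU; i++) {
        struct cpu *c = &m->cpus[i];

        if (!c->halted)
            continue;
        c->line_pos -= level;
        if (c->line_pos <= 0) {
            c->halted = 0;      /* take a ready-to-run process */
            c->line_pos = 0;
            m->waiting--;
            woken++;
        }
        /* else: remainder stays in line_pos, CPU goes back to "hlt" */
    }
    return woken;
}
```

In this model, with three CPUs in line and two ready-to-run processes, one broadcast makes the first two CPUs go live and moves the third to the head of the line, which is the "waken" behaviour the message describes; a level of 1 gives "wakeone" and a level of NCPU gives "wakeup".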