From: John Baldwin
To: Julian Elischer
cc: FreeBSD current users
Subject: RE: SMP and setrunnable()- scheduler 4bsd
Date: Thu, 10 Jul 2003 15:22:09 -0400 (EDT)
List-Id: Discussions about the use of FreeBSD-current

On 10-Jul-2003 Julian Elischer wrote:
> OK so I return with some numbers....
>
>
> On Tue, 8 Jul 2003, John Baldwin wrote:
>
>>
>> On 08-Jul-2003 Julian Elischer wrote:
>> > It looks to me that if we make a thread runnable
>> > and there is a processor in the idle loop, the idle processor should be
>> > kicked in some way to make it go get the newly runnable thread.
>> >
>> > If the processors are halting in the idle loop however, it may take
>> > quite a while for the new work to be noticed..
>> > (possibly up to milliseconds I think)
>> >
>> > Is there a mechanism to send an IPI to particular processors?
>> > or is it just broadcast?
>> >
>> >
>> > I think we would be better served to alter idle_proc(void *dummy)
>> > (or maybe choosethread()) to increment or decrement a count
>> > of idle processors (atomically of course) so that
>> > setrunnable (or its lower parts) can send that IPI
>> > and get the idle processor into action as soon as a thread is
>> > available.
>> >
>> > I have not seen any such code but maybe I'm wrong....
>>
>> This is why HLT is not enabled in SMP by default (or at least was,
>> it may be turned on now).  Given that the clock interrupts are
>> effectively broadcast to all CPUs one way or another for all
>> architectures (that I know of), you will never halt more than the
>> interval between clock ticks on any CPU.
>
>
> So here are some figures..
> dual# sysctl machdep.cpu_idle_hlt
> machdep.cpu_idle_hlt: 1
> 307.773u 93.000s 4:22.17 152.8% 3055+5920k 51+1046io 284pf+0w
> 307.762u 93.082s 4:23.22 152.2% 3061+5925k 4+1012io 8pf+0w
>
> dual# sysctl machdep.cpu_idle_hlt=0
> machdep.cpu_idle_hlt: 1 -> 0
>
> 357.264u 115.377s 4:25.21 178.2% 3150+5982k 7+1021io 8pf+0w
> 356.193u 116.551s 4:24.70 178.5% 3145+5980k 5+991io 8pf+0w
>
> reboot to kernel with IPIs for idle processors..
> (patch available)
>
> dual# sysctl machdep.cpu_idle_hlt
> machdep.cpu_idle_hlt: 1
>
> 308.113u 90.422s 4:19.46 153.5% 3061+5941k 13+989io 22pf+0w
> 308.430u 93.501s 4:22.86 152.9% 3045+5897k 70+1022io 8pf+0w
>
> dual# sysctl machdep.cpu_idle_hlt=0
> machdep.cpu_idle_hlt: 1 -> 0
> 357.809u 113.757s 4:24.12 178.5% 3148+6020k 31+1016io 8pf+0w
> 356.193u 115.195s 4:24.22 178.4% 3150+5983k 30+1029io 8pf+0w
>
> dual# sysctl machdep.cpu_idle_hlt=1
> machdep.cpu_idle_hlt: 0 -> 1
> 308.132u 92.196s 4:23.15 152.1% 3044+5910k 30+1033io 8pf+0w
> 307.504u 93.581s 4:23.22 152.3% 3047+5913k 29+1055io 8pf+0w
>
> What is so stunning is the massive increase in user time
> for the case where the cpu is not being idled.
> I'm hoping this is a statistical artifact of some sort..

I don't think it is, but you'd need more samples to be truly
confident.  One possible reason: having the CPUs not halt means
that idle CPUs bang on the runq state continuously.  Perhaps this
can penalize the non-idle CPUs due to cache interactions both when
the non-idle CPUs are manipulating the queues and also by making the
cache lines holding the queue state always be resident and not
allowing their effective use by the real code executing on other CPUs.

> either way, the times are almost identical.
> Having the cpu halt during idle time seems to be
> slightly faster (1 second out of 250? not too significant)
> It would however be good to see thread wakeup latency times.
> (I'll work on that)
>
> The patch to send an IPI when a thread becomes runnable and there
> are idle CPUs seems to not hurt this case at least.
> it may however make a lot of difference in the case of
> KSE threads waking each other up..
> I'll do some tests.

Yes.  As it stands now, adding the IPI would just make things more
complex for no gain.  However, if this IPI is present, then we can engage
in perhaps more drastic measures like really putting a CPU to sleep
(perhaps disabling interrupts to it?)
until it is needed which might bring significant power and heat savings
to idle SMP machines.

> It seems however that having the halt on idle turned on is the
> right thing these days. (which is the current default)
> but the odd user times are a worry.

I'm sure Terry is all torn up by that conclusion. :-P

--

John Baldwin <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/
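
As a quick arithmetic check on the figures quoted in this message, the sketch below parses two of the timing lines and recomputes the CPU percentage as (user + system) / elapsed. It assumes the lines come from the csh/tcsh built-in `time` with its default output format, which the thread does not state. The helper names `parse_time_line` and `cpu_percent` are made up for illustration.

```python
import re

# Assumed csh/tcsh `time` layout: <user>u <system>s <m:ss.ss elapsed> <cpu>% ...
TIME_RE = re.compile(r"([\d.]+)u ([\d.]+)s (?:(\d+):)?([\d.]+) ([\d.]+)%")


def parse_time_line(line):
    """Return (user_s, system_s, elapsed_s, reported_pct) for one timing line."""
    m = TIME_RE.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognized time line: {line!r}")
    user, system, minutes, seconds, pct = m.groups()
    elapsed = 60 * int(minutes or 0) + float(seconds)
    return float(user), float(system), elapsed, float(pct)


def cpu_percent(user, system, elapsed):
    """CPU utilization as a percentage of wall-clock time (can exceed 100 on SMP)."""
    return 100.0 * (user + system) / elapsed


# First run of each setting on the stock kernel, copied from the quoted figures.
hlt_on = parse_time_line(
    "307.773u 93.000s 4:22.17 152.8% 3055+5920k 51+1046io 284pf+0w")
hlt_off = parse_time_line(
    "357.264u 115.377s 4:25.21 178.2% 3150+5982k 7+1021io 8pf+0w")

for label, (user, system, elapsed, reported) in (("hlt=1", hlt_on),
                                                 ("hlt=0", hlt_off)):
    print(f"{label}: user+sys={user + system:.3f}s elapsed={elapsed:.2f}s "
          f"cpu={cpu_percent(user, system, elapsed):.1f}% (reported {reported}%)")

# Extra CPU time charged, and extra wall-clock time, with halting disabled.
extra_cpu = (hlt_off[0] + hlt_off[1]) - (hlt_on[0] + hlt_on[1])
extra_elapsed = hlt_off[2] - hlt_on[2]
print(f"hlt=0 vs hlt=1: {extra_cpu:+.3f}s CPU, {extra_elapsed:+.2f}s elapsed")
```

For these two runs, the hlt=0 case is charged about 72 more CPU-seconds while finishing about 3 seconds later in wall-clock time. That is the gap both posters describe, and two runs per setting are too few to settle whether it is an artifact.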