Date: Mon, 9 Nov 2015 16:22:21 +1100 (EST)
From: Bruce Evans <brde@optusnet.com.au>
Cc: freebsd-bugs@freebsd.org
Subject: Re: [Bug 204376] Cavium ThunderX system heavily loaded while at db> prompt
Message-ID: <20151109134228.U969@besplex.bde.org>
In-Reply-To: <bug-204376-8-9NElBrKWdX@https.bugs.freebsd.org/bugzilla/>
References: <bug-204376-8@https.bugs.freebsd.org/bugzilla/> <bug-204376-8-9NElBrKWdX@https.bugs.freebsd.org/bugzilla/>
On Sun, 8 Nov 2015 comment-ignorer@freebsd.org wrote:

> --- Comment #1 from NGie Cooper <ngie@FreeBSD.org> ---
> It's not just arm64; amd64 does/did a horrible job at yielding when in
> the debugger (part of the reason why we have a script which goes and
> suspends test VMs at $work if/when they panic).

I thought it worked OK since it uses cpu_pause().  But Haswell runs much
hotter with cpu_pause() than with an empty loop.  monitor+mwait works
better.  For normal idling, here it runs about 10 degrees C hotter than
ACPI C2.  Loops in userland without cpu_pause() (foo: jmp foo) run 25-30
degrees hotter.  Loops in userland with cpu_pause() (foo: pause; jmp foo)
run 25-35 degrees hotter.  ddb is about the same as this.  But normal
idling with cpu_pause() runs 40-45 degrees hotter.  So ddb saves power
compared with misconfigured normal idling :-).  (These loops, and the
others discussed below, are sketched in a PS at the end of this message.)

> Conrad had a patch out for amd64 a few months ago which yielded in the
> debugger a bit on amd64, but IIRC there were issues at the time. I'll
> let him comment on it though.
>
> It would be nice if dropping into the debugger didn't spin all the CPUs
> at ~100% though.

CPUs stopped by the debugger or anything else cannot do any normal
yielding or interrupt handling.  Perhaps they can halt with interrupts
disabled and depend on an NMI to wake them up.  NMIs are hard to program,
especially near kdb, but stopping CPUs already depends on NMIs to
promptly stop ones that are spinning with interrupts disabled.  When a
CPU is stopped, it must not handle any NMIs, but just let them kick it
out of a halt, and halt again unless the start/stop masks allow it to
proceed.

I think all console drivers are too stupid to use cpu_pause() in
spinloops waiting for input, so sitting at the debugger prompt burns at
least 1 CPU, but since cpu_pause() apparently doesn't work very well it
might not make much difference.  cpu_pause() seemed to help when I used
it to make the busy-waiting for DEVICE_POLLING in the idle loop less
harmful.  Perhaps it works better in i/o loops.  That might help the
console drivers' i/o loops too.  But userland tests show that doing a
slow ISA i/o is cool enough by itself (perhaps a little cooler than the
dumb spinloop, and then adding the pause makes little difference).  It
seems reasonable to expect the CPU to mostly shut itself down when
waiting 5000+ cycles for a slow i/o if neither is emulated, and to blame
the emulation if it uses too many or too few cycles to emulate this.

The normal idle spinloop must be doing something stupid to run so much
hotter.  It calls sched_runnable() in a loop.  This obviously uses more
CPU resources than "foo: jmp foo", but in other tests I never saw much
difference caused by the instruction mix.  Apparently sched_runnable()
uses lots of CPU resources, and pausing probably makes little difference.
Normal idle needs to wake up fast, so it needs to check a lot, but even
the function call for this is wasteful.  Stopped CPUs don't need to
restart so fast (except for faster tracing in ddb -- it is already about
1000 times too slow).

Bruce
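
PS: some sketches of the loops discussed above.  These are illustrations
of what I described, not tested code.

The userland loops are just infinite loops with and without a pause
instruction, e.g. in C with inline asm:

	/* foo: pause; jmp foo */
	int
	main(void)
	{
		for (;;)
			__asm __volatile("pause");
	}

Delete the "pause" to get the plain foo: jmp foo variant.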
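
The halt-until-NMI idea for stopped CPUs might look like the following,
assuming the existing stopped_cpus/started_cpus masks and an NMI handler
that does no normal work on a stopped CPU.  The function name is made up
and the kdb and context-saving details are glossed over:

	#include <sys/param.h>
	#include <sys/pcpu.h>
	#include <sys/smp.h>

	/* Stopped CPU: halt with interrupts disabled instead of spinning. */
	void
	cpustop_halt_sketch(void)
	{
		int cpu = PCPU_GET(cpuid);

		CPU_SET_ATOMIC(cpu, &stopped_cpus);
		/* Only an NMI kicks us out of the hlt; then re-check. */
		while (!CPU_ISSET(cpu, &started_cpus))
			__asm __volatile("hlt");
		CPU_CLR_ATOMIC(cpu, &started_cpus);
		CPU_CLR_ATOMIC(cpu, &stopped_cpus);
	}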
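
Using cpu_pause() in a console driver's input spinloop is a 1-line change
to the usual pattern.  The driver and its check-for-input function here
are invented; every driver has its own:

	/* Polled console input, with a pause hint in the spinloop. */
	static int
	xxcngetc(struct consdev *cp)
	{
		int c;

		while ((c = xxcncheckc(cp)) == -1)
			cpu_pause();	/* was: spin with no hint */
		return (c);
	}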
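
Finally, the shape of the normal idle spinloop that runs so hot.  Modulo
the state handshaking in the real code, it is essentially:

	/* Spin idle: poll the scheduler until there is work to do. */
	static void
	cpu_idle_spin_sketch(int busy)
	{
		for (;;) {
			if (sched_runnable())
				return;
			cpu_pause();	/* helps little here, as measured above */
		}
	}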