Date: Wed, 19 Jul 2006 17:11:11 +0100
From: Gareth McCaughan <gmccaughan@synaptics-uk.com>
To: John Baldwin <jhb@freebsd.org>
Cc: freebsd-hackers@freebsd.org
Subject: Re: "swiN: clock sio" process taking 75% CPU
Message-ID: <200607191711.11966.gmccaughan@synaptics-uk.com>
In-Reply-To: <200607181441.26875.jhb@freebsd.org>
References: <200607181317.33416.gmccaughan@synaptics-uk.com> <200607181804.44813.gmccaughan@synaptics-uk.com> <200607181441.26875.jhb@freebsd.org>

On Tuesday 2006-07-18 19:41, John Baldwin wrote:
> On Tuesday 18 July 2006 13:04, Gareth McCaughan wrote:
> > On Tuesday 2006-07-18 16:54, Deomid Ryabkov wrote:
> > > Gareth McCaughan wrote:
> > >
> > > > About 6 minutes after booting (on three occasions, but I
> > > > don't guarantee this doesn't vary), a process (well, a
> > > > kernel interrupt thread, I guess) that appears in the
> > > > output of "ps" as "[swi4: clock sio]" begins to use
> > > > about 3/4 of the machine's CPU.
...
> In that case, something is scheduling a lot of timeouts (via callout_reset()
> or timeout()) or you have timeout handlers that are taking a very long time
> to run. There aren't any easy ways to debug this. :-P You can try turning
> on the DIAGNOSTIC check in kern_timeout.c to catch long-running timeouts, and
> you can try adding some KTR traces to softclock() to see which timeout
> functions are running and try to do some analysis on that.

Thanks for the excellent advice!

So, I turned on DIAGNOSTIC. That produced only two messages, both
implicating what turns out to be the scrn_timer function from
syscons.c. Once it took about 10ms, once about 100ms. There may
for all I know have been an awful lot of 10ms-ish ones, since
the threshold doubles on each report.

(I'll check, but not today. If the answer is that the DIAGNOSTIC
check just happened to catch a couple of freakishly long times,
then we've found a situation where timeout functions can take
too much CPU despite no single one being bad enough to be noticed;
in that case, it might be worth enhancing the code in softclock()
a bit.)

After a quick glance at the code for scrn_timer, I tried disabling
the console screen saver (saver="NO" in rc.conf), and lo! all is
now well.

It seems to me that one of two things should be done.

1. If this is considered pilot error: Put a big warning somewhere
   saying that the screen saver makes no attempt to avoid eating
   all your CPU even when the machine is heavily loaded, and that
   it should therefore not be used if your machine will ever be
   used unattended.

2. If not: Find out why the syscons screen saver is taking so
   many cycles on my machine, and find a way to stop it.

I'd be up for putting a bit of work into #2, but if the consensus
is that I was a twit to think that I could use a machine for real
work with the screen saver enabled then maybe #1 would do almost
as well.

(The particular screen saver I turned on was the one called "warp";
I haven't checked yet whether others have the same CPU-guzzling
effect.)

-- 
g
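
[Illustration of the "threshold doubles on each report" point above.
This is only a rough sketch in userland C, not the code in
kern_timeout.c: the struct, the function name and the idea of a fixed
starting threshold are all assumptions made for the example. It models
a check that reports a handler run only when it exceeds the current
threshold and then doubles that threshold, and it tallies how much
handler time never gets reported.]

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model of a "report, then double the threshold" check.
 * Names and fields are invented for illustration; this is NOT the
 * DIAGNOSTIC code from kern_timeout.c.
 */
struct slow_report_stats {
	unsigned reports;        /* handler runs that triggered a report */
	unsigned long total_us;  /* time spent in all handler runs */
	unsigned long seen_us;   /* time spent in the reported runs only */
};

struct slow_report_stats
simulate_doubling_threshold(const unsigned long *durations_us, size_t n,
    unsigned long initial_threshold_us)
{
	struct slow_report_stats s = { 0, 0, 0 };
	unsigned long threshold_us = initial_threshold_us;
	size_t i;

	assert(durations_us != NULL || n == 0);

	for (i = 0; i < n; i++) {
		s.total_us += durations_us[i];
		if (durations_us[i] > threshold_us) {
			/* This run is reported; later runs must beat 2x. */
			s.reports++;
			s.seen_us += durations_us[i];
			threshold_us *= 2;
		}
	}
	return (s);
}
```

[With an assumed starting threshold of 8000us and ten runs of which
nine take 10000us and one takes 100000us, this model produces only two
reports, while 80000us of the 190000us total goes unreported. That is
the kind of case where no single run looks bad enough to notice but
the sum is large.]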