FreeBSD Mail Archives

Date:      Wed, 19 Jul 2006 17:11:11 +0100
From:      Gareth McCaughan <gmccaughan@synaptics-uk.com>
To:        John Baldwin <jhb@freebsd.org>
Cc:        freebsd-hackers@freebsd.org
Subject:   Re: "swiN: clock sio" process taking 75% CPU
Message-ID:  <200607191711.11966.gmccaughan@synaptics-uk.com>
In-Reply-To: <200607181441.26875.jhb@freebsd.org>
References:  <200607181317.33416.gmccaughan@synaptics-uk.com> <200607181804.44813.gmccaughan@synaptics-uk.com> <200607181441.26875.jhb@freebsd.org>

On Tuesday 2006-07-18 19:41, John Baldwin wrote:

> On Tuesday 18 July 2006 13:04, Gareth McCaughan wrote:
> > On Tuesday 2006-07-18 16:54, Deomid Ryabkov wrote:
> > > Gareth McCaughan wrote:
> > > 
> > > > About 6 minutes after booting (on three occasions, but I
> > > > don't guarantee this doesn't vary), a process (well, a
> > > > kernel interrupt thread, I guess) that appears in the
> > > > output of "ps" as "[swi4: clock sio]" begins to use
> > > > about 3/4 of the machine's CPU.
...
> In that case, something is scheduling a lot of timeouts (via callout_reset() 
> or timeout()) or you have timeout handlers that are taking a very long time 
> to run.  There aren't any easy ways to debug this. :-P  You can try turning 
> on the DIAGNOSTIC check in kern_timeout.c to catch long-running timeouts, and 
> you can try adding some KTR traces to softclock() to see which timeout 
> functions are running and try to do some analysis on that.

Thanks for the excellent advice!

So, I turned on DIAGNOSTIC. That produced only two messages,
both implicating what turns out to be the scrn_timer function
from syscons.c. Once it took about 10ms, once about 100ms.
There may, for all I know, have been an awful lot of 10ms-ish
ones, since the threshold doubles on each report. (I'll check,
but not today. If the answer is that the DIAGNOSTIC check just
happened to catch a couple of freakishly long times, then we've
found a situation where timeout functions can take too much
CPU despite no single one being bad enough to be noticed; in
that case, it might be worth enhancing the code in softclock()
a bit.)
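
To illustrate the concern above, here is a minimal userland sketch in C. It is
not the code from kern_timeout.c. It assumes one reading of "the threshold
doubles on each report": after a report, a later call is reported only if it
takes more than twice the duration just reported. The names diag_state,
diag_record and simulate_load, and the 2 ms starting threshold, are all
hypothetical. The total_us and calls fields stand in for the kind of cumulative
accounting that an enhanced check might keep.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical model of a "report expensive timeouts" check. */
struct diag_state {
    uint64_t threshold_us; /* a call must exceed this to be reported */
    unsigned reports;      /* number of messages printed so far */
    unsigned calls;        /* extra accounting: number of calls seen */
    uint64_t total_us;     /* extra accounting: cumulative handler time */
};

/* Record one handler invocation that took elapsed_us microseconds. */
static void
diag_record(struct diag_state *st, const char *name, uint64_t elapsed_us)
{
    assert(st != NULL);
    st->calls++;
    st->total_us += elapsed_us;
    if (elapsed_us > st->threshold_us) {
        printf("expensive timeout: %s took %llu us\n",
            name, (unsigned long long)elapsed_us);
        st->reports++;
        /* Assumed rule: raise the bar to twice the reported duration. */
        st->threshold_us = elapsed_us * 2;
    }
}

/*
 * Feed n calls of base_us each through the check, except that call
 * number spike_index takes spike_us instead.
 */
static struct diag_state
simulate_load(unsigned n, uint64_t base_us, unsigned spike_index,
    uint64_t spike_us)
{
    struct diag_state st = { 2000, 0, 0, 0 }; /* assumed 2 ms start */
    unsigned i;

    for (i = 0; i < n; i++)
        diag_record(&st, "scrn_timer",
            i == spike_index ? spike_us : base_us);
    return st;
}
```

Under these assumptions, simulate_load(1000, 10000, 500, 100000) prints only
two messages (one for a 10 ms call, one for the 100 ms call), yet total_us
ends up a little over ten seconds. A per-call check of this shape can stay
quiet while many moderately slow calls add up, which is what a cumulative
counter would expose.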

After a quick glance at the code for scrn_timer, I tried
disabling the console screen saver (saver="NO" in rc.conf),
and lo! all is now well.

It seems to me that one of two things should be done.

1. If this is considered pilot error: Put a big warning
   somewhere saying that the screen saver makes no attempt
   to avoid eating all your CPU even when the machine is
   heavily loaded, and that it should therefore not be used
   if your machine will ever be used unattended.

2. If not: Find out why the syscons screen saver is taking
   so many cycles on my machine, and find a way to stop it.

I'd be up for putting a bit of work into #2, but if the
consensus is that I was a twit to think that I could use
a machine for real work with the screen saver enabled
then maybe #1 would do almost as well.

(The particular screen saver I turned on was the one called
"warp"; I haven't checked yet whether others have the same
CPU-guzzling effect.)

-- 
g

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200607191711.11966.gmccaughan>
