Date:      Sun, 20 Jul 2014 02:05:00 -0700
From:      Adrian Chadd <adrian@freebsd.org>
To:        Jeremy Chadwick <jdc@koitsu.org>
Cc:        FreeBSD Stable Mailing List <freebsd-stable@freebsd.org>
Subject:   Re: Consistently "high" CPU load on 10.0-STABLE
Message-ID:  <CAJ-Vmok2ktaCwtqt1U6HjZZ5bUZ5CEUXnm-wOECtHHo8Xxx98A@mail.gmail.com>
In-Reply-To: <20140720085438.GA59359@icarus.home.lan>
References:  <20140720062413.GA56318@icarus.home.lan> <CAJ-VmokyYW3Rwxj40nyaOWgVxwBvCGU_4KVY-aZa6GPomYtX0g@mail.gmail.com> <20140720085438.GA59359@icarus.home.lan>

Hi,

You can zoom in to see the specific details for each event that's
being scheduled. The idle/scheduled/running events are differently
coloured.

Look at the periods where your CPUs have more than one task scheduled and
see what's actually being run. Chances are a couple of cascading kernel
threads are being woken up for something, temporarily raising the
instantaneous load to 2 for short durations.

I believe there were some changes in the loadavg calculation between 9
and 10. Schedgraph will let you see exactly what's going on.
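
For reference: the classic BSD load average is an exponentially weighted
moving average of the run-queue length, sampled roughly every 5 seconds.
A hedged sketch of the 1-minute term (the kernel uses fixed-point
constants; these decimals are approximations):

    avg_1min = avg_1min * exp(-5/60) + nrunnable * (1 - exp(-5/60))
            ~= avg_1min * 0.92       + nrunnable * 0.08

So a single extra runnable kernel thread at sampling time bumps the
1-minute figure by roughly 0.08 per sample, which is consistent with the
0.1-0.2 blips you're describing.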



-a


On 20 July 2014 01:54, Jeremy Chadwick <jdc@koitsu.org> wrote:
>
> Okay, so I had to install FreeBSD on a VM (on a different box) to deal
> with Xorg and all that jazz, then did the following on the bare-metal
> system:
>
> - Let it sit idle (barring cronjobs and existing daemons) for about 10
>   minutes
> - sysctl debug.ktr.mask=0
> - Waited 10-15 seconds
> - ktrdump -ctq > ktr.out
> - sysctl debug.ktr.mask=0x20000000  (what the value was originally)
> - scp'd ktr.out to the VM box
> - On VM in X: python /usr/src/tools/sched/schedgraph.py ktr.out 2.8
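>
> For anyone wanting to repeat this, the capture can be scripted.  This is
> just a hedged sketch of the steps above (restore whatever mask value your
> kernel reports beforehand, not necessarily 0x20000000):
>
>   #!/bin/sh
>   orig=$(sysctl -n debug.ktr.mask)   # save the current mask
>   sysctl debug.ktr.mask=0            # freeze tracing
>   sleep 15                           # the 10-15 second wait above
>   ktrdump -ctq > ktr.out             # dump the buffer with timestamps
>   sysctl debug.ktr.mask="$orig"      # resume tracing
>   # then, on a box with X:
>   # python /usr/src/tools/sched/schedgraph.py ktr.out 2.8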
>
> What I end up with is an application that is a bit difficult for me to
> grasp.  It indicates a time span of 256.47 seconds, during which KTR
> collected 247735 events from a total of 83 sources.  I can, of course,
> scroll through all those sources (vertically), but I'm not really sure
> what is being graphed on a per-event basis.
>
> I'm also not sure what the individual graphs represent on the vertical
> scale.  Horizontally they seem to be divided into sections, like blocks
> (possibly of time?), but I can't tell whether that means "how long
> something was running for", or whether it actually correlates with CPU
> load at all.
>
> This is very hard to explain in text, quite honestly, and I can't find a
> single example of how to use this tool anywhere online.  rwatson's site
> has some vague information (which also seems outdated, particularly the
> suggestion to shove the ktrdump output through sort -n).
>
> I've put a screenshot up of the relevant window, specifically the CPU n
> load parts.  Part of me wonders if the repeated "spikes" (especially on
> CPU 0?) are indicators of what I'm experiencing, but I really don't
> know:
>
> http://jdc.koitsu.org/freebsd/releng10_perf_issue/sched01.png
>
> I'm happy to provide the ktr.out if it's of any use, by the way.  My KTR
> kernel configuration settings on the bare metal box are:
>
> options         KTR
> options         KTR_ENTRIES=262144
> options         KTR_COMPILE=(KTR_SCHED)
> options         KTR_MASK=(KTR_SCHED)
>
> Which, based on reading the python script and what's on the web, seems to
> be what's generally desired.
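>
> For anyone checking their own kernel: I believe the same information is
> visible at runtime without a rebuild, e.g.:
>
>   sysctl debug.ktr.mask      # classes currently being traced
>   sysctl debug.ktr.compile   # classes compiled into the kernel
>   sysctl debug.ktr.entries   # size of the trace buffer
>
> though I've only actually verified the first one here.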
>
> I suppose I can try doing things like shutting off all the daemons I
> normally run and then see if the problem goes away (and if so try to
> track it down to a single daemon), but like I said I don't really see
> anything userland-process-wise suddenly start taking up CPU time.
> Whatever it is is "heavy" enough to cause the load to go from 0.03 to
> 0.24 or so within a few seconds and then settle down again.  Part of me
> wonders if it's ZFS (periodic txg flushing or something).
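>
> If it is the txg sync, the blips ought to recur on the ZFS sync
> interval; a crude, hedged way to check (vfs.zfs.txg.timeout is the
> interval in seconds on my boxes, 5 by default I believe):
>
>   sysctl vfs.zfs.txg.timeout
>   while :; do uptime; sleep 5; done   # watch for blips every ~5s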
>
> Oh, one thing I did find manually: top -S -H -b 999999 on two different
> boxes: the "swapper" thread seems interesting and I'll explain why:
>
> RELENG_10 box:
>
>  1:49AM  up  5:52, 1 user, load averages: 0.32, 0.16, 0.10
>     0 root       -16    0     0K  4912K swapin  0   1:04   0.00% [kernel{swapper}]
>
> RELENG_9 box:
>
>  1:49AM  up 39 days,  8:19, 1 user, load averages: 0.04, 0.06, 0.02
>     0 root       -16    0     0K   160K swapin  0   0:55   0.00% [kernel{swapper}]
>
> I don't know what the "swapper" thread is or what it does (I assume it's
> not related to pagedaemon?), but I find it interesting that the RELENG_10
> system has already used 1:04 of system time (1 minute 4 seconds, I think)
> despite being up less than 6 hours and using no swap, while a RELENG_9
> box that's been up for 39 days and doing far more I/O (receiving mail,
> logging things, running a public web server with logs, etc.) has only
> used 0:55 (and is actually using 104MBytes of swap).  TL;DR -- I wonder
> if it's "swapper" that's doing it.
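>
> A crude way I could test that theory: sample swapper's CPU time
> alongside the load average and see if they move together.  A hedged
> sketch (the thread-name match is approximate):
>
>   while :; do
>       sysctl -n vm.loadavg
>       top -b -H -S | grep 'kernel{swapper}'
>       sleep 10
>   done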
>
> In the schedgraph, swapper shows up in little chunks of 10 second
> durations, but that may be normal.  :/
>
> On Sun, Jul 20, 2014 at 12:13:05AM -0700, Adrian Chadd wrote:
>> Hi,
>>
>> I don't know how to do this with dtrace, but take a look at
>> tools/sched/schedgraph.py and enable KTR to get some trace records.
>>
>> KTR logs the scheduler activity -and- the loadav with it.
>>
>>
>> -a
>>
>>
>> On 19 July 2014 23:24, Jeremy Chadwick <jdc@koitsu.org> wrote:
>> > (Please keep me CC'd as I'm not subscribed to freebsd-stable@)
>> >
>> > Today I took the liberty of upgrading my main home server from
>> > 9.3-STABLE (r268785) to 10.0-STABLE (r268894).  The upgrade consisted of
>> > doing a fresh install of 10.0-STABLE on a brand new, unused SSD.  Almost
>> > everything went as planned, barring a couple of ports-related anomalies,
>> > and I was fairly impressed that buildworld times had dropped to 27
>> > minutes and buildkernel to 4 minutes with clang (something I'd been
>> > avoiding like the plague for a long while).  Kudos.
>> >
>> > But after an hour or so, I noticed a consistent (i.e. reproducible)
>> > trend: the system load average tends to hang around 0.10 to 0.15 all the
>> > time.  There are times where the load drops to 0.03 or 0.04 but then
>> > something kicks it back up to 0.15 or 0.20 and then it slowly levels out
>> > again (over the course of a few minutes) then repeats.
>> >
>> > Obviously this is normal behaviour for a system when something is going
>> > on periodically.  So I figured it might have been a userland process
>> > behaving differently under 10.x than 9.x.  I let top -a -S -s 1 run and
>> > paid very very close attention to it for several minutes.  Nothing.  It
>> > doesn't appear to be something userland -- it appears to be something
>> > kernel-level, but nothing in top -S shows up as taking up any CPU time
>> > other than "[idle]" so I have no idea what might be doing it.
>> >
>> > The box isn't doing anything like routing network traffic/NAT, it's pure
>> > IPv4 (IPv6 disabled in world and kernel, and my home network does
>> > basically no IPv6) and sits idle most of the time fetching mail.  It
>> > does use ZFS, but not for /, swap, /var, /tmp, or /usr.
>> >
>> > vmstat -i doesn't particularly show anything awful.  All the cpuX:timer
>> > entries tend to fluctuate in rate, usually 120-200 or so; I'd expect an
>> > interrupt storm to show something in the 1000+ range.
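>> >
>> > Since the "rate" column is (I believe) just total divided by uptime, it
>> > could hide short bursts; a hedged way to catch those would be to
>> > snapshot the counters and diff:
>> >
>> >   vmstat -i > /tmp/i.1; sleep 10; vmstat -i > /tmp/i.2
>> >   diff /tmp/i.1 /tmp/i.2   # see which totals moved most in 10s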
>> >
>> > The only thing I can think of is the fact that the SSD being used has no
>> > 4K quirk entry in the kernel (and its ATA IDENTIFY responds with 512
>> > logical, 512 physical, even though we know it's 4K).  The partitions are
>> > all 1MB-aligned regardless.
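>> >
>> > For completeness, this is how I'd double-check the geometry and the
>> > alignment (ada0 here is a stand-in for whatever the SSD attaches as):
>> >
>> >   diskinfo -v /dev/ada0   # sectorsize/stripesize as the kernel sees them
>> >   gpart show -p ada0      # partition offsets, to confirm 1MB alignment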
>> >
>> > This is all bare-metal, by the way -- no virtualisation involved.
>> >
>> > I do have DTrace enabled/built on this box but I have absolutely no clue
>> > how to go about profiling things.  For example maybe output of this sort
>> > would be helpful (but I've no idea how to get it):
>> >
>> > http://lists.freebsd.org/pipermail/freebsd-stable/2014-July/079276.html
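>> >
>> > If someone can confirm this is sane, I could try the usual
>> > hotkernel-style idiom (a hedged sketch; samples kernel stacks at
>> > ~997 Hz for 30 seconds):
>> >
>> >   dtrace -x stackframes=100 \
>> >       -n 'profile-997 /arg0/ { @[stack()] = count(); } tick-30s { exit(0); }'
>> >
>> > The most frequent stacks should point at whatever the kernel is doing
>> > during the spikes.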
>> >
>> > I'm certain I didn't see this behaviour in 9.x so I'd be happy to try
>> > and track it down if I had a little bit of hand-holding.
>> >
>> > I've put all the things I can think of that might be relevant to "system
>> > config/tuning bits" up here:
>> >
>> > http://jdc.koitsu.org/freebsd/releng10_perf_issue/
>> >
>> > I should note my kernel config is slightly inaccurate (I've removed some
>> > stuff from the file in an attempt to rebuild, but building world prior
>> > to the kernel failed due to r268896 breaking world; anyone subscribed
>> > here has already seen the Jenkins job for that ;-) ).
>> >
>> > Thanks.
>> >
>> > --
>> > | Jeremy Chadwick                                   jdc@koitsu.org |
>> > | UNIX Systems Administrator                http://jdc.koitsu.org/ |
>> > | Making life hard for others since 1977.             PGP 4BD6C0CB |
>> >
>
> --
> | Jeremy Chadwick                                   jdc@koitsu.org |
> | UNIX Systems Administrator                http://jdc.koitsu.org/ |
> | Making life hard for others since 1977.             PGP 4BD6C0CB |
>


