From owner-freebsd-stable@FreeBSD.ORG  Sun Jul 20 22:58:48 2014
Return-Path: <owner-freebsd-stable@FreeBSD.ORG>
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 53DFDA1E
 for <freebsd-stable@freebsd.org>; Sun, 20 Jul 2014 22:58:48 +0000 (UTC)
Received: from qmta07.emeryville.ca.mail.comcast.net
 (qmta07.emeryville.ca.mail.comcast.net [IPv6:2001:558:fe2d:43:76:96:30:64])
 by mx1.freebsd.org (Postfix) with ESMTP id 3018F2614
 for <freebsd-stable@freebsd.org>; Sun, 20 Jul 2014 22:58:47 +0000 (UTC)
Received: from omta15.emeryville.ca.mail.comcast.net ([76.96.30.71])
 by qmta07.emeryville.ca.mail.comcast.net with comcast
 id Umw91o0021Y3wxoA7mymdq; Sun, 20 Jul 2014 22:58:46 +0000
Received: from jdc.koitsu.org ([69.181.136.108])
 by omta15.emeryville.ca.mail.comcast.net with comcast
 id Umyl1o00S2LW5AV8bmymts; Sun, 20 Jul 2014 22:58:46 +0000
Received: by icarus.home.lan (Postfix, from userid 1000)
 id 96DEF1744B1; Sun, 20 Jul 2014 15:58:45 -0700 (PDT)
Date: Sun, 20 Jul 2014 15:58:45 -0700
From: Jeremy Chadwick <jdc@koitsu.org>
To: Adrian Chadd <adrian@freebsd.org>
Subject: Re: Consistently "high" CPU load on 10.0-STABLE
Message-ID: <20140720225845.GA81033@icarus.home.lan>
References: <20140720062413.GA56318@icarus.home.lan>
 <97EA8E571E634DBBAA70F7AA7F0DE97C@multiplay.co.uk>
 <20140720173524.GA67065@icarus.home.lan>
 <ED826825202341E58B71A3F718B60562@multiplay.co.uk>
 <20140720201655.GA70545@icarus.home.lan>
 <CAJ-Vmo=O-OH-Ljk1u-SGF1L=jj=Lnwd_aQs0Bqc5Mi03jZvkuw@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <CAJ-Vmo=O-OH-Ljk1u-SGF1L=jj=Lnwd_aQs0Bqc5Mi03jZvkuw@mail.gmail.com>
User-Agent: Mutt/1.5.23 (2014-03-12)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=comcast.net;
 s=q20140121; t=1405897126;
 bh=6OvcBrZcTuLSSnoXk/Sy2U/0vv/kTmsDG2VLrwkxF0g=;
 h=Received:Received:Received:Date:From:To:Subject:Message-ID:
 MIME-Version:Content-Type;
 b=an+V8eNWyPO6iN7R7cxlsHO50DnP3psw7nDqa46A6PC5Ae9cR6UlBIt4nry4JhnRV
 KNziyp/sQkgygGdYGTxVMST0aeY6Kzj7AD0coZUVOJ8xb2jg1igEPQFaCAh6uQ8NBw
 beBr749VIQcN9wJNACErEPA4b2gZ0UaIABp9JcFhEF3d/ozXHyHTmeputTfuQWKzZB
 UZSKOg4EbUb3JcZImTMZxKZ/i4i8iboCMir47TN0Z+nS9GeRlXdQbDI7gv9PHvipKt
 Mj8S9ouvKsFOoUgOTklgUcQFjipPHg6LYWVW1HRP0BR9BBq39hmOPPaXB7gIP5kePG
 cqOVQmRU115ZA==
Cc: Steven Hartland <killing@multiplay.co.uk>,
 FreeBSD Stable Mailing List <freebsd-stable@freebsd.org>
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: Production branch of FreeBSD source code <freebsd-stable.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-stable>,
 <mailto:freebsd-stable-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-stable/>
List-Post: <mailto:freebsd-stable@freebsd.org>
List-Help: <mailto:freebsd-stable-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>,
 <mailto:freebsd-stable-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 20 Jul 2014 22:58:48 -0000

On Sun, Jul 20, 2014 at 03:09:55PM -0700, Adrian Chadd wrote:
> hi,
> 
> it looks like a whole lot of things are waking up at the same time:
> 
> * dhcpd
> * em
> * usb devices
> 
> So, do you have some shared interrupts going on here? That seems to be
> what's causing things to all wake up all at once.

I forget how to get an interrupt mapping from the I/O APIC, but dmesg
indicates the following.  Sorted by IRQ order so that you can tell
what's associated with what, and also RELENG_9 vs. RELENG_10 (because I
do have an old dmesg.today from this box running RELENG_9).  All the
IRQs match up:

dev       RELENG_10      RELENG_9
--------  -------------  -------------
ioapic0   IRQs  0 to 23  IRQs  0 to 23 (same)
ioapic1   IRQs 24 to 47  IRQs 24 to 47 (same)
attimer0  IRQ 0          IRQ 0 (same)
atkbdc0   IRQ 1          IRQ 1 (same)
atkbd0    IRQ 1          IRQ 1 (same)
uart1     IRQ 3          IRQ 3 (same)
uart0     IRQ 4          IRQ 4 (same)
atrtc0    IRQ 8          IRQ 8 (same)
em0       IRQ 16         IRQ 16 (same)
pcib1     IRQ 16         IRQ 16 (same)
pcib3     IRQ 16         IRQ 16 (same)
pcib4     IRQ 16         IRQ 16 (same)
uhci0     IRQ 16         IRQ 16 (same)
ahci0     IRQ 17         IRQ 17 (same)
em1       IRQ 17         IRQ 17 (same)
ichsmb0   IRQ 17         IRQ 17 (same)
pcib5     IRQ 17         IRQ 17 (same)
uhci1     IRQ 17         IRQ 17 (same)
ehci0     IRQ 18         IRQ 18 (same)
uhci2     IRQ 18         IRQ 18 (same)
uhci5     IRQ 18         IRQ 18 (same)
siis0     IRQ 21         IRQ 21 (same)
uhci4     IRQ 22         IRQ 22 (same)
ehci1     IRQ 23         IRQ 23 (same)
uhci3     IRQ 23         IRQ 23 (same)

And the higher-numbered IRQs per vmstat -i.  I only have this for
RELENG_10 however:

irq256: em0                      1848856         26
irq259: ahci0:ch0                 273086          3
irq260: ahci0:ch1                   9990          0
irq261: ahci0:ch2                  48514          0
irq262: ahci0:ch3                  48046          0
irq263: ahci0:ch4                  48258          0
irq264: ahci0:ch5                  48052          0

vmstat -i for this is kinda painful (discussed this with jhb@ in the
past, re: kernel just appending "+" to the string to indicate "many
things using this IRQ").

I have absolute no USB devices attached to the system (meaning there are
USB controllers and ports, yeah, but nothing attached to any of them).
The keyboard is PS/2.  All disks are on ahci0 (no disks currently
attached to siis0).

As for dhcpd: I don't know how that'd be responsible.  If I stop the
process entirely I still see the problem.

I can provide some more ktrdumps, along with turning off as many daemons
+ cron jobs as I can, if you feel that'd be helpful.

Likewise I can provide an ACPI DSDT dump if that would be useful (maybe
to someone else).

I haven't tried booting the box in single-user and letting it sit there
to see if anything shows up there.

In the interim I wrote the perl script I mentioned in my mail to Steve.
When the load shoots up, there is literally no field in "vmstat -s"
that shows a humongous increase (or decrease) consistently.  Meaning
I'd say 95% of the time when there's a sudden load jump, none of those
statistics I can correlate with it.  It's a pretty "meh" script, but
it does the job of showing deltas between vmstat -s runs and indicating
visually when there's a jump in load average (1m avg).  It requires a
VERY wide terminal (about 301 characters):

http://jdc.koitsu.org/freebsd/releng10_perf_issue/load_vmstat.pl

Some example output is here (obviously can't see the red+bold
highlighting of the line):

http://jdc.koitsu.org/freebsd/releng10_perf_issue/example_data.txt

Load jumps at the following time indexes:

124.0  (from 0.02 to 0.10, load delta: 0.08)
153.0  (from 0.06 to 0.14, load delta: 0.08, time delta: 29.0 sec)
178.5  (from 0.10 to 0.17, load delta: 0.07, time delta: 25.5 sec)
217.0  (from 0.09 to 0.17, load delta: 0.08, time delta: 38.5 sec)
236.0  (from 0.12 to 0.19, load delta: 0.07, time delta: 19.0 sec)
244.0  (from 0.17 to 0.24, load delta: 0.07, time delta:  8.0 sec)
259.0  (from 0.20 to 0.27, load delta: 0.07, time delta: 15.0 sec)
284.5  (from 0.19 to 0.25, load delta: 0.06, time delta: 25.5 sec)
310.0  (from 0.18 to 0.25, load delta: 0.07, time delta: 25.5 sec)
341.5  (from 0.27 to 0.33, load delta: 0.06, time delta: 31.5 sec)

Some of these could be due to cron jobs I run (though they really aren't
that intensive on disk, CPU, or memory), but there's a pretty consistent
pattern going on there load-wise.  The reason noted time deltas was
watching for "periodic tasks", e.g. ZFS txg flush.  But this seems to
have a little bit more variance.

It's just that none of the vmstat -s statistics change rapidly alongside
the load.  But I'm sure there are VM bits that aren't tracked in vmstat.

-- 
| Jeremy Chadwick                                   jdc@koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Making life hard for others since 1977.             PGP 4BD6C0CB |