Date:      Mon, 06 Apr 2015 15:18:23 -0400
From:      John Baldwin <jhb@freebsd.org>
To:        freebsd-arch@freebsd.org
Cc:        Adrian Chadd <adrian@freebsd.org>
Subject:   Re: x86: finding interrupts that aren't being accounted for?
Message-ID:  <1858440.dQ4AvDcZf7@ralph.baldwin.cx>
In-Reply-To: <CAJ-Vmok_6SK%2BuwvBsw8bqxOPSHnMbXPiJNBSjHJr3rkqFnPpXg@mail.gmail.com>
References:  <CAJ-Vmok_6SK%2BuwvBsw8bqxOPSHnMbXPiJNBSjHJr3rkqFnPpXg@mail.gmail.com>

On Monday, April 06, 2015 12:21:29 AM Adrian Chadd wrote:
> Hi,
> 
> I have an .. odd problem on a Lenovo X230.
> 
> I just threw in a very old wifi card (Intel 3945) into the expresscard
> (pcie) slot. Now, we don't have any pcie-hp support in -HEAD just yet,
> but i wasn't expecting the system to crawl to a halt.
> 
> When I unplug it, everything returns to normal.
> 
> Other cards don't do this.
> 
> So, I figured it may be interrupt spam - but vmstat -ia shows no
> interrupts going crazy.
> 
> pmcstat -S CPU_CLK_UNHALTED_CORE -T -w 5 doesn't register anything
> either - only a handful of background samples.
> 
> However, /counter/ mode pmc tells a different story - pmcstat -s
> CPU_CLK_UNHALTED_CORE -w 1 shows all four cores going at 110% when the
> card is inserted, with brief periods of idle. Once I remove the card,
> the counters go back down to zero.
> 
> My working theory is: something is chewing CPU and it's likely
> interrupts, but if it is, it's something far, far earlier than the x86
> interrupt C code, which counts interrupts and spurious events.
> 
> So - has anyone diagnosed this stuff on FreeBSD/x86 before? I was kind
> of hoping we'd at least get accurate statistics about spurious
> interrupts, and if we don't, I'd like to understand why.
> 
> Thanks!

SMM (System Management Mode)?  Perhaps SMM doesn't hide itself from PMC
counters in counting mode (but it can hide itself from sampling mode).

If it is SMM there's not really anything you can do about it.  Try getting a
KTR_SCHED trace and looking at it in schedgraph.  When I've seen SMM issues in
the past they show up as a hole in the graph where nothing happens in the system.
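
To make the "hole in the graph" idea concrete, here is a minimal sketch that
looks for gaps in a list of event timestamps.  It assumes the timestamps have
already been pulled out of the trace by some other means; the function name
find_gaps and the threshold parameter are hypothetical, and nothing here
parses the actual KTR dump format.

```python
def find_gaps(timestamps, threshold):
    """Return (start, end) pairs where consecutive events are farther apart
    than `threshold`.

    `timestamps` is any iterable of numbers in the same unit as `threshold`.
    Both names are illustrative assumptions, not part of KTR or schedgraph.
    """
    ordered = sorted(timestamps)
    # Pair each event with its successor and keep the pairs that are too far apart.
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > threshold]


if __name__ == "__main__":
    # Made-up sample data: steady events, then a long silence between 2 and 10.
    print(find_gaps([0, 1, 2, 10, 11], threshold=5))
```

A gap that appears on every CPU at the same time would be consistent with the
whole system being stalled, which is what an SMM excursion would look like.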

In your case you could perhaps be getting PCI errors that are triggering the
SMM handler.  Perhaps compare pciconf -le before and after to see if there are
any changes.
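
One way to do the before/after comparison is to save the output of each run to
a text file and diff the two.  The sketch below assumes the two captures are
already available as strings; the function name changed_lines is hypothetical,
and the sample text in the demo is made up rather than real pciconf output.

```python
import difflib


def changed_lines(before, after):
    """Return only the added ('+') and removed ('-') lines between two captures.

    `before` and `after` are the full text of each capture.  This is plain
    text diffing; it does not interpret the contents in any way.
    """
    diff = list(difflib.unified_diff(
        before.splitlines(), after.splitlines(),
        fromfile="before", tofile="after", lineterm="", n=0))
    # Skip the two file-header lines, then drop the "@@" hunk markers.
    return [line for line in diff[2:] if line.startswith(("+", "-"))]


if __name__ == "__main__":
    # Placeholder captures, not real device output.
    before = "device-a: no errors\ndevice-b: no errors\n"
    after = "device-a: no errors\ndevice-b: error reported\n"
    for line in changed_lines(before, after):
        print(line)
```

The captures could be produced with something like "pciconf -le > before.txt"
with the card removed and "pciconf -le > after.txt" with it inserted, then read
into the two strings.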

-- 
John Baldwin


