Date: Mon, 06 Apr 2015 14:15:13 -0700
From: Rui Paulo <rpaulo@me.com>
To: Adrian Chadd <adrian@FreeBSD.org>
Cc: "freebsd-arch@freebsd.org" <freebsd-arch@freebsd.org>
Subject: Re: x86: finding interrupts that aren't being accounted for?
Message-ID: <CB014B57-0D75-4ED7-A7EF-871227C3121C@me.com>
In-Reply-To: <CAJ-VmonnQKHYaP4aAxbzRGxV3tZF8JVH2FTMp5jehEX2Huvp_g@mail.gmail.com>
References: <CAJ-Vmok_6SK%2BuwvBsw8bqxOPSHnMbXPiJNBSjHJr3rkqFnPpXg@mail.gmail.com> <1858440.dQ4AvDcZf7@ralph.baldwin.cx> <CAJ-VmonnQKHYaP4aAxbzRGxV3tZF8JVH2FTMp5jehEX2Huvp_g@mail.gmail.com>

> On Apr 6, 2015, at 13:38, Adrian Chadd <adrian@FreeBSD.org> wrote:
>
> On 6 April 2015 at 12:18, John Baldwin <jhb@freebsd.org> wrote:
>> On Monday, April 06, 2015 12:21:29 AM Adrian Chadd wrote:
>>> Hi,
>>>
>>> I have an .. odd problem on a Lenovo X230.
>>>
>>> I just threw in a very old wifi card (Intel 3945) into the expresscard
>>> (pcie) slot. Now, we don't have any pcie-hp support in -HEAD just yet,
>>> but I wasn't expecting the system to crawl to a halt.
>>>
>>> When I unplug it, everything returns to normal.
>>>
>>> Other cards don't do this.
>>>
>>> So, I figured it may be interrupt spam - but vmstat -ia shows no
>>> interrupts going crazy.
>>>
>>> pmcstat -S CPU_CLK_UNHALTED_CORE -T -w 5 doesn't register anything
>>> either - only a handful of background samples.
>>>
>>> However, /counter/ mode pmc tells a different story - pmcstat -s
>>> CPU_CLK_UNHALTED_CORE -w 1 shows all four cores going at 110% when the
>>> card is inserted, with brief periods of idle. Once I remove the card,
>>> the counters go back down to zero.
>>>
>>> My working theory is: something is chewing CPU and it's likely
>>> interrupts, but if it is, it's something far, far earlier than the x86
>>> interrupt C code, which counts interrupts and spurious events.
>>>
>>> So - has anyone diagnosed this stuff on FreeBSD/x86 before? I was kind
>>> of hoping we'd at least get accurate statistics about spurious
>>> interrupts, and if we don't, I'd like to understand why.
>>>
>>> Thanks!
>>
>> SMM? Perhaps SMM doesn't hide itself from PMC counters (but it can hide itself
>> from samples).
>>
>> If it is SMM there's not really anything you can do about it. Try getting a
>> KTR_SCHED trace and looking at it in schedgraph. When I've seen SMM issues in
>> the past it shows up as a hole in the graph where nothing happens in the system.
>>
>> In your case you could perhaps be getting PCI errors that are triggering the
>> SMM handler. Perhaps compare pciconf -le before and after to see if there are
>> any changes.
>
> Hm, ok. Can we extract PCIe errors yet?

Yes, check pciconf.

--
Rui Paulo
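
The before/after comparison of `pciconf -le` suggested in the thread could be scripted roughly as follows. This is a minimal sketch and not something posted by anyone in the thread: the helper names `capture_pci_errors` and `changed_lines` are hypothetical, Python is used only for illustration, and it assumes `pciconf -le` is run on the affected machine with sufficient privileges.

```python
import difflib
import subprocess


def capture_pci_errors():
    """Run `pciconf -le` and return its output as a list of lines."""
    result = subprocess.run(
        ["pciconf", "-le"], capture_output=True, text=True, check=True
    )
    return result.stdout.splitlines()


def changed_lines(before, after):
    """Return only the lines removed ("- ") or added ("+ ") between captures."""
    return [
        line
        for line in difflib.ndiff(before, after)
        if line.startswith(("- ", "+ "))
    ]


if __name__ == "__main__":
    before = capture_pci_errors()
    input("Insert the card, then press Enter... ")
    after = capture_pci_errors()
    for line in changed_lines(before, after):
        print(line)
```

An empty result would suggest that no error status changed between the two captures; any printed lines point at the devices worth a closer look.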