Subject: Re: x86: finding interrupts that aren't being accounted for?
From: Rui Paulo
Date: Mon, 06 Apr 2015 14:15:13 -0700
To: Adrian Chadd
Cc: "freebsd-arch@freebsd.org"
References: <1858440.dQ4AvDcZf7@ralph.baldwin.cx>

> On Apr 6, 2015, at 13:38, Adrian Chadd wrote:
>
> On 6 April 2015 at 12:18, John Baldwin wrote:
>> On Monday, April 06, 2015 12:21:29 AM Adrian Chadd wrote:
>>> Hi,
>>>
>>> I have an .. odd problem on a Lenovo X230.
>>>
>>> I just threw in a very old wifi card (Intel 3945) into the expresscard
>>> (pcie) slot. Now, we don't have any pcie-hp support in -HEAD just yet,
>>> but i wasn't expecting the system to crawl to a halt.
>>>
>>> When I unplug it, everything returns to normal.
>>>
>>> Other cards don't do this.
>>>
>>> So, I figured it may be interrupt spam - but vmstat -ia shows no
>>> interrupts going crazy.
>>>
>>> pmcstat -S CPU_CLK_UNHALTED_CORE -T -w 5 doesn't register anything
>>> either - only a handful of background samples.
>>>
>>> However, /counter/ mode pmc tells a different story - pmcstat -s
>>> CPU_CLK_UNHALTED_CORE -w 1 shows all four cores going at 110% when the
>>> card is inserted, with brief periods of idle. Once I remove the card,
>>> the counters go back down to zero.
>>>
>>> My working theory is: something is chewing CPU and it's likely
>>> interrupts, but if it is, it's something far, far earlier than the x86
>>> interrupt C code, which counts interrupts and spurious events.
>>>
>>> So - has anyone diagnosed this stuff on FreeBSD/x86 before? I was kind
>>> of hoping we'd at least get accurate statistics about spurious
>>> interrupts, and if we don't, I'd like to understand why.
>>>
>>> Thanks!
>>
>> SMM? Perhaps SMM doesn't hide itself from PMC counters (but it can hide itself
>> from samples).
>>
>> If it is SMM there's not really anything you can do about it. Try getting a
>> KTR_SCHED trace and looking at it in schedgraph. When I've seen SMM issues in
>> the past it shows up as a hole in the graph where nothing happens in the system.
>>
>> In your case you could perhaps be getting PCI errors that are triggering the
>> SMM handler. Perhaps compare pciconf -le before and after to see if there are
>> any changes.
>
> Hm, ok. Can we extract PCIe errors yet?

Yes, check pciconf.

--
Rui Paulo