Date: Mon, 6 Apr 2015 14:16:23 -0700
Subject: Re: x86: finding interrupts that aren't being accounted for?
From: Adrian Chadd
To: Rui Paulo
Cc: "freebsd-arch@freebsd.org"

On 6 April 2015 at 14:15, Rui Paulo wrote:
>
>> On Apr 6, 2015, at 13:38, Adrian Chadd wrote:
>>
>> On 6 April 2015 at 12:18, John Baldwin wrote:
>>> On Monday, April 06, 2015 12:21:29 AM Adrian Chadd wrote:
>>>> Hi,
>>>>
>>>> I have an .. odd problem on a Lenovo X230.
>>>>
>>>> I just threw in a very old wifi card (Intel 3945) into the expresscard
>>>> (pcie) slot. Now, we don't have any pcie-hp support in -HEAD just yet,
>>>> but I wasn't expecting the system to crawl to a halt.
>>>>
>>>> When I unplug it, everything returns to normal.
>>>>
>>>> Other cards don't do this.
>>>>
>>>> So, I figured it may be interrupt spam - but vmstat -ia shows no
>>>> interrupts going crazy.
>>>>
>>>> pmcstat -S CPU_CLK_UNHALTED_CORE -T -w 5 doesn't register anything
>>>> either - only a handful of background samples.
>>>>
>>>> However, /counter/ mode pmc tells a different story - pmcstat -s
>>>> CPU_CLK_UNHALTED_CORE -w 1 shows all four cores going at 110% when the
>>>> card is inserted, with brief periods of idle. Once I remove the card,
>>>> the counters go back down to zero.
>>>>
>>>> My working theory is: something is chewing CPU and it's likely
>>>> interrupts, but if it is, it's something far, far earlier than the x86
>>>> interrupt C code, which counts interrupts and spurious events.
>>>>
>>>> So - has anyone diagnosed this stuff on FreeBSD/x86 before? I was kind
>>>> of hoping we'd at least get accurate statistics about spurious
>>>> interrupts, and if we don't, I'd like to understand why.
>>>>
>>>> Thanks!
>>>
>>> SMM? Perhaps SMM doesn't hide itself from PMC counters (but it can hide itself
>>> from samples).
>>>
>>> If it is SMM there's not really anything you can do about it. Try getting a
>>> KTR_SCHED trace and looking at it in schedgraph. When I've seen SMM issues in
>>> the past it shows up as a hole in the graph where nothing happens in the system.
>>>
>>> In your case you could perhaps be getting PCI errors that are triggering the
>>> SMM handler. Perhaps compare pciconf -le before and after to see if there are
>>> any changes.
>>
>> Hm, ok. Can we extract PCIe errors yet?
>
> Yes, check pciconf.

I'll try, but the system is pretty unusable whilst the card is plugged in...

Thanks!


-a
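
For reference, the before/after pciconf comparison suggested in the quoted text could be scripted roughly as below. This is only a sketch: the helper names (snapshot, compare_snapshots), the LIST_CMD variable and the /tmp paths are made up for illustration, and it assumes `pciconf -le` (list devices along with any errors they report) is available and run with sufficient privileges.

```shell
#!/bin/sh
# Hypothetical helpers, not part of any FreeBSD tool.
# LIST_CMD is the command whose output gets snapshotted; it defaults to
# "pciconf -le" but can be overridden for other listings.
LIST_CMD="${LIST_CMD:-pciconf -le}"

snapshot() {
    # $1 = file to write the current listing to
    $LIST_CMD > "$1" 2>&1
}

compare_snapshots() {
    # $1 = "before" file, $2 = "after" file
    # Prints a unified diff, or a short note when the two files are identical.
    if diff -u "$1" "$2"; then
        echo "no change between snapshots"
    fi
}

# Intended use (paths are placeholders):
#   snapshot /tmp/pciconf.before      # with the card removed
#   snapshot /tmp/pciconf.after       # shortly after inserting the card
#   compare_snapshots /tmp/pciconf.before /tmp/pciconf.after
```

Taking the second snapshot right after inserting the card and diffing later, once the card is out again, avoids having to do the comparison while the machine is crawling.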