From: John Baldwin
To: Robert Noland
Cc: freebsd-current@freebsd.org, Sergey G Nasonov
Date: Mon, 2 Mar 2009 11:15:20 -0500
Subject: Re: Interrupt stom on cardbus device
Message-Id: <200903021115.21246.jhb@freebsd.org>
In-Reply-To: <1235784959.1289.67.camel@widget.2hip.net>
References: <200902271730.07660.snasonov@bcc.ru> <200902271502.37051.jhb@freebsd.org> <1235784959.1289.67.camel@widget.2hip.net>
List-Id: Discussions about the use of FreeBSD-current

On Friday 27 February 2009 8:35:59 pm Robert Noland wrote:
> On Fri, 2009-02-27 at 15:02 -0500, John Baldwin wrote:
> > On Friday 27 February 2009 2:11:04 pm Robert Noland wrote:
> > > On Fri, 2009-02-27 at 14:03 -0500, John Baldwin wrote:
> > > > On Friday 27 February 2009 1:50:28 pm Robert Noland wrote:
> > > > > On Fri, 2009-02-27 at 12:08 -0500, John Baldwin wrote:
> > > > > > On Friday 27 February 2009 9:30:06 am Sergey G Nasonov wrote:
> > > > > > > Hello all,
> > > > > > > I have run into an issue after a recent kernel recompile.
> > > > > > > The problem appears after switching from X to a text console and back to X11.
> > > > > > > After that, vmstat -i shows an interrupt storm on the cardbus device:
> > > > > > >
> > > > > > > > vmstat -i
> > > > > > > interrupt                   total       rate
> > > > > > > irq1: atkbd0                 6483          3
> > > > > > > irq9: acpi0                  3236          1
> > > > > > > irq12: psm0                347988        167
> > > > > > > irq14: ata0                 16431          7
> > > > > > > irq16: cbb0 uhci2+       13624982       6556
> > > > > > > irq20: uhci0                   14          0
> > > > > > > irq22: ehci0                    2          0
> > > > > > > cpu0: timer               4154687       1999
> > > > > > > irq256: em0                 53736         25
> > > > > > > irq257: hdac0                5797          2
> > > > > > > cpu1: timer               4153683       1998
> > > > > > > irq258: vgapci0            235585        113
> > > > > > > Total                    22602624      10877
> > > > > > >
> > > > > > > I suppose the issue is related to the latest MSI interrupt
> > > > > > > handler changes for the Intel graphics chipset. My laptop has an i965GM.
> > > > > > > pciconf -lv:
> > > > > > >
> > > > > > > vgapci0@pci0:0:2:0: class=0x030000 card=0x20b517aa chip=0x2a028086 rev=0x0c hdr=0x00
> > > > > > >     vendor     = 'Intel Corporation'
> > > > > > >     device     = 'Mobile 965 Express Integrated Graphics Controller'
> > > > > > >     class      = display
> > > > > > >     subclass   = VGA
> > > > > > >
> > > > > > > When I added my device to drm_msi_blacklist and recompiled the drm modules,
> > > > > > > the problem disappeared.
> > > > > > > Is it possible to resolve this problem without moving the device to the
> > > > > > > drm_msi_blacklist?
> > > > > > > I can test any patches or provide additional detail if it is required.
> > > > > > > Thanks.
> > > > > >
> > > > > > It seems the device is still interrupting on its INTx line, perhaps in
> > > > > > addition to the MSI interrupts.
> > > > >
> > > > > Hrm, I did most all of that development on a 965gm. When you VT switch,
> > > > > the irq handler gets uninstalled and reinstalled when you return to X.
> > > > > There was an erratum on the 965gm suggesting that msi didn't work right,
> > > > > but I was never able to reproduce the issue. Intel was having major
> > > > > issues with this on linux and I finally convinced them to turn msi back
> > > > > on. My irq handler and Eric's are very similar, so I'm not sure what
> > > > > could be going on here.
> > > > >
> > > > > There is however an issue with vblanks that might be related. Could you
> > > > > try http://people.freebsd.org/~rnoland/drm-move_vblank_init.patch and
> > > > > see if that helps?
> > > >
> > > > In this case the issue isn't that MSI isn't working I think, but that the
> > > > hardware is sending interrupts via both routes (MSI and INTx). If that
> > > > happens, then you will see an interrupt storm on the INTx line, but FreeBSD
> > > > will only notice if another device is sharing the same IRQ line.
> > > > So if your test machine has vgapci0 on irq 22 and you have no other devices
> > > > on IRQ 22, then the storm would go unnoticed. This is most likely a chip bug
> > > > (unless the driver has to explicitly disable INTx interrupts when using MSI).
> > > > It would probably be a good idea to add a hw.drm.msi_enable tunable (or
> > > > hw.drm.msi) that people can use to disable MSI perhaps.
> > >
> > > Ok, I do have docs on the 965, so I'll look at this. The linux version
> > > does not do this, unless the OS does it in the background somewhere.
>
> Ok, so I looked over the 965 docs again and noticed PCIR_COMMAND bit 10.
> Then I pulled up the AMD docs on their PCIE cards and they also have
> this bit. I made a test patch for just the i915 driver to ensure that
> this fixes the issue, but it seems like a more general fix is in order.
> I'm proposing to disable INTx when we set up MSI/MSIX interrupts. I
> talked with scottl@ about this a bit last night and this seems like the
> right thing to do, or at least it shouldn't hurt much...
>
> John, what do you think of the attached patch?

Looks good and is something that was on my low-priority todo list. :)

--
John Baldwin
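
For readers following along: the "PCIR_COMMAND bit 10" mentioned above is the
Interrupt Disable bit of the PCI command register (config-space offset 0x04).
The attached patch is not reproduced in this message, so the following is only
a minimal sketch of the bit manipulation involved, not the actual change. The
macro and function names are hypothetical and chosen for illustration; in a
FreeBSD driver the value would presumably be read and written with
pci_read_config() and pci_write_config() on PCIR_COMMAND.

```c
#include <stdint.h>

/*
 * Illustrative names only; these are not taken from the patch in the thread.
 * The command register lives at config-space offset 0x04, and bit 10 is the
 * Interrupt Disable bit, which masks legacy INTx assertion.
 */
#define CMD_REG_OFFSET   0x04
#define CMD_INTX_DISABLE (1u << 10)   /* 0x0400 */

/* Return the command-register value with legacy INTx masked. */
static uint16_t
cmd_disable_intx(uint16_t cmd)
{
	return (uint16_t)(cmd | CMD_INTX_DISABLE);
}

/* Return the command-register value with legacy INTx allowed again. */
static uint16_t
cmd_enable_intx(uint16_t cmd)
{
	return (uint16_t)(cmd & ~CMD_INTX_DISABLE);
}

/* Nonzero if the Interrupt Disable bit is set in the given value. */
static int
cmd_intx_disabled(uint16_t cmd)
{
	return (cmd & CMD_INTX_DISABLE) != 0;
}
```

Under this sketch, a driver would set the bit once MSI/MSI-X allocation
succeeds and clear it again when falling back to a legacy interrupt, so the
device never signals on both routes at the same time.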