From owner-freebsd-current@FreeBSD.ORG  Thu May  5 11:23:02 2011
Return-Path: <owner-freebsd-current@FreeBSD.ORG>
Delivered-To: Current@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id D62D0106564A
	for <Current@freebsd.org>; Thu,  5 May 2011 11:23:02 +0000 (UTC)
	(envelope-from Daan@vehosting.nl)
Received: from VM01.VEHosting.nl (VM016.VEHosting.nl
	[IPv6:2001:1af8:2100:b020::140])
	by mx1.freebsd.org (Postfix) with ESMTP id 0F3688FC19
	for <Current@freebsd.org>; Thu,  5 May 2011 11:23:01 +0000 (UTC)
Received: from [192.168.72.11] (124-54.bbned.dsl.internl.net [92.254.54.124])
	(authenticated bits=0)
	by VM01.VEHosting.nl (8.14.3/8.13.8) with ESMTP id p45BN2JC000744;
	Thu, 5 May 2011 13:23:02 +0200 (CEST)
	(envelope-from Daan@vehosting.nl)
From: Daan Vreeken <Daan@vehosting.nl>
Organization: http://VEHosting.nl/
To: Jack Vogel <jfvogel@gmail.com>
Date: Thu, 5 May 2011 13:22:59 +0200
User-Agent: KMail/1.9.10
References: <201105041734.50738.Daan@vehosting.nl>
	<201105050127.26358.Daan@vehosting.nl>
	<BANLkTimnr98dRRE-EPo_pP7MJO9FRUKE4g@mail.gmail.com>
In-Reply-To: <BANLkTimnr98dRRE-EPo_pP7MJO9FRUKE4g@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Message-Id: <201105051322.59654.Daan@vehosting.nl>
x-ve-auth-version: mi-1.1.5 2011-02-07 - Copyright (c) 2008,
	2011 - Daan Vreeken - VEHosting
x-ve-auth: authenticated as 'pa4dan' on VM01.VEHosting.nl
Cc: Current@freebsd.org
Subject: Re: Interrupt storm with MSI in combination with em1
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
	<freebsd-current.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>, 
	<mailto:freebsd-current-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-current>
List-Post: <mailto:freebsd-current@freebsd.org>
List-Help: <mailto:freebsd-current-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>,
	<mailto:freebsd-current-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 05 May 2011 11:23:02 -0000

Hi Jack,

On Thursday 05 May 2011 02:25:39 Jack Vogel wrote:
> OK, but the reason you see the multiple cases of irq 16 is that's the bridge;
> once you are using MSIX, as vmstat shows, it's using other vectors.
>
> Can you capture the messages file with the actual storm happening?

I'll do that as soon as I witness another storm. Right now the system has been 
up over half a day (with MSI/MSIX enabled) and everything seems to be working 
as it should.

> I noticed some complaints about checksums in the dmesg, have you
> checked on BIOS upgrades or something like that on your motherboard?

Not yet. I'll reboot the machine later today when I have physical access to it 
to check the BIOS version. I'll keep you informed as soon as I get another 
storm going.


> On Wed, May 4, 2011 at 4:27 PM, Daan Vreeken <Daan@vehosting.nl> wrote:
> > On Thursday 05 May 2011 00:15:43 you wrote:
> > > This all looks completely kosher,  what IRQ is the storm on??
> >
> > IRQ 16. Further down this email there is a list of devices that share the
> > IRQ according to 'dmesg'.
> >
> > > On Wed, May 4, 2011 at 3:04 PM, Daan Vreeken <Daan@vehosting.nl> wrote:
> > > > Hi,
> > > >
> > > > On Wednesday 04 May 2011 20:47:32 Jack Vogel wrote:
> > > > > Will you please set it back to a default and then boot and capture the
> > > > > message for me?
> > > >
> > > > No problem. Here's the output with MSI/MSIX enabled :
> > > >
> > > > http://vehosting.nl/pub_diffs/dmesg_plantje2_with_msix_2011_05_04.txt
> > > >
> > > > I've also added the output of "vmstat -i" a couple of minutes after a
> > > > reboot with MSI enabled :
> > > >        http://vehosting.nl/pub_diffs/vmstat_i_2011_05_04.txt
> > > >
> > > > Note that in the above "vmstat -i" dump the interrupt storm hasn't
> > > > started yet. For some reason the storm doesn't always start directly
> > > > at boot. I haven't been able (yet) to pinpoint what's triggering it
> > > > to start.
> > > >
> > > > > On Wed, May 4, 2011 at 11:19 AM, Daan Vreeken <Daan@vehosting.nl> wrote:
> > > > > > Hi Jack,
> > > > > >
> > > > > > Wednesday 04 May 2011 19:46:05 Jack Vogel wrote:
> > > > > > > Who makes your motherboard? The problem you are having is that MSIX
> > > > > > > AND MSI are both failing as em0 comes up, so it falls back to Legacy
> > > > > > > interrupt mode, and must be having some issue with sharing the line,
> > > > > > > causing the storm.
> > > > > >
> > > > > > The motherboard is an Asus "P7H55-M".
> > > > > >
> > > > > > Sorry, I should have mentioned that the dmesg output is from booting
> > > > > > with :
> > > > > > > >        hw.pci.enable_msix="0"
> > > > > > > >        hw.pci.enable_msi="0"
> > > > > >
> > > > > > .. in "loader.conf".
> > > > > >
> > > > > > With those lines in "loader.conf", MSI and MSIX are disabled, both
> > > > > > cards work like they should and there is no interrupt storm.
> > > > > >
> > > > > > With MSI/MSIX enabled, both cards work like they should and I see the
> > > > > > counters of the MSI interrupts increase (in small amounts, like they
> > > > > > should), but at boot-time an interrupt storm starts on 'legacy' IRQ 16.
> > > > > >
> > > > > > Because the only difference between disabling/enabling MSI/MSIX seems
> > > > > > to be in the way em0/em1 are used, and because 'em1' shares IRQ 16
> > > > > > according to the dmesg, I'm suspecting 'em1' is causing the storm.
> > > > > > (But please correct me if I'm wrong :)
> > > > > >
> > > > > > What can I do to help track this problem down?
> > > > > >
> > > > > > > > According to "dmesg" the following devices share IRQ 16 :
> > > > > > > >        pcib1: <ACPI PCI-PCI bridge> irq 16 at device 1.0 on pci0
> > > > > > > >        em0: <Intel(R) PRO/1000 Network Connection 7.2.3> port 0xcc00-0xcc1f
> > > > > > > >           mem 0xf7de0000-0xf7dfffff,0xf7d00000-0xf7d7ffff,0xf7ddc000-0xf7ddffff
> > > > > > > >           irq 16 at device 0.0 on pci1
> > > > > > > >        vgapci0: <VGA-compatible display> port 0xbc00-0xbc07
> > > > > > > >           mem 0xf7800000-0xf7bfffff,0xe0000000-0xefffffff
> > > > > > > >           irq 16 at device 2.0 on pci0
> > > > > > > >        ehci0: <Intel PCH USB 2.0 controller USB-B> mem 0xf7cfa000-0xf7cfa3ff
> > > > > > > >           irq 16 at device 26.0 on pci0
> > > > > > > >        em1: <Intel(R) PRO/1000 Network Connection 7.2.3> port 0xec00-0xec1f
> > > > > > > >           mem 0xf7fe0000-0xf7ffffff,0xf7f00000-0xf7f7ffff,0xf7fdc000-0xf7fdffff
> > > > > > > >           irq 16 at device 0.0 on pci4
> > > > > > > >        pcib4: <ACPI PCI-PCI bridge> irq 16 at device 28.5 on pci0
> > > > > > > >
> > > > > > > > During a storm "vmstat -i" shows a rate of about 220,000
> > > > > > > > interrupts/sec. MSI interrupt delivery to both 'em0' and 'em1' seems
> > > > > > > > to work correctly during a storm, as I see their counters increase
> > > > > > > > normally in the "vmstat -i" output.
> > > > > > > > As only 'em0' and 'em1' seem to be using MSI interrupts, my guess is
> > > > > > > > that the e1000 driver is causing this problem. Could it be that the
> > > > > > > > driver forgets to clear/mask legacy interrupts when attaching the MSI
> > > > > > > > interrupts perhaps?
> > > > > > > >
> > > > > > > > Any tips on how to debug and/or fix this?
> > > > > > > >
> > > > > > > >
> > > > > > > > The full output of "dmesg" can be found here :
> > > > > > > >
> > > > > > > > http://vehosting.nl/pub_diffs/dmesg_plantje2_2011_05_04.txt
> > > > > > > >
> > > > > > > > And the full output of "pciconf -lv" is here :
> > > > > > > >
> > > > > > > > http://vehosting.nl/pub_diffs/pciconf_plantje2_2011_05_04.txt
> >
> > Thanks,
> > --
> > Daan Vreeken
> > VEHosting
> > http://VEHosting.nl
> > tel: +31-(0)40-7113050 / +31-(0)6-46210825
> > KvK nr: 17174380


Regards,
-- 
Daan Vreeken
VEHosting
http://VEHosting.nl
tel: +31-(0)40-7113050 / +31-(0)6-46210825
KvK nr: 17174380