From: John Baldwin
To: Maxim Sobolev
Cc: FreeBSD Net
Subject: Re: Some MSI are not routed correctly
Date: Mon, 19 Oct 2015 14:03:35 -0700
Message-ID: <1608354.LQmTMSsd5C@ralph.baldwin.cx>

On Thursday, October 08, 2015 07:33:27 AM Maxim Sobolev wrote:
> Hi John & others,
>
> We've come across a weird MSI routing issue on one of our newest dual
> E5-2690v3 (Haswell) Supermicro X10DRL-i boxes running the latest 10.2-p4.
> It is fitted with a dual-port Intel I350 card, in addition to the built-in
> I210 chip that is not used.
> The hw.igb.num_queues is set to 4, and the driver
> reports binding to CPUs 0-3 for the first port and CPUs 4-7 for the
> second. However, when verified with top -P under load, interrupts are
> only delivered to CPUs 0-3; no interrupt time is recorded on CPUs
> 4-7. systat -vm shows that all 8 queues are firing interrupts, so my guess
> is that for whatever reason bus_bind_intr() is not doing what it's expected
> to do for half of those interrupts.
>
> What's interesting is that on a similar box (same chassis/mobo/cpu) but
> equipped with the quad-port X540-AT2 10Gig card, interrupts are routed
> properly. The latter is running with hw.ix.num_queues="3".
>
> pcib2: port 0xcf8-0xcff on acpi0
> pci0: on pcib2
> pcib3: irq 26 at device 1.0 on pci0
> pci1: on pcib3
> igb0: mem 0xc7200000-0xc72fffff,0xc7304000-0xc7307fff irq 26 at device 0.0 on pci1
> igb0: Using MSIX interrupts with 5 vectors
> igb0: Ethernet address: a0:36:9f:76:af:20
> igb0: Bound queue 0 to cpu0
> igb0: Bound queue 1 to cpu1
> igb0: Bound queue 2 to cpu2
> igb0: Bound queue 3 to cpu3
> igb0: netmap queues/slots: TX 4/4096, RX 4/4096
> igb1: mem 0xc7100000-0xc71fffff,0xc7300000-0xc7303fff irq 28 at device 0.1 on pci1
> igb1: Using MSIX interrupts with 5 vectors
> igb1: Ethernet address: a0:36:9f:76:af:21
> igb1: Bound queue 0 to cpu4
> igb1: Bound queue 1 to cpu5
> igb1: Bound queue 2 to cpu6
> igb1: Bound queue 3 to cpu7
> igb1: netmap queues/slots: TX 4/4096, RX 4/4096
>
> pcib2: port 0xcf8-0xcff on acpi0
> pci0: on pcib2
> pcib3: irq 26 at device 1.0 on pci0
> pci1: on pcib3
> pcib4: irq 32 at device 2.0 on pci0
> pci2: on pcib4
> pcib5: irq 40 at device 3.0 on pci0
> pci3: on pcib5
> ix0: port 0x6020-0x603f mem 0xc7c00000-0xc7dfffff,0xc7e04000-0xc7e07fff irq 40 at device 0.0 on pci3
> ix0: Using MSIX interrupts with 4 vectors
> ix0: Bound queue 0 to cpu 0
> ix0: Bound queue 1 to cpu 1
> ix0: Bound queue 2 to cpu 2
> ix0: Ethernet address: 0c:c4:7a:5e:be:64
> ix0: PCI Express Bus: Speed 5.0GT/s Width x8
> ix0: netmap queues/slots: TX 3/4096, RX 3/4096
> ix1: port 0x6000-0x601f mem 0xc7a00000-0xc7bfffff,0xc7e00000-0xc7e03fff irq 44 at device 0.1 on pci3
> ix1: Using MSIX interrupts with 4 vectors
> ix1: Bound queue 0 to cpu 3
> ix1: Bound queue 1 to cpu 4
> ix1: Bound queue 2 to cpu 5
> ix1: Ethernet address: 0c:c4:7a:5e:be:65
> ix1: PCI Express Bus: Speed 5.0GT/s Width x8
> ix1: netmap queues/slots: TX 3/4096, RX 3/4096
>
> Some extra debug is here:
>
> http://sobomax.sippysoft.com/haswell_bug/bad.dmesg
> http://sobomax.sippysoft.com/haswell_bug/lstopo_bad.png
> http://sobomax.sippysoft.com/haswell_bug/systat_vm_bad.png
> http://sobomax.sippysoft.com/haswell_bug/top_P_bad.png
>
> http://sobomax.sippysoft.com/haswell_bug/good.dmesg
> http://sobomax.sippysoft.com/haswell_bug/lstopo_good.png
> http://sobomax.sippysoft.com/haswell_bug/systat_vm_good.png
> http://sobomax.sippysoft.com/haswell_bug/top_P_good.png
>
> Any ideas on how to debug that further are welcome. The box is in
> production, but we can remove traffic during off-peak to run some
> test/debug code on it.

Can you get procstat -S output for the interrupt threads? (Usually
interrupt threads are in pid 12, so 'procstat -S 12' would suffice.)

-- 
John Baldwin