From owner-freebsd-net@FreeBSD.ORG Mon Mar 4 16:41:57 2013
Date: Mon, 4 Mar 2013 08:41:51 -0800
Subject: Re: igb network lockups
From: Nick Rogers
To: Sepherosa Ziehau
Cc: "freebsd-net@freebsd.org", Jack Vogel, "Christopher D. Harrison"
References: <512BAA60.3060703@biostat.wisc.edu> <512BAF8D.7080308@biostat.wisc.edu>
List-Id: Networking and TCP/IP with FreeBSD

On Sun, Mar 3, 2013 at 2:14 AM, Sepherosa Ziehau wrote:
> On Sat, Mar 2, 2013 at 12:18 AM, Nick Rogers wrote:
>> On Fri, Mar 1, 2013 at 8:04 AM, Nick Rogers wrote:
>>> FWIW I have been experiencing a similar issue on a number of systems
>>> using the em(4) driver under 9.1-RELEASE. This is after upgrading from
>>> a snapshot of 8.3-STABLE. My systems use PF+ALTQ as well. The symptoms
>>> are: interface stops passing traffic until the system is rebooted. I
>>> have not yet been able to gain access to the systems to dig around
>>> (after they have crashed), however my kernel/network settings are
>>> properly tuned (high mbuf limit, hw.em.rxd/txd=4096, etc). It seems to
>>> happen about once a day on systems with around a sustained 50Mb/s of
>>> traffic.
>>>
>>> I realize this is not much to go on but perhaps it helps. I am
>>> debating trying the e1000 driver in the latest CURRENT on top of
>>> 9.1-RELEASE. I noticed the Intel shared code was updated about a week
>>> ago. Would this change or perhaps another change to e1000 since
>>> 9.1-RELEASE possibly affect stability in a positive way?
>>>
>>> Thanks.
>>
>> Heres relevant pciconf output:
>>
>> em0@pci0:1:0:0: class=0x020000 card=0x10d315d9 chip=0x10d38086 rev=0x00 hdr=0x00
>>     vendor     = 'Intel Corporation'
>>     device     = '82574L Gigabit Network Connection'
>>     class      = network
>>     subclass   = ethernet
>>     cap 01[c8] = powerspec 2 supports D0 D3 current D0
>>     cap 05[d0] = MSI supports 1 message, 64 bit
>>     cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
>>     cap 11[a0] = MSI-X supports 5 messages in map 0x1c enabled
>>     ecap 0001[100] = AER 1 0 fatal 0 non-fatal 1 corrected
>
> For 82574L, i.e. supported by em(4), MSI-X must _not_ be enabled; it
> is simply broken (you could check 82574 errata on Intel's website to
> confirm what I have said here).

Thanks. So on FreeBSD 9.1-RELEASE it is advisable to set
hw.em.enable_msix=0 for 82574L? Are there other em(x) NICs where this
is advisable?

>
> For 82575, i.e. supported by igb(4), MSI-X must _not_ be enabled; it
> is simply broken (you could check 82575 errata on Intel's website to
> confirm what I have said here).
>
> Best Regards,
> sephe
>
> --
> Tomorrow Will Never Die
>
> On Sat, Mar 2, 2013 at 12:18 AM, Nick Rogers wrote:
>> On Fri, Mar 1, 2013 at 8:04 AM, Nick Rogers wrote:
>>> FWIW I have been experiencing a similar issue on a number of systems
>>> using the em(4) driver under 9.1-RELEASE. This is after upgrading from
>>> a snapshot of 8.3-STABLE. My systems use PF+ALTQ as well. The symptoms
>>> are: interface stops passing traffic until the system is rebooted. I
>>> have not yet been able to gain access to the systems to dig around
>>> (after they have crashed), however my kernel/network settings are
>>> properly tuned (high mbuf limit, hw.em.rxd/txd=4096, etc). It seems to
>>> happen about once a day on systems with around a sustained 50Mb/s of
>>> traffic.
>>>
>>> I realize this is not much to go on but perhaps it helps. I am
>>> debating trying the e1000 driver in the latest CURRENT on top of
>>> 9.1-RELEASE. I noticed the Intel shared code was updated about a week
>>> ago. Would this change or perhaps another change to e1000 since
>>> 9.1-RELEASE possibly affect stability in a positive way?
>>>
>>> Thanks.
>>
>> Heres relevant pciconf output:
>>
>> em0@pci0:1:0:0: class=0x020000 card=0x10d315d9 chip=0x10d38086 rev=0x00 hdr=0x00
>>     vendor     = 'Intel Corporation'
>>     device     = '82574L Gigabit Network Connection'
>>     class      = network
>>     subclass   = ethernet
>>     cap 01[c8] = powerspec 2 supports D0 D3 current D0
>>     cap 05[d0] = MSI supports 1 message, 64 bit
>>     cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
>>     cap 11[a0] = MSI-X supports 5 messages in map 0x1c enabled
>>     ecap 0001[100] = AER 1 0 fatal 0 non-fatal 1 corrected
>> em1@pci0:2:0:0: class=0x020000 card=0x10d315d9 chip=0x10d38086 rev=0x00 hdr=0x00
>>     vendor     = 'Intel Corporation'
>>     device     = '82574L Gigabit Network Connection'
>>     class      = network
>>     subclass   = ethernet
>>     cap 01[c8] = powerspec 2 supports D0 D3 current D0
>>     cap 05[d0] = MSI supports 1 message, 64 bit
>>     cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
>>     cap 11[a0] = MSI-X supports 5 messages in map 0x1c enabled
>>     ecap 0001[100] = AER 1 0 fatal 0 non-fatal 1 corrected
>> em2@pci0:7:0:0: class=0x020000 card=0x10d315d9 chip=0x10d38086 rev=0x00 hdr=0x00
>>     vendor     = 'Intel Corporation'
>>     device     = '82574L Gigabit Network Connection'
>>     class      = network
>>     subclass   = ethernet
>>     cap 01[c8] = powerspec 2 supports D0 D3 current D0
>>     cap 05[d0] = MSI supports 1 message, 64 bit
>>     cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
>>     cap 11[a0] = MSI-X supports 5 messages in map 0x1c enabled
>>     ecap 0001[100] = AER 1 0 fatal 0 non-fatal 1 corrected
>> em3@pci0:8:0:0: class=0x020000 card=0x10d315d9 chip=0x10d38086 rev=0x00 hdr=0x00
>>     vendor     = 'Intel Corporation'
>>     device     = '82574L Gigabit Network Connection'
>>     class      = network
>>     subclass   = ethernet
>>     cap 01[c8] = powerspec 2 supports D0 D3 current D0
>>     cap 05[d0] = MSI supports 1 message, 64 bit
>>     cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
>>     cap 11[a0] = MSI-X supports 5 messages in map 0x1c enabled
>>     ecap 0001[100] = AER 1 0 fatal 0 non-fatal 1 corrected
>>
>>
>>>
>>> On Mon, Feb 25, 2013 at 10:45 AM, Jack Vogel wrote:
>>>> Have you done any poking around, looking at stats to determine why the
>>>> hangs? For instance,
>>>> might your mbuf pool be depleted? Some other network resource perhaps?
>>>>
>>>> Jack
>>>>
>>>>
>>>> On Mon, Feb 25, 2013 at 10:38 AM, Christopher D. Harrison <
>>>> harrison@biostat.wisc.edu> wrote:
>>>>
>>>>> Sure,
>>>>> The problem appears on both systems running with ALTQ and vanilla.
>>>>> -C
>>>>>
>>>>> On 02/25/13 12:29, Jack Vogel wrote:
>>>>>
>>>>> I've not heard of this problem, but I think most users do not use ALTQ,
>>>>> and we (Intel) do not
>>>>> test using it. Can it be eliminated from the equation?
>>>>>
>>>>> Jack
>>>>>
>>>>>
>>>>> On Mon, Feb 25, 2013 at 10:16 AM, Christopher D. Harrison <
>>>>> harrison@biostat.wisc.edu> wrote:
>>>>>
>>>>>> I recently have been experiencing network "freezes" and network "lockups"
>>>>>> on our FreeBSD 9.1 systems which are running zfs and nfs file servers.
>>>>>> I upgraded from 9.0 to 9.1 about 2 months ago and we have been having
>>>>>> issues almost bi-monthly. The issue manifests as the system becoming
>>>>>> unresponsive to any/all nfs clients. The system is not resource bound as
>>>>>> our I/O is low to disk and our network is usually in the 20mbit/40mbit
>>>>>> range. We do notice a correlation between temporary i/o spikes and
>>>>>> network freezes but not enough to send our system into "lockup" mode for
>>>>>> the next 5min. Currently we have 4 igb nics in 2 aggr's with 8 queue's
>>>>>> per nic and our dev.igb reports:
>>>>>>
>>>>>> dev.igb.3.%desc: Intel(R) PRO/1000 Network Connection version - 2.3.4
>>>>>>
>>>>>> I am almost certain the problem is with the igb driver as a friend is
>>>>>> also experiencing the same problem with the same intel igb nic. He has
>>>>>> addressed the issue by restarting the network using netif on his systems.
>>>>>> According to my friend, once the network interfaces get cleared, everything
>>>>>> comes back and starts working as expected.
>>>>>>
>>>>>> I have noticed an issue with the igb driver and I was looking for
>>>>>> thoughts on how to help address this problem.
>>>>>>
>>>>>> http://freebsd.1045724.n5.nabble.com/em-igb-if-transmit-drbr-and-ALTQ-td5760338.html
>>>>>>
>>>>>> Thoughts/Ideas are greatly appreciated!!!
>>>>>>
>>>>>> -C
>>>>>>
>>>>>> _______________________________________________
>>>>>> freebsd-net@freebsd.org mailing list
>>>>>> http://lists.freebsd.org/mailman/listinfo/freebsd-net
>>>>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
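
For anyone wanting to check their own machines against the pciconf
output quoted in this thread, here is a minimal sketch (not a tested
tool) that scans pciconf-style text and lists 82574L devices whose
MSI-X capability line ends in "enabled". The function names
parse_devices and flag_82574l_with_msix are hypothetical, and the
SAMPLE text is only the em0 entry copied from the message above; the
parsing assumes the same line layout as that output.

```python
import re

# em0 entry as quoted in the thread; used here only as sample input.
SAMPLE = """\
em0@pci0:1:0:0: class=0x020000 card=0x10d315d9 chip=0x10d38086 rev=0x00 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '82574L Gigabit Network Connection'
    class      = network
    subclass   = ethernet
    cap 01[c8] = powerspec 2 supports D0 D3 current D0
    cap 05[d0] = MSI supports 1 message, 64 bit
    cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
    cap 11[a0] = MSI-X supports 5 messages in map 0x1c enabled
"""

# A device header line looks like "em0@pci0:1:0:0: class=...".
HEADER = re.compile(r"^(\w+)@pci\S*:\s")


def parse_devices(text):
    """Map each device name to its description and MSI-X state."""
    devices = {}
    current = None
    for line in text.splitlines():
        match = HEADER.match(line)
        if match:
            current = match.group(1)
            devices[current] = {"device": "", "msix_enabled": False}
            continue
        if current is None:
            continue  # ignore anything before the first device header
        stripped = line.strip()
        if stripped.startswith("device"):
            # Take the text after "=", dropping spaces and quotes.
            _, _, value = stripped.partition("=")
            devices[current]["device"] = value.strip().strip("'")
        elif "MSI-X" in stripped and stripped.endswith("enabled"):
            devices[current]["msix_enabled"] = True
    return devices


def flag_82574l_with_msix(text):
    """Return sorted names of 82574L devices reporting MSI-X enabled."""
    return sorted(
        name
        for name, info in parse_devices(text).items()
        if "82574L" in info["device"] and info["msix_enabled"]
    )


print(flag_82574l_with_msix(SAMPLE))  # prints ['em0']
```

On a live system the input would presumably come from saving the
output of "pciconf -lvc" to a file and reading it in. The script only
reports what pciconf shows; whether hw.em.enable_msix=0 is the right
setting for a flagged device is the question asked in the message
above.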