From: Jason Wolfe
To: freebsd-net@freebsd.org
Date: Fri, 7 Oct 2011 11:59:14 -0700
Subject: Re: Intel 82574L interface wedging on em 7.1.9/7.2.3 when MSIX enabled
In-Reply-To: <4E8F157A.40702@sentex.net>
References: <4E8F157A.40702@sentex.net>
List-Id: Networking and TCP/IP with FreeBSD

Mike,

I had a large pool of servers running 7.2.3 with MSI-X enabled during my
testing, but it didn't resolve the issue.
I just pulled back the sys/dev/e1000 directory from 8-STABLE and ran it on
8-RELEASE-p2 though, so if changes made outside the actual driver code helped,
I may not have seen the benefit.

It's possible the lagg is adding some complication, but when one of the
interfaces wedges, the lagg continues to operate over the other link (though
half of the traffic simply fails). It appears the interface just runs out of
one of its buffers and cannot recover without a bounce.

I do recall coming across the ASPM threads, but my Supermicro boards didn't
have the option and many people claimed disabling it didn't resolve the
problem, so I didn't follow through. I'll do a bit more digging there, thanks.

Disabling MSI-X has without a doubt completely resolved my problem though.
I would receive about 30 reports/failures a day from my servers when I was
running with it; since disabling it I haven't received a single one in ~40
days. The servers are currently running with the 7.2.3 driver also, so if
nothing jumps out from my original email I'm happy to re-enable it on a
handful of servers and collect some fresh reports.

Jason

On Fri, Oct 7, 2011 at 8:06 AM, Mike Tancsa wrote:

> On 10/6/2011 7:15 PM, Jason Wolfe wrote:
> > I'm seeing the interface wedge on a good number of systems with Intel 82574L
> > chips under FBSD8.2 _only when MSI-X is enabled_, running either 7.1.9 from
> > 8.2-RELEASE or 7.2.3 from 8.2-STABLE. I have em0 and em1 in a lagg, but
> > only one side would fail, and a few systems that didn't have a lagg also saw
> > the issue. Higher traffic did seem to increase the likelihood of it
> > dev.em.0.%desc: Intel(R) PRO/1000 Network Connection 7.1.9
> >
>
> Hi,
>         This sure sounds like the issue I was seeing with the 7.1.9
> driver...
> However, it has been fixed for me by going to 7.2.3, which is in
> RELENG_8. Is it possible you have a couple of issues going on since you
> are using lagg as well?
> Another problem some folks have reported is that in the BIOS, if you have
> an option for ASPM, make sure it's disabled.
>
> Google around for ASPM and 82574L for a discussion about it.
>
> If I recall correctly, disabling MSI-X just reduces the chance of the
> problem happening, but it's been a while since I ran into this issue.
>
> But for sure you want to be running 7.2.3 from stable
>
> This server used to see this issue
>
> dev.em.1.%desc: Intel(R) PRO/1000 Network Connection 7.2.3
> dev.em.1.%driver: em
> dev.em.1.%location: slot=0 function=0 handle=\_SB_.PCI0.PEX4.HART
> dev.em.1.%pnpinfo: vendor=0x8086 device=0x10d3 subvendor=0x8086 subdevice=0x34ec class=0x020000
> dev.em.1.%parent: pci11
> dev.em.1.nvm: -1
> dev.em.1.debug: -1
> dev.em.1.rx_int_delay: 0
> dev.em.1.tx_int_delay: 66
> dev.em.1.rx_abs_int_delay: 66
> dev.em.1.tx_abs_int_delay: 66
> dev.em.1.rx_processing_limit: 100
> dev.em.1.flow_control: 3
> dev.em.1.eee_control: 0
> dev.em.1.link_irq: 0
> dev.em.1.mbuf_alloc_fail: 0
> dev.em.1.cluster_alloc_fail: 0
> dev.em.1.dropped: 0
> dev.em.1.tx_dma_fail: 0
> dev.em.1.rx_overruns: 0
> dev.em.1.watchdog_timeouts: 0
> dev.em.1.device_control: 1209008712
> dev.em.1.rx_control: 67141634
> dev.em.1.fc_high_water: 18432
> dev.em.1.fc_low_water: 16932
> dev.em.1.queue0.txd_head: 754
> dev.em.1.queue0.txd_tail: 754
> dev.em.1.queue0.tx_irq: 251430977
> dev.em.1.queue0.no_desc_avail: 0
> dev.em.1.queue0.rxd_head: 304
> dev.em.1.queue0.rxd_tail: 303
> dev.em.1.queue0.rx_irq: 295670362
> dev.em.1.mac_stats.excess_coll: 0
> dev.em.1.mac_stats.single_coll: 0
> dev.em.1.mac_stats.multiple_coll: 0
> dev.em.1.mac_stats.late_coll: 0
> dev.em.1.mac_stats.collision_count: 0
> dev.em.1.mac_stats.symbol_errors: 0
> dev.em.1.mac_stats.sequence_errors: 0
> dev.em.1.mac_stats.defer_count: 0
> dev.em.1.mac_stats.missed_packets: 0
> dev.em.1.mac_stats.recv_no_buff: 0
> dev.em.1.mac_stats.recv_undersize: 0
> dev.em.1.mac_stats.recv_fragmented: 0
> dev.em.1.mac_stats.recv_oversize: 0
> dev.em.1.mac_stats.recv_jabber: 0
> dev.em.1.mac_stats.recv_errs: 0
> dev.em.1.mac_stats.crc_errs: 0
> dev.em.1.mac_stats.alignment_errs: 0
> dev.em.1.mac_stats.coll_ext_errs: 0
> dev.em.1.mac_stats.xon_recvd: 0
> dev.em.1.mac_stats.xon_txd: 0
> dev.em.1.mac_stats.xoff_recvd: 0
> dev.em.1.mac_stats.xoff_txd: 0
> dev.em.1.mac_stats.total_pkts_recvd: 712410384
> dev.em.1.mac_stats.good_pkts_recvd: 712410384
> dev.em.1.mac_stats.bcast_pkts_recvd: 52263
> dev.em.1.mac_stats.mcast_pkts_recvd: 24921
> dev.em.1.mac_stats.rx_frames_64: 170050
> dev.em.1.mac_stats.rx_frames_65_127: 32571360
> dev.em.1.mac_stats.rx_frames_128_255: 19796510
> dev.em.1.mac_stats.rx_frames_256_511: 6283830
> dev.em.1.mac_stats.rx_frames_512_1023: 7922330
> dev.em.1.mac_stats.rx_frames_1024_1522: 645666304
> dev.em.1.mac_stats.good_octets_recvd: 988128549661
> dev.em.1.mac_stats.good_octets_txd: 48849605092
> dev.em.1.mac_stats.total_pkts_txd: 501680484
> dev.em.1.mac_stats.good_pkts_txd: 501680484
> dev.em.1.mac_stats.bcast_pkts_txd: 4266
> dev.em.1.mac_stats.mcast_pkts_txd: 8
> dev.em.1.mac_stats.tx_frames_64: 134256137
> dev.em.1.mac_stats.tx_frames_65_127: 291152180
> dev.em.1.mac_stats.tx_frames_128_255: 67219002
> dev.em.1.mac_stats.tx_frames_256_511: 5935140
> dev.em.1.mac_stats.tx_frames_512_1023: 812920
> dev.em.1.mac_stats.tx_frames_1024_1522: 2305105
> dev.em.1.mac_stats.tso_txd: 366978
> dev.em.1.mac_stats.tso_ctx_fail: 0
> dev.em.1.interrupts.asserts: 2
> dev.em.1.interrupts.rx_pkt_timer: 0
> dev.em.1.interrupts.rx_abs_timer: 0
> dev.em.1.interrupts.tx_pkt_timer: 0
> dev.em.1.interrupts.tx_abs_timer: 0
> dev.em.1.interrupts.tx_queue_empty: 0
> dev.em.1.interrupts.tx_queue_min_thresh: 0
> dev.em.1.interrupts.rx_desc_min_thresh: 0
> dev.em.1.interrupts.rx_overrun: 0
>
> interrupt                          total       rate
> irq4: uart0                        44896          0
> irq16: bge0                     19753077         32
> irq18: arcmsr0                  37518694         62
> irq19: twa0                       556664          0
> irq21: ehci0                     2149928          3
> irq23: ehci1                     1209435          2
> cpu0: timer                   1209274084       2000
> irq256: siis0                   65793731        108
> irq257: em0                    504313285        834
> irq258: em1:rx 0               295681170        489
> irq259: em1:tx 0               251430780        415
> irq261: ahci0                   71285304        117
> cpu1: timer                   1209264969       2000
> cpu3: timer                   1209266038       2000
> cpu2: timer                   1209265460       2000
> Total                         6086807515      10067
>
>     vendor     = 'Intel Corporation'
>     device     = 'Intel 82574L Gigabit Ethernet Controller (82574L)'
>     class      = network
>     subclass   = ethernet
>     bar   [10] = type Memory, range 32, base 0xb4100000, size 131072, enabled
>     bar   [18] = type I/O Port, range 32, base 0x2000, size 32, enabled
>     bar   [1c] = type Memory, range 32, base 0xb4120000, size 16384, enabled
>     cap 01[c8] = powerspec 2  supports D0 D3  current D0
>     cap 05[d0] = MSI supports 1 message, 64 bit
>     cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
>     cap 11[a0] = MSI-X supports 5 messages in map 0x1c enabled
>     ecap 0001[100] = AER 1 0 fatal 0 non-fatal 0 corrected
>     ecap 0003[140] = Serial 1 001517ffffed68a4
>
>
>        ---Mike
>
>
> --
> -------------------
> Mike Tancsa, tel +1 519 651 3400
> Sentex Communications, mike@sentex.net
> Providing Internet services since 1994 www.sentex.net
> Cambridge, Ontario Canada   http://www.tancsa.com/
>
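
The counters quoted above lend themselves to a quick automated check. Below is
a minimal Python sketch, not part of either poster's tooling, that parses
`sysctl dev.em.N` output (plain or "> "-quoted) and lists any error counters
that are nonzero. The function names `parse_sysctl` and `nonzero_errors` and
the selection of counters in `ERROR_COUNTERS` are illustrative assumptions;
the thread does not establish which counters, if any, move when the interface
wedges.

```python
import re

# Counters from the quoted em(4) sysctl tree that are zero in the healthy dump.
# ASSUMPTION: treating these as the ones worth watching is a guess for
# illustration; the thread does not say which counters change during a wedge.
ERROR_COUNTERS = (
    "mbuf_alloc_fail",
    "cluster_alloc_fail",
    "dropped",
    "tx_dma_fail",
    "rx_overruns",
    "watchdog_timeouts",
    "queue0.no_desc_avail",
    "mac_stats.missed_packets",
    "mac_stats.recv_no_buff",
)


def parse_sysctl(text):
    """Parse 'name: value' lines into a dict, keeping only integer values.

    Leading '>' quote markers and spaces are stripped, so output pasted
    from a quoted email parses the same way as raw sysctl output.
    """
    values = {}
    for line in text.splitlines():
        line = line.lstrip("> ").strip()
        name, sep, value = line.partition(": ")
        if sep and re.fullmatch(r"-?\d+", value.strip()):
            values[name] = int(value)
    return values


def nonzero_errors(values, unit=1):
    """Return the watched counters for dev.em.<unit> that are not zero."""
    prefix = f"dev.em.{unit}."
    return {
        counter: values[prefix + counter]
        for counter in ERROR_COUNTERS
        if values.get(prefix + counter, 0) != 0
    }
```

On a live host the text would come from running `sysctl dev.em.1`. Run against
the dev.em.1 dump above, `nonzero_errors` returns an empty dict, since every
counter in the list is zero there.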