Date: Tue, 18 Feb 2014 09:16:32 -0500
From: George Neville-Neil <gnn@neville-neil.com>
To: Kevin Bowling <kevin.bowling@kev009.com>
Cc: FreeBSD Net <freebsd-net@freebsd.org>
Subject: Re: FreeBSD 10 network flapping, ix driver unreliable?
Message-ID: <11F52C6F-1A9C-4D5B-8364-AFB62322CB91@neville-neil.com>
In-Reply-To: <ldtvlk$kuc$1@ger.gmane.org>
References: <ldohqb$s2c$1@ger.gmane.org> <61748F81-A763-4504-BC81-132D394F0170@neville-neil.com> <ldp7vp$hf7$1@ger.gmane.org> <CE04609E-3C64-42A1-A2E7-BE7E0518AD32@neville-neil.com> <ldtvlk$kuc$1@ger.gmane.org>

On Feb 17, 2014, at 16:41 , Kevin Bowling <kevin.bowling@kev009.com> wrote:

> On 2/16/2014 9:04 PM, George Neville-Neil wrote:
>>
>> On Feb 15, 2014, at 21:32 , Kevin Bowling <kevin.bowling@kev009.com> wrote:
>>
>>> On 2/15/2014 4:43 PM, George Neville-Neil wrote:
>>>>
>>>> On Feb 15, 2014, at 15:14 , Kevin Bowling <kevin.bowling@kev009.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I have FreeBSD 10.0-RELEASE installed on two Dell C6100 nodes. Each node has an Intel X520-DA2 dual port 10gig card. One of the ports on each go to a switch using direct attach coaxial cables. The other port is directly connected between the two nodes (think crossover in twisted pair terminology) again using direct attach coaxial cables.
>>>>>
>>>>> On both machines, and on both ports (including the "crossover"), the links flap several times per day.
>>>>>
>>>>> I've pasted the output of lspci -vv and dmesg here:
>>>>> https://gist.github.com/kev009/9024442
>>>>>
>>>>> There's nothing outstanding about the setup otherwise. I suspected some interaction with the switch initially but the "crossover" has eliminated that suspicion.
>>>>>
>>>>> It seems the ix driver is not very reliable under common conditions, i.e. https://forums.freebsd.org/viewtopic.php?f=7&t=44570 and a search of this list. Any recommendations or tests?
>>>>>
>>>>
>>>> Can you post (to your gist link) the output of sysctl dev.ix ?
>>>
>>> Hi George,
>>>
>>> sysctl info added to gist link. ix0 has been up for around 27 days. ix1 for about 24hrs.
>>>
>>
>> I think this has something to do with it.
>>
>> dev.ix.0.mac_stats.local_faults: 314
>> dev.ix.0.mac_stats.remote_faults: 41
>>
>> The device is seeing errors at the MAC layer, which I don’t think a driver bug would
>> cause, though there is always the possibility of a misconfiguration causing flapping.
>> Can you try different cables?
>>
>> When you hook it to the switch does the switch give better diagnostics? Reading
>> over the Intel 82599 chip manual is not, shall we say, illuminating,
>> "Number of faults in the local MAC. This register is valid only when the link speed is 10 Gb/s."
>
> Appreciate your help, this led me to find some new info although it doesn't entirely answer what local_faults are for me: http://grouper.ieee.org/groups/802/3/ae/public/nov00/taborek_2_1100.pdf
>
> I may have spoken too soon, the "crossover" ix1 seems to be holding steady, so the local and remote faults must have been during negotiation and me bringing up the interfaces.
>
> On the other system's ix0, the faults are almost all local and quite a bit more frequent:
> dev.ix.0.mac_stats.local_faults: 10752
> dev.ix.0.mac_stats.remote_faults: 2
>
> I then noticed the switch had mandatory flow control on both send and receive for 10gig, but the FreeBSD box was only negotiating receive flow control. I disabled both on the switch and rebooted but am still seeing some increments of local_faults.
>
> Could it be a switch STP problem? Switch is a Cisco 4948-10ge. Configs look like below, which is working well on some copper gigabit interfaces:
>
> spanning-tree mode pvst
> spanning-tree portfast default
> spanning-tree extend system-id
> !
> interface TenGigabitEthernet1/49
> switchport trunk encapsulation dot1q
> switchport mode trunk
> spanning-tree portfast trunk
> !
> interface TenGigabitEthernet1/50
> switchport trunk encapsulation dot1q
> switchport mode trunk
> flowcontrol receive desired
> flowcontrol send desired
> spanning-tree portfast trunk
> !
>
> It will be hard for me to source SFPs and fiber, but I can try to see if it's a physical layer problem. In the meantime I might try imaging one of the systems with a different OS and seeing if the problem persists.
>

Another possibility is flow control. Can you try this setting?

sysctl dev.ix.0.fc=0

Best,
George
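
To judge whether the flow control setting above makes a difference, the two counters quoted earlier in the thread (dev.ix.0.mac_stats.local_faults and dev.ix.0.mac_stats.remote_faults) can be sampled before and after the change and compared. The following is only a minimal Python sketch, assuming sysctl prints one "name: value" pair per line as in the quoted output. The helper names (parse_fault_counters, fault_deltas, read_counters) are hypothetical and not part of any FreeBSD tool.

```python
import subprocess

# Counter leaf names taken from the sysctl output quoted in the thread.
FAULT_KEYS = ("local_faults", "remote_faults")


def parse_fault_counters(sysctl_text):
    """Pick the MAC fault counters out of `sysctl dev.ix`-style text.

    Assumes one "name: value" pair per line; other lines are ignored.
    """
    counters = {}
    for line in sysctl_text.splitlines():
        name, sep, value = line.partition(":")
        name = name.strip()
        if sep and name.rsplit(".", 1)[-1] in FAULT_KEYS:
            counters[name] = int(value.strip())
    return counters


def fault_deltas(before, after):
    """Return how much each counter grew between two samples."""
    return {name: after[name] - before.get(name, 0) for name in after}


def read_counters(unit=0):
    """Sample the counters for ix<unit> on a live FreeBSD host (hypothetical helper)."""
    out = subprocess.run(
        ["sysctl", f"dev.ix.{unit}.mac_stats"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_fault_counters(out)
```

On the affected host, one would call read_counters(0) once, wait a while (or apply the fc change), call it again, and pass both results to fault_deltas. A local_faults delta that keeps growing with flow control disabled would point away from flow control and back toward the cabling or the switch port.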
