Date: Thu, 17 Apr 2014 15:45:23 -0600
From: John Nielsen <lists@jnielsen.net>
To: Andrea Venturoli <ml@netfence.it>
Cc: freebsd-net@freebsd.org
Subject: Re: Network troubles after 8.3 -> 8.4 upgrade
Message-ID: <CDFEE4B3-CFAD-4E99-B5FE-731FB3C3C6FC@jnielsen.net>
In-Reply-To: <53503BC3.6040806@netfence.it>
References: <53503BC3.6040806@netfence.it>
On Apr 17, 2014, at 2:38 PM, Andrea Venturoli <ml@netfence.it> wrote:

> Three days ago I upgraded an amd64 8.3 box to the latest 8.4.
> Since then the outside network is misbehaving: large mails are not sent (although small ones are), svn operations will work for a while, then come to a sudden stop, etc...
> Perhaps the most evident test is "wget"ting a big file: it will download some chunk, halt; restart after a while and download another chunk; lose the connection once again, then restart and so on.
>
> I remember a couple of similar experiences in the past, from which I got out by disabling TSO; however those boxes had fxp cards, while this one has an em.
> In any case disabling TSO did not help.

My first thought was TSO as well, since I've seen the symptoms you describe a few times on systems running 10.0. Do you use IPFW or any kind of NAT on this system? When an application encounters a network problem, does it report or log anything at all? Anything in the kernel log/dmesg?

A bit of a shot in the dark, but could you try applying r264517 (fixes a problem with VLAN and TSO interaction)?
http://svnweb.freebsd.org/base/head/sys/net/if_vlan.c?r1=257241&r2=264517

Otherwise my only other thought would be the driver. Can you try reverting only the em(4) driver back to 8.3? If that helps it would give you both a workaround and a clue for where to look for a solution. Build modules and a kernel without em(4) from unmodified 8.4 src, load em(4) as a module, confirm that the problem persists. Replace the contents of src/sys/dev/e1000, src/sys/modules/em and src/sys/conf/files with those from an 8.3 src tree (or otherwise revert revision 247430), rebuild the em module, unload/reload or reboot, see if the problem goes away. (Could be somewhat complicated by the fact that you also have igb interfaces which also use code from the e1000 directory, but rather than speculate I'll leave solving that as an exercise for someone else.)
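For what it's worth, the em(4) test could be scripted roughly as below. Treat it as a sketch under assumptions, not a tested recipe: the kernel config name NOEM, the 8.3 tree under /usr/src-8.3 and the helper function names are all made up for illustration, and it assumes the NOEM config and the loader.conf entry mentioned in the comments already exist. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only. SRC, OLDSRC and KERNCONF defaults are assumptions.
SRC=${SRC:-/usr/src}            # unmodified 8.4 source tree
OLDSRC=${OLDSRC:-/usr/src-8.3}  # hypothetical location of an 8.3 source tree
KERNCONF=${KERNCONF:-NOEM}      # hypothetical kernel config name
DRYRUN=${DRYRUN:-1}             # 1 = print commands instead of running them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRYRUN" = 1 ]; then echo "$*"; else "$@"; fi
}

# Same as run, but inside the given directory.
run_in() {
    dir=$1; shift
    if [ "$DRYRUN" = 1 ]; then echo "(cd $dir && $*)"; else (cd "$dir" && "$@"); fi
}

# Step 1: stock 8.4 kernel without em(4); the driver is loaded as a module.
# Assumes sys/amd64/conf/$KERNCONF contains:
#     include GENERIC
#     ident   NOEM
#     nodevice em
# and /boot/loader.conf contains:  if_em_load="YES"
baseline() {
    run_in "$SRC" make buildkernel KERNCONF="$KERNCONF"
    run_in "$SRC" make installkernel KERNCONF="$KERNCONF"
}

# Step 2: swap in the 8.3 driver sources and rebuild only the em module.
revert_em() {
    run rm -rf "$SRC/sys/dev/e1000" "$SRC/sys/modules/em"
    run cp -R "$OLDSRC/sys/dev/e1000" "$SRC/sys/dev/e1000"
    run cp -R "$OLDSRC/sys/modules/em" "$SRC/sys/modules/em"
    run cp "$OLDSRC/sys/conf/files" "$SRC/sys/conf/files"
    run_in "$SRC/sys/modules/em" make clean
    run_in "$SRC/sys/modules/em" make
    run_in "$SRC/sys/modules/em" make install
    # Unloading drops em0 and everything configured on top of it;
    # rebooting instead is the safer choice on a remote box.
    run kldunload if_em
    run kldload if_em
}

# Usage: script baseline | script revert  (no argument: do nothing)
case "${1:-}" in
    baseline) baseline ;;
    revert)   revert_em ;;
esac
```

Run it with "baseline" first, reboot and confirm the problem is still there, then run it with "revert" and test again. Set DRYRUN=0 only after checking the printed commands against your own paths.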
JN

> This is the relevant part of rc.conf:
>> cloned_interfaces="lagg0 vlan1 vlan2 vlan3 carp0 carp1 carp3 carp4 carp6 carp7 carp9 carp10"
>> ifconfig_igb0="up"
>> ifconfig_igb1="up"
>> ifconfig_lagg0="laggproto lacp laggport igb0 laggport igb1 192.168.101.4 netmask 255.255.255.0"
>> ifconfig_lagg0_alias0="inet 192.168.101.101 netmask 0xffffffff"
>> ifconfig_carp0="vhid 1 advskew 100 pass xxxxxxx 192.168.101.10"
>> ifconfig_carp1="vhid 2 pass xxxxxxxx 192.168.101.10"
>> ifconfig_em0="up"
>> ifconfig_vlan1="inet 81.174.30.11 netmask 255.255.255.248 vlan 4 vlandev em0"
>> ifconfig_vlan2="inet 83.211.188.186 netmask 255.255.255.248 vlan 2 vlandev em0"
>> ifconfig_vlan3="inet 192.168.2.202 netmask 255.255.255.0 vlan 3 vlandev em0"
>> ifconfig_carp3="vhid 4 advskew 100 pass xxxx 81.174.30.12"
>> ifconfig_carp4="vhid 5 pass xxxxxxx 81.174.30.12"
>> ifconfig_carp6="vhid 7 advskew 100 pass xxxxxx 83.211.188.187"
>> ifconfig_carp7="vhid 8 pass xxxxxxxxxxx 83.211.188.187"
>> ifconfig_carp9="vhid 10 advskew 100 pass xxxxxxxx 192.168.2.203"
>> ifconfig_carp10="vhid 11 pass xxxxxxxx 192.168.2.203"
>> ifconfig_lo0_alias0="inet 127.0.0.2 netmask 0xffffffff"
>> ifconfig_lo0_alias1="inet 127.0.0.3 netmask 0xffffffff"
>> ifconfig_lo0_alias2="inet 127.0.0.4 netmask 0xffffffff"
>
> As you can see the setup is quite complicated, but worked like a charm until the upgrade; actually the internal net (igb+lagg+carp) still does, so this is what points me toward em0, where I cannot seem to get any kind of stability.
>
> The card is
>> em0@pci0:6:0:0: class=0x020000 card=0x10828086 chip=0x107d8086 rev=0x06 hdr=0x00
>>     vendor     = 'Intel Corporation'
>>     device     = 'PRO/1000 PT'
>>     class      = network
>>     subclass   = ethernet
>
> I tried disabling TSO, RXCSUM, TXCSUM, VLANHWTAG, VLANHWCSUM, VLANHWTSO...
> I tried putting the card into 10baseT/UTP <half-duplex> mode...
> I tried sysctl net.inet.tcp.tso=0...
>
> None helped.
>
> Maybe I'm barking up the wrong tree, but nothing is in the logs to help...
>
> Nor did Google or wading through bug reports.
>
> Now I could restore the dumps I made before upgrading to 8.4 (but I'd really like to avoid this), try to upgrade even further to 9.2 (although this will be a lot of work and I'm not looking forward to it as a shot in the dark), drop in another NIC...
> What I'd really like, however, is some insight.
>
> Is this a known problem of some sort? Is this card or this driver known to be broken?
> Is there any way I could get some debugging info?
>
> Any hint is appreciated (and I need it badly :( !!!).
>
> bye & Thanks
> av.
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"