Date: Sat, 20 Nov 2010 21:18:00 +0200
From: Naujikas Rolandas <Rolandas.Naujikas@mif.vu.lt>
To: Jeremy Chadwick <freebsd@jdc.parodius.com>
Cc: freebsd-stable@freebsd.org, jfvogel@gmail.com
Subject: Re: problems with network on em
Message-ID: <BD7BD29F-699E-4AE4-8E7E-6B15AC58D488@mif.vu.lt>
In-Reply-To: <20101120170529.GA95574@icarus.home.lan>
References: <FAAB9340-52AB-4874-97D7-152B7FA0B466@gmail.com> <20101120155433.GA94454@icarus.home.lan> <ED928FE6-E085-4ECA-9BFE-4015C57DE749@gmail.com> <1C336756-1447-4346-BFC6-0CE0856F5FA9@mif.vu.lt> <20101120170529.GA95574@icarus.home.lan>

I'm trying to test with the newest version of /sys/dev/e1000 from FreeBSD
8-STABLE. For that I'm using the loadable module option, because it is
easier to build with minimal changes to the kernel source. Only
/sys/dev/e1000 and /sys/modules/em need to be updated. Without changes to
/sys/modules/em/Makefile it compiles but has a missing symbol; a statically
compiled kernel has the same problem.

Now I'm testing, and it looks promising (except that I see slightly higher
CPU load from the netisr kernel thread, but it's acceptable).

Regards,
Rolandas Naujikas

On 2010.11.20, at 19:05, Jeremy Chadwick wrote:

> On Sat, Nov 20, 2010 at 06:38:19PM +0200, Naujikas Rolandas wrote:
>> I just got another lockup.
>> It looks like in the time of lockup the number of Ierrs is increasing:
>> Name    Mtu Network       Address              Ipkts Ierrs Idrop    Opkts Oerrs  Coll
>> em2    1500 <Link#3>      00:14:4f:XX:XX:XX 13060395 18438     0  6579984     1     0
>>
>> After "ifconfig em2 down;ifconfig em2 up" Ierrs stays at 0 rate for long time.
>> Without DEVICE_POLLING it was similar situation.
>>
>> Regards, Rolandas Naujikas
>>
>> On 2010.11.20, at 18:24, rolnas@gmail.com wrote:
>>
>>> On 2010.11.20, at 17:54, Jeremy Chadwick wrote:
>>>
>>>> On Sat, Nov 20, 2010 at 05:09:28PM +0200, rolnas@gmail.com wrote:
>>>>> I'm experiencing network interface stalls on em in FreeBSD 8.1-RELEASE (-p1).
>>>>> It looks like the problem could be solved in 8-STABLE, but should I upgrade to it ?
>>>>> Is it OK to try to get only em driver code and recompile as module and try to run it ?
>>>>>
>>>>> sysctl dev.em.2.stats=1:
>>>>> ...
>>>>> em2: Missed Packets = 101334
>>>>> em2: Receive No Buffers = 488
>>>>> ...
>>>>> em2: RX overruns = 1356
>>>>> em2: watchdog timeouts = 1
>>>>> ...
>>>>>
>>>>> Only "ifconfig em2 down;ifconfig em2 up" helps for some time.
>>>>> The same happens on em0 interface only, but not in the same time.
>>>>> It is production (NAT) router with pf+pfsync+carp and failover over another router.
>>>>> They are old "SunFire X4100" boxes (4GB RAM, 2*2 AMD Opteron 2.2GHz).
>>>>
>>>> You're going to need to provide output from the following, run as root.
>>>> For the pciconf command, please only include the entry that's relevant
>>>> to the device in question (em2). You can also XXX-out the MAC address
>>>> and/or IP addresses if you're worried about security.
>>>>
>>>> $ pciconf -lvc
>>>
>>> em2@pci0:1:2:0: class=0x020000 card=0x10118086 chip=0x10108086 rev=0x03 hdr=0x00
>>>     vendor     = 'Intel Corporation'
>>>     device     = 'Dual Port Gigabit Ethernet Controller (Copper) (82546EB)'
>>>     class      = network
>>>     subclass   = ethernet
>>>     cap 01[dc] = powerspec 2 supports D0 D3 current D0
>>>     cap 07[e4] = PCI-X 64-bit supports 133MHz, 2048 burst read, 1 split transaction
>>>     cap 05[f0] = MSI supports 1 message, 64 bit
>>>
>>>> $ dmesg | grep em2
>>>
>>> em2: <Intel(R) PRO/1000 Legacy Network Connection 1.0.1> port 0x9400-0x943f mem 0xfbfa0000-0xfbfbffff irq 24 at device 2.0 on pci1
>>> em2: [FILTER]
>>> em2: Ethernet address: 00:14:4f:XX:XX:XX
>>>
>>>> $ sysctl dev.em.2
>>>
>>> dev.em.2.%desc: Intel(R) PRO/1000 Legacy Network Connection 1.0.1
>>> dev.em.2.%driver: em
>>> dev.em.2.%location: slot=2 function=0
>>> dev.em.2.%pnpinfo: vendor=0x8086 device=0x1010 subvendor=0x8086 subdevice=0x1011 class=0x020000
>>> dev.em.2.%parent: pci1
>>> dev.em.2.debug: -1
>>> dev.em.2.stats: -1
>>> dev.em.2.rx_int_delay: 0
>>> dev.em.2.tx_int_delay: 66
>>> dev.em.2.rx_abs_int_delay: 66
>>> dev.em.2.tx_abs_int_delay: 66
>>> dev.em.2.rx_processing_limit: 100
>>>
>>>> $ uname -a
>>>
>>> FreeBSD sunfire1.mif 8.1-RELEASE-p1 FreeBSD 8.1-RELEASE-p1 #2: Thu Nov 18 10:39:07 EET 2010 root@sunfire1.mif:/home/local/obj/usr/src/sys/SUNFIRE amd64
>>>
>>> Recompiled with DEVICE_POLLING and HZ=2000, carp and many not used devices removed.
>>>
>>>> $ netstat -ind -I em2
>>>
>>> Name    Mtu Network       Address               Ipkts  Ierrs Idrop     Opkts Oerrs  Coll Drop
>>> em2    1500 <Link#3>      00:14:4f:XX:XX:XX  66430440 101334     0  59339619     1     0    0
>>> em2    1500 192.168.0.0/1 192.168.XX.XXX       633845      -     -   3815946     -     -    -
>>> ...
>>> em0    1500 <Link#1>      00:14:4f:XX:XX:XX 167143400 152726     0 143900328     0     0    0
>>>
>>> Regards, Rolandas Naujikas
>>>
>>>> Thanks.
>
> Oops, I forgot requesting output from one other command:
>
> $ vmstat -i
>
> Adding Jack Vogel to the thread, who might have ideas/comments. Jack,
> here's the thread:
>
> http://lists.freebsd.org/pipermail/freebsd-stable/2010-November/060183.html
>
> As for my comments:
>
> Unidirectional errors (input or output) often indicates a duplex
> mismatch or some sort of weird "quirk" between one link partner and the
> other. I *have* seen cases where both sides are auto-neg and one side
> acts like it has the wrong duplex selection despite ifconfig reporting
> full-duplex and the switch reporting full. Forcing speed and duplex on
> both ends (requires a managed switch; please don't try this with a
> generic consumer switch) resolved the problem.
>
> It could be that there's a driver bug causing this to happen -- down/up
> seems to indicate that could be the case -- but every situation needs to
> be addressed individually.
>
> --
> | Jeremy Chadwick                                   jdc@parodius.com |
> | Parodius Networking                       http://www.parodius.com/ |
> | UNIX Systems Administrator                  Mountain View, CA, USA |
> | Making life hard for others since 1977.              PGP: 4BD6C0CB |
>
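
As a rough way to put the Ierrs counters quoted above in proportion, the sketch below prints input errors as a percentage of input packets for each link-level row of `netstat -i` output. The function name `ierr_pct` is hypothetical, and the field positions (Ipkts in column 5, Ierrs in column 6) are assumed from the output pasted in this thread; other netstat versions, or rows without a MAC address, may be laid out differently.

```sh
#!/bin/sh
# Hypothetical helper, not part of FreeBSD: reads `netstat -i` style rows on
# stdin and prints "<ifname> <Ierrs as % of Ipkts>" for link-level rows only.
# Assumes the column layout shown in this thread: $5 = Ipkts, $6 = Ierrs.
ierr_pct() {
    awk '$3 ~ /^<Link#/ && $5 > 0 { printf "%s %.3f\n", $1, ($6 / $5) * 100 }'
}

# Example with the em2 counters posted earlier in the thread.
# prints: em2 0.153
echo 'em2 1500 <Link#3> 00:14:4f:XX:XX:XX 66430440 101334 0 59339619 1 0 0' | ierr_pct
```

On a live system this could be fed with `netstat -ind -I em2 | ierr_pct`; comparing the figure before and after a stall would show whether the error share is growing.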
