Date: Sat, 20 Nov 2010 23:53:15 +0200
From: Naujikas Rolandas <Rolandas.Naujikas@mif.vu.lt>
To: Jack Vogel <jfvogel@gmail.com>
Cc: freebsd-stable@freebsd.org, Jeremy Chadwick <freebsd@jdc.parodius.com>
Subject: Re: problems with network on em
Message-ID: <7A80BA0C-596A-417C-B9E0-B2153276DA10@mif.vu.lt>
In-Reply-To: <AANLkTimFQuEdUurAnOJoPNn6WJb7QotTgRK58H64_uFd@mail.gmail.com>
References: <FAAB9340-52AB-4874-97D7-152B7FA0B466@gmail.com>
 <20101120155433.GA94454@icarus.home.lan>
 <ED928FE6-E085-4ECA-9BFE-4015C57DE749@gmail.com>
 <1C336756-1447-4346-BFC6-0CE0856F5FA9@mif.vu.lt>
 <20101120170529.GA95574@icarus.home.lan>
 <BD7BD29F-699E-4AE4-8E7E-6B15AC58D488@mif.vu.lt>
 <AANLkTimFQuEdUurAnOJoPNn6WJb7QotTgRK58H64_uFd@mail.gmail.com>

I don't know about the version, but I'm using the RELENG_8 branch only.
It is FreeBSD 8-STABLE also.

Regards, Rolandas Naujikas

P.S. I just got ~1Gbit/s (125MB/s, 115Kpps) of forwarded traffic in testing
(24 nodes were downloading a file with wget from a server on the other side
of the router), but finally there was some deadlock. I'm recovering the data
on it.

On 2010.11.20, at 22:37, Jack Vogel wrote:

> Did you mean the 7.1.7 version from HEAD ?
>
> Jack
>
>
> On Sat, Nov 20, 2010 at 11:18 AM, Naujikas Rolandas <
> Rolandas.Naujikas@mif.vu.lt> wrote:
>
>> I'm trying to test with newest version of /sys/dev/e1000 from FreeBSD
>> 8-STABLE.
>> For that I'm using loadable module option, because it is easier to build
>> with minimal changes in kernel source.
>> Only /sys/dev/e1000 and /sys/modules/em need to be updated.
>> Without changes in /sys/modules/em/Makefile it compiles, but have missing
>> symbol or if you compile static kernel - the same problem.
>> Now I'm testing and it looks promising (except I see a little bigger kernel
>> thread netisr cpu load, but it's acceptable).
>>
>> Regards, Rolandas Naujikas
>>
>> On 2010.11.20, at 19:05, Jeremy Chadwick wrote:
>>
>>> On Sat, Nov 20, 2010 at 06:38:19PM +0200, Naujikas Rolandas wrote:
>>>> I just got another lockup.
>>>> It looks like in the time of lockup the number of Ierrs is increasing:
>>>> Name  Mtu Network       Address               Ipkts  Ierrs Idrop     Opkts Oerrs Coll
>>>> em2  1500 <Link#3>      00:14:4f:XX:XX:XX  13060395  18438     0   6579984     1    0
>>>>
>>>> After "ifconfig em2 down;ifconfig em2 up" Ierrs stays at 0 rate for long time.
>>>> Without DEVICE_POLLING it was similar situation.
>>>>
>>>> Regards, Rolandas Naujikas
>>>>
>>>> On 2010.11.20, at 18:24, rolnas@gmail.com wrote:
>>>>
>>>>> On 2010.11.20, at 17:54, Jeremy Chadwick wrote:
>>>>>
>>>>>> On Sat, Nov 20, 2010 at 05:09:28PM +0200, rolnas@gmail.com wrote:
>>>>>>> I'm experiencing network interface stalls on em in FreeBSD 8.1-RELEASE (-p1).
>>>>>>> It looks like the problem could be solved in 8-STABLE, but should I upgrade to it ?
>>>>>>> Is it OK to try to get only em driver code and recompile as module and try to run it ?
>>>>>>>
>>>>>>> sysctl dev.em.2.stats=1:
>>>>>>> ...
>>>>>>> em2: Missed Packets = 101334
>>>>>>> em2: Receive No Buffers = 488
>>>>>>> ...
>>>>>>> em2: RX overruns = 1356
>>>>>>> em2: watchdog timeouts = 1
>>>>>>> ...
>>>>>>>
>>>>>>> Only "ifconfig em2 down;ifconfig em2 up" helps for some time.
>>>>>>> The same happens on em0 interface only, but not in the same time.
>>>>>>> It is production (NAT) router with pf+pfsync+carp and failover over another router.
>>>>>>> They are old "SunFire X4100" boxes (4GB RAM, 2*2 AMD Opteron 2.2GHz).
>>>>>>
>>>>>> You're going to need to provide output from the following, run as root.
>>>>>> For the pciconf command, please only include the entry that's relevant
>>>>>> to the device in question (em2). You can also XXX-out the MAC address
>>>>>> and/or IP addresses if you're worried about security.
>>>>>>
>>>>>> $ pciconf -lvc
>>>>>
>>>>> em2@pci0:1:2:0: class=0x020000 card=0x10118086 chip=0x10108086 rev=0x03 hdr=0x00
>>>>>     vendor     = 'Intel Corporation'
>>>>>     device     = 'Dual Port Gigabit Ethernet Controller (Copper) (82546EB)'
>>>>>     class      = network
>>>>>     subclass   = ethernet
>>>>>     cap 01[dc] = powerspec 2 supports D0 D3 current D0
>>>>>     cap 07[e4] = PCI-X 64-bit supports 133MHz, 2048 burst read, 1 split transaction
>>>>>     cap 05[f0] = MSI supports 1 message, 64 bit
>>>>>
>>>>>> $ dmesg | grep em2
>>>>>
>>>>> em2: <Intel(R) PRO/1000 Legacy Network Connection 1.0.1> port 0x9400-0x943f mem 0xfbfa0000-0xfbfbffff irq 24 at device 2.0 on pci1
>>>>> em2: [FILTER]
>>>>> em2: Ethernet address: 00:14:4f:XX:XX:XX
>>>>>
>>>>>> $ sysctl dev.em.2
>>>>>
>>>>> dev.em.2.%desc: Intel(R) PRO/1000 Legacy Network Connection 1.0.1
>>>>> dev.em.2.%driver: em
>>>>> dev.em.2.%location: slot=2 function=0
>>>>> dev.em.2.%pnpinfo: vendor=0x8086 device=0x1010 subvendor=0x8086 subdevice=0x1011 class=0x020000
>>>>> dev.em.2.%parent: pci1
>>>>> dev.em.2.debug: -1
>>>>> dev.em.2.stats: -1
>>>>> dev.em.2.rx_int_delay: 0
>>>>> dev.em.2.tx_int_delay: 66
>>>>> dev.em.2.rx_abs_int_delay: 66
>>>>> dev.em.2.tx_abs_int_delay: 66
>>>>> dev.em.2.rx_processing_limit: 100
>>>>>
>>>>>> $ uname -a
>>>>>
>>>>> FreeBSD sunfire1.mif 8.1-RELEASE-p1 FreeBSD 8.1-RELEASE-p1 #2: Thu Nov 18 10:39:07 EET 2010 root@sunfire1.mif:/home/local/obj/usr/src/sys/SUNFIRE amd64
>>>>>
>>>>> Recompiled with DEVICE_POLLING and HZ=2000, carp and many not used devices removed.
>>>>>
>>>>>> $ netstat -ind -I em2
>>>>>
>>>>> Name  Mtu Network       Address               Ipkts  Ierrs Idrop     Opkts Oerrs Coll Drop
>>>>> em2  1500 <Link#3>      00:14:4f:XX:XX:XX  66430440 101334     0  59339619     1    0    0
>>>>> em2  1500 192.168.0.0/1 192.168.XX.XXX       633845      -     -   3815946     -    -    -
>>>>> ...
>>>>> em0  1500 <Link#1>      00:14:4f:XX:XX:XX 167143400 152726     0 143900328     0    0    0
>>>>>
>>>>> Regards, Rolandas Naujikas
>>>>>
>>>>>> Thanks.
>>>
>>> Oops, I forgot requesting output from one other command:
>>>
>>> $ vmstat -i
>>>
>>> Adding Jack Vogel to the thread, who might have ideas/comments. Jack,
>>> here's the thread:
>>>
>>>
>>> http://lists.freebsd.org/pipermail/freebsd-stable/2010-November/060183.html
>>>
>>> As for my comments:
>>>
>>> Unidirectional errors (input or output) often indicates a duplex
>>> mismatch or some sort of weird "quirk" between one link partner and the
>>> other. I *have* seen cases where both sides are auto-neg and one side
>>> acts like it has the wrong duplex selection despite ifconfig reporting
>>> full-duplex and the switch reporting full. Forcing speed and duplex on
>>> both ends (requires a managed switch; please don't try this with a
>>> generic consumer switch) resolved the problem.
>>>
>>> It could be that there's a driver bug causing this to happen -- down/up
>>> seems to indicate that could be the case -- but every situation needs to
>>> be addressed individually.
>>>
>>> --
>>> | Jeremy Chadwick                                   jdc@parodius.com |
>>> | Parodius Networking                       http://www.parodius.com/ |
>>> | UNIX Systems Administrator                  Mountain View, CA, USA |
>>> | Making life hard for others since 1977.              PGP: 4BD6C0CB |
>>>
>>
>>
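
The netstat figures quoted in this thread are running totals, so whether Ierrs is actually climbing during a stall only shows up when two samples are compared. Below is a minimal sketch of that comparison in Python, assuming the `netstat -ind` column layout seen above (Name, Mtu, Network, Address, Ipkts, Ierrs, ...). The helper names `link_counters` and `errors_per_second` are hypothetical and not part of any FreeBSD tool; the sample rows are copied from the quoted output.

```python
# Sample rows copied from the `netstat -ind -I em2` output quoted in this thread.
SAMPLE = """\
Name Mtu Network Address Ipkts Ierrs Idrop Opkts Oerrs Coll Drop
em2 1500 <Link#3> 00:14:4f:XX:XX:XX 66430440 101334 0 59339619 1 0 0
em2 1500 192.168.0.0/1 192.168.XX.XXX 633845 - - 3815946 - - -
em0 1500 <Link#1> 00:14:4f:XX:XX:XX 167143400 152726 0 143900328 0 0 0
"""


def link_counters(netstat_output, ifname):
    """Return (Ipkts, Ierrs) from the <Link#N> row for ifname.

    Assumption: the row carries an Address column, as in the output above.
    An interface without a link-level address would shift the fields.
    """
    for line in netstat_output.splitlines():
        fields = line.split()
        # Only the link-level row has real error counters; the per-address
        # row shows "-" in the Ierrs column and is skipped here.
        if (len(fields) >= 6 and fields[0] == ifname
                and fields[2].startswith("<Link#")):
            return int(fields[4]), int(fields[5])
    raise ValueError("no link-level row found for " + ifname)


def errors_per_second(ierrs_before, ierrs_after, interval_seconds):
    """Input-error rate between two samples taken interval_seconds apart."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return (ierrs_after - ierrs_before) / interval_seconds


if __name__ == "__main__":
    ipkts, ierrs = link_counters(SAMPLE, "em2")
    print("em2: Ipkts=%d Ierrs=%d" % (ipkts, ierrs))
```

Feeding it two captures taken a known number of seconds apart gives an error rate. A rate that stays above zero until the interface is cycled with "ifconfig em2 down;ifconfig em2 up" would match the symptom described above, though it would not by itself tell a driver problem from a duplex mismatch.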