Date: Sat, 07 May 2016 00:08:50 +0000
From: bugzilla-noreply@freebsd.org
To: freebsd-amd64@FreeBSD.org
Subject: [Bug 209351] VLAN TX errors, possible performance regression after 10.1-STABLE (r281235)
Message-ID: <bug-209351-6@https.bugs.freebsd.org/bugzilla/>
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=209351

            Bug ID: 209351
           Summary: VLAN TX errors, possible performance regression after
                    10.1-STABLE (r281235)
           Product: Base System
           Version: 11.0-CURRENT
          Hardware: amd64
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: kern
          Assignee: freebsd-bugs@FreeBSD.org
          Reporter: zclaudio@bsd.com.br
                CC: freebsd-amd64@FreeBSD.org

On a BGP router running FreeBSD 10.1-STABLE (r281235), everything has worked fine for several years now. After upgrading to any newer version I start seeing vlan TX errors on the exact same hardware, just by booting from an SSD with a newer system.

Details:

We route around 4 Gbit/s and 1.8 Mpps at peak, while individual port interfaces peak at 300 Kpps.

Our quality metrics are measured with:

    ping -s 1472 -i 0.1 <our-other-ibgp-router>

as well as bidirectional iperf.

Systems working without problems:
- 10.1-STABLE / r281235

Systems tested with drops:
- 10.2-STABLE / r292035M
- 10.3-STABLE / r298705
- 11.0-CURRENT / r295683 (downloaded snapshot from ftp.freebsd.org)
- 11.0-CURRENT Melifaro Routing Branch / r297731M

While testing, when errors happen I can see output errs on the vlan port in the output of "netstat -w1 -I vlan6":

            input          vlan6           output
   packets  errs idrops      bytes    packets  errs      bytes colls
         1     0      0         66      30557     2   33310968     0
         1     0      0        105      31458     3   33912219     0
         2     0      0       2954      32001     8   34983986     0
         1     0      0       1512      33150     6   35942558     0
         1     0      0       1512      33654     4   37311862     0
         1     0      0       1512      34825     3   38213793     0
         3     0      0       1683      35376     4   39488912     0
         5     0      0       7280      32423     3   35551869     0

Problems may happen under high load (~200Kpps) or low load (~30Kpps) on a vlan port.

The observed frame loss never happens on untagged ports; it is vlan related only.

The loss is observed with packets of 900 bytes and above, and the loss rate is noticeably higher with packets close to 1400 bytes (1472 is my reference size).
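The per-second output error counts in the "netstat -w1 -I vlan6" sample can be totalled with a short script. The following is only a minimal sketch, not part of the original report: it assumes the eight-column layout shown in the sample, and the function names (parse_netstat_samples, output_error_ratio) are made up for illustration.

```python
# Sketch: total the output errors in `netstat -w1 -I <ifname>` samples.
# Assumed column order (taken from the sample in the report):
#   in-packets in-errs idrops in-bytes out-packets out-errs out-bytes colls

SAMPLE = """\
            input          vlan6           output
   packets  errs idrops      bytes    packets  errs      bytes colls
         1     0      0         66      30557     2   33310968     0
         1     0      0        105      31458     3   33912219     0
         2     0      0       2954      32001     8   34983986     0
         1     0      0       1512      33150     6   35942558     0
         1     0      0       1512      33654     4   37311862     0
         1     0      0       1512      34825     3   38213793     0
         3     0      0       1683      35376     4   39488912     0
         5     0      0       7280      32423     3   35551869     0
"""


def parse_netstat_samples(text):
    """Return one (out_packets, out_errs) tuple per numeric sample line."""
    samples = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the two header lines: only rows of exactly 8 integers count.
        if len(fields) != 8 or not all(f.isdigit() for f in fields):
            continue
        samples.append((int(fields[4]), int(fields[5])))
    return samples


def output_error_ratio(samples):
    """Return (total_packets, total_errs, errs / packets) over all samples."""
    total_packets = sum(p for p, _ in samples)
    total_errs = sum(e for _, e in samples)
    ratio = total_errs / total_packets if total_packets else 0.0
    return total_packets, total_errs, ratio


if __name__ == "__main__":
    packets, errs, ratio = output_error_ratio(parse_netstat_samples(SAMPLE))
    print(f"output packets={packets} output errs={errs} ratio={ratio:.6f}")
```

Fed the eight samples above, it counts 33 output errors against 263444 output packets. In practice the text would come from a captured netstat run instead of the embedded SAMPLE string.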
Loss rate on all listed systems other than r281235 is 9-19% with ping(1) and iperf, while it is 0% (no loss, or negligible loss) on r281235.

Hardware tried:
- Intel 82599EB 10-Gigabit SFI/SFP+ Network Connection (2x2 on x8 PCIe bus, total 4x10G)
- Chelsio T520, 2x2 on x8 PCIe bus, total 4x10G

The behavior is exactly the same on both, so it is not Intel related/exclusive.

Same hardware:

I always test on the very same hardware. I have two SSD drives in this router: one with the 10.1 system, which runs fine, and the other to test the various versions of FreeBSD.

Sysctl/loader:

Only minor loader and sysctl settings are tweaked:

kern.hz=2000
net.inet.ip.redirect=1             # do not send IP redirects
net.inet.ip.accept_sourceroute=0   # drop source routed packets since they ca
net.inet.ip.sourceroute=0          # if source routed packets are accepted th
net.inet.tcp.drop_synfin=1         # SYN/FIN packets get dropped on initial c
net.inet.udp.blackhole=1           # drop udp packets destined for closed soc
net.inet.tcp.blackhole=2           # drop tcp packets destined for closed por
security.bsd.see_other_uids=0

Netstat output when errors happen: identical to the "netstat -w1 -I vlan6" sample shown earlier in this report.

No relevant errors happen on the physical ix(4) or cxl(4) ports.

It is very easy to reproduce in my environment: I just need to boot a newer system, and very soon some vlans start to drop packets that are not dropped on 10.1-STABLE. I can be contacted if a developer wants to ssh in. I can also update this PR with more information if needed.

-- 
You are receiving this mail because:
You are on the CC list for the bug.