From: bugzilla-noreply@freebsd.org
To: freebsd-bugs@FreeBSD.org
Subject: [Bug 209351] VLAN TX errors, possible performance regression after 10.1-STABLE (r281235)
Date: Sat, 07 May 2016 00:08:50 +0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=209351

            Bug ID: 209351
           Summary: VLAN TX errors, possible performance regression after
                    10.1-STABLE (r281235)
           Product: Base System
           Version: 11.0-CURRENT
          Hardware: amd64
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: kern
          Assignee: freebsd-bugs@FreeBSD.org
          Reporter: zclaudio@bsd.com.br
                CC: freebsd-amd64@FreeBSD.org

I have a BGP router running FreeBSD 10.1-STABLE (r281235), and it has worked
fine for several years now. After upgrading to any newer version I start
getting vlan TX errors on the exact same hardware, just by booting an SSD
with a newer system.

Details:

At peak we route around 4 Gbit/s and 1.8 Mpps in total, while each port
interface peaks at 300 Kpps.

Our quality metrics are measured with:

ping -s 1472 -i 0.1

as well as bidirectional iperf.

Systems working w/o problem:

- 10.1-STABLE / r281235

Systems tested with drops:

- 10.2-STABLE / r292035M
- 10.3-STABLE / r298705
- 11.0-CURRENT / r295683 (downloaded snapshot from ftp.freebsd.org)
- 11.0-CURRENT Melifaro Routing Branch / r297731M

While testing, when errors happen I can see output errs on the vlan port in
the output of "netstat -w1 -I vlan6":

            input          vlan6           output
   packets  errs idrops      bytes    packets  errs      bytes colls
         1     0      0         66      30557     2   33310968     0
         1     0      0        105      31458     3   33912219     0
         2     0      0       2954      32001     8   34983986     0
         1     0      0       1512      33150     6   35942558     0
         1     0      0       1512      33654     4   37311862     0
         1     0      0       1512      34825     3   38213793     0
         3     0      0       1683      35376     4   39488912     0
         5     0      0       7280      32423     3   35551869     0

Problems may happen under high load (~200 Kpps) or low load (~30 Kpps) on a
vlan port.

The observed frame loss never happens on untagged ports, only on vlan
interfaces.

The loss is observed with packets of 900 bytes and above, and the loss rate
is noticeably higher with packets close to 1400 bytes (1472 is my reference
size).
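
For anyone comparing builds, the per-second netstat samples above can be
reduced to an aggregate output error ratio. What follows is only a minimal
Python sketch and not part of the original report: the names parse_rows and
summarize are hypothetical, and it assumes the eight-column layout of the
sample quoted here (input packets, errs, idrops, bytes; output packets,
errs, bytes, colls).

```python
# Rows copied from the "netstat -w1 -I vlan6" sample in this report.
SAMPLE = """\
1 0 0 66 30557 2 33310968 0
1 0 0 105 31458 3 33912219 0
2 0 0 2954 32001 8 34983986 0
1 0 0 1512 33150 6 35942558 0
1 0 0 1512 33654 4 37311862 0
1 0 0 1512 34825 3 38213793 0
3 0 0 1683 35376 4 39488912 0
5 0 0 7280 32423 3 35551869 0
"""


def parse_rows(text):
    """Return one dict per data row; header and malformed lines are skipped."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Assumption: a data row is exactly eight unsigned integers.
        if len(fields) != 8 or not all(f.isdigit() for f in fields):
            continue
        values = [int(f) for f in fields]
        rows.append({
            "out_pkts": values[4],
            "out_errs": values[5],
            "out_bytes": values[6],
        })
    return rows


def summarize(rows):
    """Aggregate output counters over all one-second samples."""
    out_pkts = sum(r["out_pkts"] for r in rows)
    out_errs = sum(r["out_errs"] for r in rows)
    out_bytes = sum(r["out_bytes"] for r in rows)
    return {
        "out_pkts": out_pkts,
        "out_errs": out_errs,
        # Errors per transmitted packet, guarded against an empty sample.
        "err_ratio": out_errs / out_pkts if out_pkts else 0.0,
        # Mean transmitted packet size in bytes over the sample window.
        "avg_pkt_bytes": out_bytes / out_pkts if out_pkts else 0.0,
    }


if __name__ == "__main__":
    stats = summarize(parse_rows(SAMPLE))
    print("output packets: %d" % stats["out_pkts"])
    print("output errors:  %d" % stats["out_errs"])
    print("error ratio:    %.6f" % stats["err_ratio"])
    print("avg pkt bytes:  %.1f" % stats["avg_pkt_bytes"])
```

Feeding it samples captured the same way on r281235 and on a newer build
would give numbers that can be compared directly.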

On every listed system other than r281235 the loss rate is 9-19% with
ping(1) and iperf, while on r281235 it is 0% (no loss, or negligible loss).

Hardware tried:

- Intel 82599EB 10-Gigabit SFI/SFP+ Network Connection (2x2 on x8 PCIe bus,
  total 4x10G)
- Chelsio T520, 2x2 on x8 PCIe bus, total 4x10G

Both show exactly the same behavior, so it's not Intel related/exclusive.

Same hardware:

I always test on the very same hardware. I have two SSD drives in this
router: one with the 10.1 system, which just runs fine, and the other to
test the various versions of FreeBSD.

Sysctl/loader:

Only minor loader and sysctl confs are tweaked:

kern.hz=2000
net.inet.ip.redirect=1              # do not send IP redirects
net.inet.ip.accept_sourceroute=0    # drop source routed packets since they ca
net.inet.ip.sourceroute=0           # if source routed packets are accepted th
net.inet.tcp.drop_synfin=1          # SYN/FIN packets get dropped on initial c
net.inet.udp.blackhole=1            # drop udp packets destined for closed soc
net.inet.tcp.blackhole=2            # drop tcp packets destined for closed por
security.bsd.see_other_uids=0

Netstat output when errors happen:

            input          vlan6           output
   packets  errs idrops      bytes    packets  errs      bytes colls
         1     0      0         66      30557     2   33310968     0
         1     0      0        105      31458     3   33912219     0
         2     0      0       2954      32001     8   34983986     0
         1     0      0       1512      33150     6   35942558     0
         1     0      0       1512      33654     4   37311862     0
         1     0      0       1512      34825     3   38213793     0
         3     0      0       1683      35376     4   39488912     0
         5     0      0       7280      32423     3   35551869     0

No relevant errors happen on the physical ix(4) or cxl(4) ports.

It's very easy to reproduce in my environment: I just need to boot a newer
system, and very soon some vlans start to drop packets which are not dropped
on 10.1-STABLE. I can be contacted if a developer wants to ssh in.

I can also update this PR with more information if needed.

-- 
You are receiving this mail because:
You are the assignee for the bug.