Date:      Wed, 27 Apr 2016 15:41:40 -0300
From:      Zé Claudio Pastore <zclaudio@bsd.com.br>
To:        freebsd-net@freebsd.org
Subject:   Regression? VLAN packet drop after upgrading from r281235
Message-ID:  <CAEGk6G4rq=yE14rDcxhJZZ0drstr=fse%2B9aemVYqdt68Gg=bpQ@mail.gmail.com>

Hello,

On a BGP border router I help manage, we run FreeBSD 10.1-STABLE,
revision r281235, and it has worked fine for several years now.

We route around 4Gbit/s and 1.8Mpps at peak, while each individual port
peaks at 300Kpps.

Our quality metrics are measured with:

ping -s 1472 -i 0.1 <our-other-ibgp-router>

as well as bidirectional iperf.

These metrics are similar to how Speedy Test and SIMET tests are run, and
those tests are what our customers use as a reference.

Systems working w/o problem:
- 10.1-STABLE / r281235

Systems tested with drops:
- 10.2-STABLE / r292035M
- 10.3-STABLE / r298705
- 11.0-CURRENT / r295683 (downloaded snapshot from ftp.freebsd.org)
- 11.0-CURRENT Melifaro Routing Branch / r297731M

While testing, whenever loss occurs I can see output errs on the vlan
interface in the output of "netstat -w1 -I vlan6":

           input          vlan6           output
   packets  errs idrops      bytes    packets  errs      bytes colls
         1     0     0         66      30557     2   33310968     0
         1     0     0        105      31458     3   33912219     0
         2     0     0       2954      32001     8   34983986     0
         1     0     0       1512      33150     6   35942558     0
         1     0     0       1512      33654     4   37311862     0
         1     0     0       1512      34825     3   38213793     0
         3     0     0       1683      35376     4   39488912     0
         5     0     0       7280      32423     3   35551869     0
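
A minimal sketch of how the output error counts in samples like these
could be totalled, assuming the eight-column layout shown above (input
packets, errs, idrops, bytes, then output packets, errs, bytes, colls).
The function name summarize_output_errors is hypothetical, and the sample
text is just the first two rows of the table.

```python
def summarize_output_errors(netstat_text):
    """Sum output packets and output errs over `netstat -w1 -I <if>` rows."""
    total_packets = 0
    total_errs = 0
    for line in netstat_text.splitlines():
        fields = line.split()
        # Sample rows have exactly eight numeric columns; header rows do not.
        if len(fields) != 8 or not all(f.isdigit() for f in fields):
            continue
        # Assumed column order:
        # in_pkts, in_errs, idrops, in_bytes, out_pkts, out_errs, out_bytes, colls
        total_packets += int(fields[4])
        total_errs += int(fields[5])
    return total_packets, total_errs


# First two sample rows from the table above, with both header lines.
sample = """\
           input          vlan6           output
   packets  errs idrops      bytes    packets  errs      bytes colls
         1     0     0         66      30557     2   33310968     0
         1     0     0        105      31458     3   33912219     0
"""

packets, errs = summarize_output_errors(sample)
print(packets, errs, errs / packets)
```

Feeding it a longer capture gives the ratio of output errs to output
packets over the whole run, which can then be compared with the loss seen
by ping(1) and iperf.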

The problem occurs under both high load (~200Kpps) and low load (~30Kpps)
on a vlan port. The observed frame loss never happens on untagged ports,
only on vlan interfaces. It affects packets of 900 bytes and above, and
the loss rate is noticeably higher with packets close to 1400 bytes (1472
is my reference size).

The loss rate on every listed system other than r281235 is 9-19% with
ping(1) and iperf, while it is 0% on r281235.
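
A minimal sketch of recomputing the loss percentage from a ping(1)
summary, assuming the summary line has the form "N packets transmitted,
M packets received, ...". The function name ping_loss_percent is
hypothetical, and the example line is made up for illustration; it is not
measured data.

```python
import re

# Matches the transmitted/received counters in a ping summary line.
SUMMARY_RE = re.compile(r"(\d+) packets transmitted, (\d+) (?:packets )?received")


def ping_loss_percent(ping_output):
    """Return packet loss in percent, computed from the summary counters."""
    match = SUMMARY_RE.search(ping_output)
    if match is None:
        raise ValueError("no ping summary line found")
    sent = int(match.group(1))
    received = int(match.group(2))
    if sent == 0:
        raise ValueError("no packets were transmitted")
    return 100.0 * (sent - received) / sent


# Hypothetical summary line, for illustration only.
example = "1000 packets transmitted, 870 packets received, 13.0% packet loss"
print(ping_loss_percent(example))
```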

At first I believed it to be an Intel driver error on systems newer than
10.1. My reference cards are dual port 82599EB 10-Gigabit SFI/SFP+ Network
Connection (2x2 on x8 PCIe bus, total 4x10G). But yesterday I replaced the
Intel cards with Chelsio T5 and the problem is still exactly the same, so
it is not related to the card vendor.

I always test on the very same hardware. I have two SSD drives in this
router: one with the 10.1 system, which runs fine, and the other to test
the various versions of FreeBSD.

Only minor loader and sysctl confs are tweaked:

kern.hz=2000
net.inet.ip.redirect=1                # do not send IP redirects
net.inet.ip.accept_sourceroute=0      # drop source routed packets since they ca
net.inet.ip.sourceroute=0             # if source routed packets are accepted th
net.inet.tcp.drop_synfin=1            # SYN/FIN packets get dropped on initial c
net.inet.udp.blackhole=1              # drop udp packets destined for closed soc
net.inet.tcp.blackhole=2              # drop tcp packets destined for closed por
security.bsd.see_other_uids=0

Can anyone suggest a fix or tuning for this behavior? Was there any
relevant change to the vlan code between the revision I run on 10.1
(r281235) and the later ones that would explain such a big difference?


