Date:      Mon, 24 Feb 2020 23:40:52 -0500
From:      Patrick Kelsey <pkelsey@freebsd.org>
To:        Josh Paetzel <jpaetzel@freebsd.org>
Cc:        freebsd-net <freebsd-net@freebsd.org>
Subject:   Re: terrible if_vmx / vmxnet3 rx performance with lro (post iflib)
Message-ID:  <CAD44qMWuV=ZNYoKNR5gy9UR07=gu-Hkvg38tG7pjs9xZFBNsxA@mail.gmail.com>
In-Reply-To: <ec5d9134-7642-413a-bf20-f15012bc332c@www.fastmail.com>
References:  <40c4a4df-3df6-d95d-53c2-eef905ff45b1@FreeBSD.org> <5e5d423b-0711-7454-626a-cc9cb4b004cd@FreeBSD.org> <ec5d9134-7642-413a-bf20-f15012bc332c@www.fastmail.com>

On Thu, Feb 20, 2020 at 4:58 PM Josh Paetzel <jpaetzel@freebsd.org> wrote:

>
>
> On Wed, Feb 19, 2020, at 7:17 AM, Andriy Gapon wrote:
> > On 18/02/2020 16:09, Andriy Gapon wrote:
> > > My general experience with post-iflib vmxnet3 is that vmxnet3 has some
> > > peculiarities that result in a certain "impedance mismatch" with iflib.
> > > Although we now have a bit less code and it is a bit more regular, there are a
> > > few significant (for us, at least) problems:
> > > - https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=243126
> > > - https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=240608
> >
> > By the way, we (Panzura) use these changes to fix or work around the above two
> > problems: https://people.freebsd.org/~avg/iflib-vmx.pz.diff
> >
> > Questions / comments are welcome.
> > Especially from people who worked on iflib.
> >
> > > - https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=243392
> > > - the problem described above
> > > - a couple of issues that we already fixed or worked around
> > >
> > > We are contemplating locally reverting to the pre-iflib vmxnet3 and we are
> > > wondering if the conversion was really worth it in general.
> >
> >
> > --
> > Andriy Gapon
> > _______________________________________________
> > freebsd-net@freebsd.org mailing list
> > https://lists.freebsd.org/mailman/listinfo/freebsd-net
> > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
> >
>
> I'd like to follow this up just to make it 100% clear.  The problem is a
> ~4x regression in RX performance.  It affects stock FreeBSD, including
> 12.1-RELEASE.
>
> In my 40Gbps connected lab single thread iperf receive went from 9Gbps to
> 2.5Gbps.
>
> If this can't be fixed or looked at I'd heavily suggest looking at
> reverting "iflib"ing change in stock FreeBSD.
>
>
Consider these data points, which I collected this evening:

Hypervisor: ESXi 6.7.0 Build 8169922
Hardware: Xeon E5-1650 v3 @ 3.50GHz (6 physical cores, HT disabled)

iperf3 client: a VM on the same vswitch as the VM under test, running
Ubuntu 18.04.3 LTS with 2 vCPUs, 4GB RAM, and a VMXNET3 interface used only
for traffic to the VM under test (this VMXNET3 has checksum offload,
TSO/GSO, and LRO/GRO enabled)
iperf3 server: running on the VM under test, either a 12.0-RELEASE VM (this
is before the vmx iflib conversion), or a 12.1-RELEASE VM (this is after
the vmx iflib conversion) with r356703 applied (the recent TSO bug fix).
Both VMs have 3 vCPUs, but the vmx interface uses only 1 tx and 1 rx queue,
because hw.pci.honor_msi_blacklist is at its default of 1, so MSI is used.


Test 1: 12.0-RELEASE, single TCP stream receive, standard mtu, TSO enabled,
LRO disabled
======
vmx0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=60039b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,TSO6,RXCSUM_IPV6,TXCSUM_IPV6>
$ iperf3 -c <12.0 VM IP> -p 1234
Connecting to host <12.0 VM IP>, port 1234
[  4] local <Ubuntu VM IP> port 44664 connected to <12.0 VM IP> port 1234
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-1.00   sec  1.11 GBytes  9.52 Gbits/sec  1144    529 KBytes
[  4]   1.00-2.00   sec  1.09 GBytes  9.40 Gbits/sec  1272    369 KBytes
[  4]   2.00-3.00   sec  1.11 GBytes  9.51 Gbits/sec  1249    344 KBytes
[  4]   3.00-4.00   sec  1.06 GBytes  9.12 Gbits/sec  1973    369 KBytes
[  4]   4.00-5.00   sec  1.11 GBytes  9.50 Gbits/sec  1860    370 KBytes
[  4]   5.00-6.00   sec  1.08 GBytes  9.28 Gbits/sec  1342    396 KBytes
[  4]   6.00-7.00   sec  1.09 GBytes  9.38 Gbits/sec  1278    563 KBytes
[  4]   7.00-8.00   sec  1.05 GBytes  8.99 Gbits/sec  1226    372 KBytes
[  4]   8.00-9.00   sec  1.03 GBytes  8.87 Gbits/sec  1145    400 KBytes
[  4]   9.00-10.00  sec  1.08 GBytes  9.28 Gbits/sec  1317    354 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-10.00  sec  10.8 GBytes  9.28 Gbits/sec  13806             sender
[  4]   0.00-10.00  sec  10.8 GBytes  9.28 Gbits/sec                  receiver


Test 2: 12.0-RELEASE, single TCP stream receive, standard mtu, TSO enabled,
LRO enabled
======
vmx0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=60079b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,TSO6,LRO,RXCSUM_IPV6,TXCSUM_IPV6>
$ iperf3 -c <12.0 VM IP> -p 1234
Connecting to host <12.0 VM IP>, port 1234
[  4] local <Ubuntu VM IP> port 44714 connected to <12.0 VM IP> port 1234
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-1.00   sec  3.48 GBytes  29.9 Gbits/sec    0    887 KBytes
[  4]   1.00-2.00   sec  1.93 GBytes  16.6 Gbits/sec    0    994 KBytes
[  4]   2.00-3.00   sec  2.03 GBytes  17.5 Gbits/sec    0   1.10 MBytes
[  4]   3.00-4.00   sec  1.99 GBytes  17.1 Gbits/sec    0   1.10 MBytes
[  4]   4.00-5.00   sec  2.00 GBytes  17.1 Gbits/sec    0   1.10 MBytes
[  4]   5.00-6.00   sec  1.93 GBytes  16.6 Gbits/sec    0   1.10 MBytes
[  4]   6.00-7.00   sec  2.04 GBytes  17.5 Gbits/sec    0   1.10 MBytes
[  4]   7.00-8.00   sec  2.01 GBytes  17.3 Gbits/sec    0   1.10 MBytes
[  4]   8.00-9.00   sec  1.97 GBytes  16.9 Gbits/sec    0   1.10 MBytes
[  4]   9.00-10.00  sec  1.98 GBytes  17.0 Gbits/sec    0   1.10 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-10.00  sec  21.4 GBytes  18.3 Gbits/sec    0             sender
[  4]   0.00-10.00  sec  21.4 GBytes  18.3 Gbits/sec                  receiver
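
The options words that ifconfig reports with LRO disabled and enabled differ in
a single bit. The following is a minimal sketch of how to check that, not part
of the original measurements. It assumes the IFCAP_LRO value 0x00400 from
FreeBSD's <net/if.h>; the dictionary labels and function names are hypothetical
and exist only for illustration.

```python
# Capability words copied from the ifconfig output in Tests 1, 2, 5 and 7.
OPTIONS = {
    "12.0 LRO off": 0x60039B,
    "12.0 LRO on": 0x60079B,
    "12.1 LRO off": 0xE403BB,
    "12.1 LRO on": 0xE407BB,
}

# Assumption: IFCAP_LRO is 0x00400, as defined in FreeBSD's <net/if.h>.
IFCAP_LRO = 0x00400


def lro_enabled(options: int) -> bool:
    """Return True if the LRO capability bit is set in an options word."""
    return bool(options & IFCAP_LRO)


def changed_bits(before: int, after: int) -> int:
    """Return the bits that differ between two options words."""
    return before ^ after


if __name__ == "__main__":
    for label, word in OPTIONS.items():
        print(f"{label}: options={word:x} lro={lro_enabled(word)}")
    diff = changed_bits(OPTIONS["12.0 LRO off"], OPTIONS["12.0 LRO on"])
    print(f"bits changed on 12.0: {diff:#x}")
```

Comparing the words this way confirms that a test run differs from its
neighbour only in the LRO setting, and not in some other offload that was
toggled by accident.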


Test 3: 12.0-RELEASE, single TCP stream receive, standard mtu, TSO enabled,
LRO disabled (LRO disabled and test run after Test 2 above)
======
vmx0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=60039b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,TSO6,RXCSUM_IPV6,TXCSUM_IPV6>
$ iperf3 -c <12.0 VM IP> -p 1234
Connecting to host <12.0 VM IP>, port 1234
[  4] local <Ubuntu VM IP> port 44718 connected to <12.0 VM IP> port 1234
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-1.00   sec  1.14 GBytes  9.76 Gbits/sec  1871    338 KBytes
[  4]   1.00-2.00   sec   483 MBytes  4.05 Gbits/sec  1307   1.41 KBytes
[  4]   2.00-3.00   sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes
[  4]   3.00-4.00   sec  0.00 Bytes  0.00 bits/sec    0   1.41 KBytes
[  4]   4.00-5.00   sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes
[  4]   5.00-6.00   sec  0.00 Bytes  0.00 bits/sec    0   1.41 KBytes
[  4]   6.00-7.00   sec  0.00 Bytes  0.00 bits/sec    0   1.41 KBytes
[  4]   7.00-8.00   sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes
[  4]   8.00-9.00   sec  0.00 Bytes  0.00 bits/sec    0   1.41 KBytes
[  4]   9.00-10.00  sec  0.00 Bytes  0.00 bits/sec    0   1.41 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-10.00  sec  1.61 GBytes  1.38 Gbits/sec  3181             sender
[  4]   0.00-10.00  sec  1.60 GBytes  1.38 Gbits/sec                  receiver


Test 4: 12.0-RELEASE, single TCP stream transmit, standard mtu, TSO
enabled, LRO enabled
======
vmx0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=60079b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,TSO6,LRO,RXCSUM_IPV6,TXCSUM_IPV6>
$ iperf3 -R -c <12.0 VM IP> -p 1234
Connecting to host <12.0 VM IP>, port 1234
Reverse mode, remote host <12.0 VM IP> is sending
[  4] local <Ubuntu VM IP> port 44726 connected to <12.0 VM IP> port 1234
[ ID] Interval           Transfer     Bandwidth
[  4]   0.00-1.00   sec  4.28 GBytes  36.8 Gbits/sec
[  4]   1.00-2.00   sec  3.31 GBytes  28.4 Gbits/sec
[  4]   2.00-3.00   sec  3.85 GBytes  33.1 Gbits/sec
[  4]   3.00-4.00   sec  4.24 GBytes  36.5 Gbits/sec
[  4]   4.00-5.00   sec  3.16 GBytes  27.1 Gbits/sec
[  4]   5.00-6.00   sec  3.54 GBytes  30.4 Gbits/sec
[  4]   6.00-7.00   sec  4.03 GBytes  34.6 Gbits/sec
[  4]   7.00-8.00   sec  2.93 GBytes  25.1 Gbits/sec
[  4]   8.00-9.00   sec  3.42 GBytes  29.4 Gbits/sec
[  4]   9.00-10.00  sec  3.93 GBytes  33.8 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-10.00  sec  36.7 GBytes  31.5 Gbits/sec  280             sender
[  4]   0.00-10.00  sec  36.7 GBytes  31.5 Gbits/sec                  receiver


Test 5: 12.1-RELEASE with r356703 applied, single stream receive, standard
mtu, TSO enabled, LRO disabled
======
vmx0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=e403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
$ iperf3 -c <12.1 VM IP> -p 1234
Connecting to host <12.1 VM IP>, port 1234
[  4] local <Ubuntu VM IP> port 48392 connected to <12.1 VM IP> port 1234
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-1.00   sec   828 MBytes  6.95 Gbits/sec  1247    335 KBytes
[  4]   1.00-2.00   sec   901 MBytes  7.56 Gbits/sec  1841    345 KBytes
[  4]   2.00-3.00   sec   909 MBytes  7.62 Gbits/sec  1805    356 KBytes
[  4]   3.00-4.00   sec   909 MBytes  7.62 Gbits/sec  2337    322 KBytes
[  4]   4.00-5.00   sec   907 MBytes  7.61 Gbits/sec  1834    354 KBytes
[  4]   5.00-6.00   sec   907 MBytes  7.61 Gbits/sec  1984    352 KBytes
[  4]   6.00-7.00   sec   909 MBytes  7.62 Gbits/sec  2189    329 KBytes
[  4]   7.00-8.00   sec   908 MBytes  7.62 Gbits/sec  2000    338 KBytes
[  4]   8.00-9.00   sec   907 MBytes  7.61 Gbits/sec  2006    315 KBytes
[  4]   9.00-10.00  sec   908 MBytes  7.61 Gbits/sec  1764    332 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-10.00  sec  8.78 GBytes  7.54 Gbits/sec  19007             sender
[  4]   0.00-10.00  sec  8.78 GBytes  7.54 Gbits/sec                  receiver


Test 6: 12.1-RELEASE with r356703 applied, single stream receive, standard
mtu, TSO enabled, LRO disabled, sysctl dev.vmx.0.iflib.tx_abdicate=1
======
vmx0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=e403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
$ iperf3 -c <12.1 VM IP> -p 1234
Connecting to host <12.1 VM IP>, port 1234
[  4] local <Ubuntu VM IP> port 48416 connected to <12.1 VM IP> port 1234
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-1.00   sec  1.29 GBytes  11.1 Gbits/sec  3016    290 KBytes
[  4]   1.00-2.00   sec  1.33 GBytes  11.4 Gbits/sec  4133    322 KBytes
[  4]   2.00-3.00   sec  1.34 GBytes  11.5 Gbits/sec  5409    335 KBytes
[  4]   3.00-4.00   sec  1.35 GBytes  11.6 Gbits/sec  3899    376 KBytes
[  4]   4.00-5.00   sec  1.35 GBytes  11.6 Gbits/sec  4609    300 KBytes
[  4]   5.00-6.00   sec  1.35 GBytes  11.6 Gbits/sec  4603    303 KBytes
[  4]   6.00-7.00   sec  1.36 GBytes  11.7 Gbits/sec  4417    293 KBytes
[  4]   7.00-8.00   sec  1.34 GBytes  11.5 Gbits/sec  5680    290 KBytes
[  4]   8.00-9.00   sec  1.33 GBytes  11.5 Gbits/sec  5461    359 KBytes
[  4]   9.00-10.00  sec  1.03 GBytes  8.86 Gbits/sec  5060    329 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-10.00  sec  13.1 GBytes  11.2 Gbits/sec  46287             sender
[  4]   0.00-10.00  sec  13.1 GBytes  11.2 Gbits/sec                  receiver


Test 7: 12.1-RELEASE with r356703 applied, single stream receive, standard
mtu, TSO enabled, LRO enabled
======
vmx0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=e407bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
$ iperf3 -c <12.1 VM IP> -p 1234
Connecting to host <12.1 VM IP>, port 1234
[  4] local <Ubuntu VM IP> port 48396 connected to <12.1 VM IP> port 1234
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-1.00   sec  98.5 MBytes   826 Mbits/sec  129   2.83 KBytes
[  4]   1.00-2.00   sec  63.6 KBytes   521 Kbits/sec   25   2.83 KBytes
[  4]   2.00-3.00   sec  0.00 Bytes  0.00 bits/sec   25   2.83 KBytes
[  4]   3.00-4.00   sec  0.00 Bytes  0.00 bits/sec   16   2.83 KBytes
[  4]   4.00-5.00   sec  0.00 Bytes  0.00 bits/sec   15   2.83 KBytes
[  4]   5.00-6.00   sec  63.6 KBytes   521 Kbits/sec   15   2.83 KBytes
[  4]   6.00-7.00   sec  0.00 Bytes  0.00 bits/sec   15   2.83 KBytes
[  4]   7.00-8.00   sec  0.00 Bytes  0.00 bits/sec   12   2.83 KBytes
[  4]   8.00-9.00   sec  0.00 Bytes  0.00 bits/sec   15   2.83 KBytes
[  4]   9.00-10.00  sec  0.00 Bytes  0.00 bits/sec   11   1.41 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-10.00  sec  98.7 MBytes  82.8 Mbits/sec  278             sender
[  4]   0.00-10.00  sec  97.8 MBytes  82.0 Mbits/sec                  receiver


Test 8: 12.1-RELEASE with r356703 applied, single stream transmit, standard
mtu, TSO enabled, LRO disabled
======
vmx0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=e403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
$ iperf3 -R -c <12.1 VM IP> -p 1234
Connecting to host <12.1 VM IP>, port 1234
Reverse mode, remote host <12.1 VM IP> is sending
[  4] local <Ubuntu VM IP> port 48400 connected to <12.1 VM IP> port 1234
[ ID] Interval           Transfer     Bandwidth
[  4]   0.00-1.00   sec  4.25 GBytes  36.5 Gbits/sec
[  4]   1.00-2.00   sec  3.29 GBytes  28.3 Gbits/sec
[  4]   2.00-3.00   sec  3.61 GBytes  31.0 Gbits/sec
[  4]   3.00-4.00   sec  3.93 GBytes  33.8 Gbits/sec
[  4]   4.00-5.00   sec  4.17 GBytes  35.8 Gbits/sec
[  4]   5.00-6.00   sec  3.53 GBytes  30.3 Gbits/sec
[  4]   6.00-7.00   sec  3.22 GBytes  27.7 Gbits/sec
[  4]   7.00-8.00   sec  3.90 GBytes  33.5 Gbits/sec
[  4]   8.00-9.00   sec  2.80 GBytes  24.1 Gbits/sec
[  4]   9.00-10.00  sec  2.78 GBytes  23.9 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-10.00  sec  35.5 GBytes  30.5 Gbits/sec  571             sender
[  4]   0.00-10.00  sec  35.5 GBytes  30.5 Gbits/sec                  receiver


Based on the above, it looks like:

(1) The non-LRO single-stream TCP receive performance of the iflib vmx
driver in 12.1 release lags behind the non-LRO single-stream TCP receive
performance of the pre-iflib vmx driver in 12.0 (by about 20%, 7.54 Gbps
[Test 5] vs 9.28 Gbps [Test 1]), unless tx_abdicate is enabled, in which
case the vmx driver performs better (by about 20%, 11.2 Gbps [Test 6] vs
9.28 Gbps [Test 1]).

(2) The TSO-enabled single-stream TCP send performance of the iflib vmx
driver in 12.1 release (with TSO bug patch applied) is at parity with the
pre-iflib vmx driver in 12.0 (30.5 Gbps [Test 8] and 31.5 Gbps [Test 4]).

(3) There are LRO-related bugs in both the pre-iflib vmx driver in 12.0
(see Test 3) and the iflib vmx driver in 12.1 (see Test 7); they just
surface differently.
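
The percentages in (1) and (2) can be reproduced from the sender-side averages
in the iperf3 summaries. This is a minimal sketch of that arithmetic; the
dictionary layout and function names are hypothetical, and only the throughput
figures come from the test output above.

```python
# Sender-side averages in Gbits/sec, keyed by test number.
THROUGHPUT_GBPS = {
    1: 9.28,  # 12.0-RELEASE, receive, LRO disabled
    4: 31.5,  # 12.0-RELEASE, transmit
    5: 7.54,  # 12.1-RELEASE, receive, LRO disabled
    6: 11.2,  # 12.1-RELEASE, receive, LRO disabled, tx_abdicate=1
    8: 30.5,  # 12.1-RELEASE, transmit
}


def percent_change(baseline: float, measured: float) -> float:
    """Return the change of measured relative to baseline, in percent."""
    return (measured - baseline) / baseline * 100.0


def compare(baseline_test: int, measured_test: int) -> float:
    """Compare two tests by number, using the first as the baseline."""
    return percent_change(
        THROUGHPUT_GBPS[baseline_test], THROUGHPUT_GBPS[measured_test]
    )


if __name__ == "__main__":
    for base, new in [(1, 5), (1, 6), (4, 8)]:
        print(f"Test {new} vs Test {base}: {compare(base, new):+.1f}%")
```

Using the 12.0 result as the baseline in every comparison keeps the sign
convention consistent: negative values are regressions relative to the
pre-iflib driver, positive values are improvements.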

The categories of root causes for bugs and performance issues are: bugs in
the vmx driver, bugs in iflib, and behavioral variations across the many
fielded versions of the VMXNET3 virtual device.  Indeed, all of these
categories have been encountered in the past year.  Also, there is a rich
set of driver configuration and operating environment parameters, which
makes advancing the overall robustness of the driver (instead of just
shifting issues into or out of one's own operating parameter space) an
arduous task.

I think the right way to approach this is to continue to fill out the test
matrix and root cause and resolve all of the issues encountered, rather
than argue for reverting to the old driver out of frustration based on a
narrow set of (so far, rather poorly characterized) circumstances.  I'm in
a position to do this, from the standpoint of substantial knowledge of the
vmx driver and virtual device, as well as of iflib internals, and I will be
doing this, as non-work cycles become available.
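
As an illustration of what filling out the test matrix might involve, here is
a minimal sketch that enumerates combinations of the parameters varied in
Tests 1-8 and lists those not yet measured. The choice of axes and all names
are hypothetical, and the sketch assumes tx_abdicate applies only to the iflib
driver because the sysctl lives under dev.vmx.0.iflib. A real matrix would
also need axes such as MTU, queue count, and ESXi version.

```python
from itertools import product

OLD = "12.0-RELEASE"
NEW = "12.1-RELEASE + r356703"

# Hypothetical axes, limited to the parameters varied in Tests 1-8.
AXES = {
    "release": [OLD, NEW],
    "direction": ["receive", "transmit"],
    "lro": [False, True],
    "tx_abdicate": [False, True],
}

# Configurations already measured, as (release, direction, lro, tx_abdicate).
COVERED = {
    (OLD, "receive", False, False),   # Tests 1 and 3
    (OLD, "receive", True, False),    # Test 2
    (OLD, "transmit", True, False),   # Test 4
    (NEW, "receive", False, False),   # Test 5
    (NEW, "receive", False, True),    # Test 6
    (NEW, "receive", True, False),    # Test 7
    (NEW, "transmit", False, False),  # Test 8
}


def applicable(case: tuple) -> bool:
    """Assumption: tx_abdicate is an iflib knob, so skip it on the old driver."""
    release, _direction, _lro, tx_abdicate = case
    return not (tx_abdicate and release == OLD)


def test_matrix() -> list:
    """Return every applicable combination of the axes, in a stable order."""
    return [case for case in product(*AXES.values()) if applicable(case)]


def untested() -> list:
    """Return the applicable combinations that have no measurement yet."""
    return [case for case in test_matrix() if case not in COVERED]


if __name__ == "__main__":
    for case in untested():
        print(dict(zip(AXES, case)))
```

Even with only four axes, several combinations remain unmeasured, which
illustrates how quickly the parameter space grows.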

Best,
Patrick


