Date:      Wed, 26 Feb 2020 00:07:33 -0500
From:      Patrick Kelsey <pkelsey@freebsd.org>
To:        Josh Paetzel <jpaetzel@freebsd.org>, Andriy Gapon <avg@freebsd.org>
Cc:        freebsd-net <freebsd-net@freebsd.org>
Subject:   Re: terrible if_vmx / vmxnet3 rx performance with lro (post iflib)
Message-ID:  <CAD44qMXMi_NVXo5zwRCqNZatD-k=mGEOAvGB0URqu3YK8uMvZw@mail.gmail.com>
In-Reply-To: <CAD44qMWuV=ZNYoKNR5gy9UR07=gu-Hkvg38tG7pjs9xZFBNsxA@mail.gmail.com>
References:  <40c4a4df-3df6-d95d-53c2-eef905ff45b1@FreeBSD.org> <5e5d423b-0711-7454-626a-cc9cb4b004cd@FreeBSD.org> <ec5d9134-7642-413a-bf20-f15012bc332c@www.fastmail.com> <CAD44qMWuV=ZNYoKNR5gy9UR07=gu-Hkvg38tG7pjs9xZFBNsxA@mail.gmail.com>

On Mon, Feb 24, 2020 at 11:40 PM Patrick Kelsey <pkelsey@freebsd.org> wrote:

>
>
> On Thu, Feb 20, 2020 at 4:58 PM Josh Paetzel <jpaetzel@freebsd.org> wrote:
>
>>
>>
>> On Wed, Feb 19, 2020, at 7:17 AM, Andriy Gapon wrote:
>> > On 18/02/2020 16:09, Andriy Gapon wrote:
>> > > My general experience with post-iflib vmxnet3 is that vmxnet3 has some
>> > > peculiarities that result in a certain "impedance mismatch" with iflib.
>> > > Although we now have a bit less code and it is a bit more regular,
>> > > there are a few significant (for us, at least) problems:
>> > > - https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=243126
>> > > - https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=240608
>> >
>> > By the way, we (Panzura) use these changes to fix or work around the
>> > above two problems: https://people.freebsd.org/~avg/iflib-vmx.pz.diff
>> >
>> > Questions / comments are welcome.
>> > Especially from people who worked on iflib.
>> >
>> > > - https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=243392
>> > > - the problem described above
>> > > - a couple of issues that we already fixed or worked around
>> > >
>> > > We are contemplating locally reverting to the pre-iflib vmxnet3 and
>> > > we are wondering if the conversion was really worth it in general.
>> >
>> >
>> > --
>> > Andriy Gapon
>> > _______________________________________________
>> > freebsd-net@freebsd.org mailing list
>> > https://lists.freebsd.org/mailman/listinfo/freebsd-net
>> > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>> >
>>
>> I'd like to follow this up just to make it 100% clear.  The problem is a
>> ~4x regression in RX performance.  It affects stock FreeBSD, including
>> 12.1-RELEASE.
>>
>> In my 40Gbps connected lab single thread iperf receive went from 9Gbps to
>> 2.5Gbps.
>>
>> If this can't be fixed or looked at I'd heavily suggest looking at
>> reverting "iflib"ing change in stock FreeBSD.
>>
>>
> Consider these datapoints I collected this evening:
>
> Hypervisor: ESXi 6.7.0 Build 8169922
> Hardware: Xeon E5-1650 v3 @ 3.50GHz (6 physical cores, HT disabled)
>
> iperf3 client: a VM on the same vswitch as the VM under test, running
> Ubuntu 18.04.3 LTS with 2 vCPUs, 4GB RAM, and a VMXNET3 interface used only
> for traffic to the VM under test (this VMXNET3 has checksum offload,
> TSO/GSO, and LRO/GRO enabled)
> iperf3 server: running on the VM under test, either a 12.0-RELEASE VM
> (this is before the vmx iflib conversion), or a 12.1-RELEASE VM (this is
> after the vmx iflib conversion) with r356703 applied (the recent TSO bug
> fix).  Both VMs have 3 vCPUs, but the vmx interface only uses 1 tx and 1 rx
> queue, as hw.pci.honor_msi_blacklist is at its default of 0, so MSI is used.
>
>
> Test 1: 12.0-RELEASE, single TCP stream receive, standard mtu, TSO
> enabled, LRO disabled
> ======
> vmx0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>         options=60039b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,TSO6,RXCSUM_IPV6,TXCSUM_IPV6>
> $ iperf3 -c <12.0 VM IP> -p 1234
> Connecting to host <12.0 VM IP>, port 1234
> [  4] local <Ubuntu VM IP> port 44664 connected to <12.0 VM IP> port 1234
> [ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
> [  4]   0.00-1.00   sec  1.11 GBytes  9.52 Gbits/sec  1144    529 KBytes
> [  4]   1.00-2.00   sec  1.09 GBytes  9.40 Gbits/sec  1272    369 KBytes
> [  4]   2.00-3.00   sec  1.11 GBytes  9.51 Gbits/sec  1249    344 KBytes
> [  4]   3.00-4.00   sec  1.06 GBytes  9.12 Gbits/sec  1973    369 KBytes
> [  4]   4.00-5.00   sec  1.11 GBytes  9.50 Gbits/sec  1860    370 KBytes
> [  4]   5.00-6.00   sec  1.08 GBytes  9.28 Gbits/sec  1342    396 KBytes
> [  4]   6.00-7.00   sec  1.09 GBytes  9.38 Gbits/sec  1278    563 KBytes
> [  4]   7.00-8.00   sec  1.05 GBytes  8.99 Gbits/sec  1226    372 KBytes
> [  4]   8.00-9.00   sec  1.03 GBytes  8.87 Gbits/sec  1145    400 KBytes
> [  4]   9.00-10.00  sec  1.08 GBytes  9.28 Gbits/sec  1317    354 KBytes
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval           Transfer     Bandwidth       Retr
> [  4]   0.00-10.00  sec  10.8 GBytes  9.28 Gbits/sec  13806             sender
> [  4]   0.00-10.00  sec  10.8 GBytes  9.28 Gbits/sec                  receiver
>
>
> Test 2: 12.0-RELEASE, single TCP stream receive, standard mtu, TSO
> enabled, LRO enabled
> ======
> vmx0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>         options=60079b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,TSO6,LRO,RXCSUM_IPV6,TXCSUM_IPV6>
> $ iperf3 -c <12.0 VM IP> -p 1234
> Connecting to host <12.0 VM IP>, port 1234
> [  4] local <Ubuntu VM IP> port 44714 connected to <12.0 VM IP> port 1234
> [ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
> [  4]   0.00-1.00   sec  3.48 GBytes  29.9 Gbits/sec    0    887 KBytes
> [  4]   1.00-2.00   sec  1.93 GBytes  16.6 Gbits/sec    0    994 KBytes
> [  4]   2.00-3.00   sec  2.03 GBytes  17.5 Gbits/sec    0   1.10 MBytes
> [  4]   3.00-4.00   sec  1.99 GBytes  17.1 Gbits/sec    0   1.10 MBytes
> [  4]   4.00-5.00   sec  2.00 GBytes  17.1 Gbits/sec    0   1.10 MBytes
> [  4]   5.00-6.00   sec  1.93 GBytes  16.6 Gbits/sec    0   1.10 MBytes
> [  4]   6.00-7.00   sec  2.04 GBytes  17.5 Gbits/sec    0   1.10 MBytes
> [  4]   7.00-8.00   sec  2.01 GBytes  17.3 Gbits/sec    0   1.10 MBytes
> [  4]   8.00-9.00   sec  1.97 GBytes  16.9 Gbits/sec    0   1.10 MBytes
> [  4]   9.00-10.00  sec  1.98 GBytes  17.0 Gbits/sec    0   1.10 MBytes
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval           Transfer     Bandwidth       Retr
> [  4]   0.00-10.00  sec  21.4 GBytes  18.3 Gbits/sec    0             sender
> [  4]   0.00-10.00  sec  21.4 GBytes  18.3 Gbits/sec                  receiver
>
>
> Test 3: 12.0-RELEASE, single TCP stream receive, standard mtu, TSO
> enabled, LRO disabled (LRO disabled and test run after Test 2 above)
> ======
> vmx0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>         options=60039b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,TSO6,RXCSUM_IPV6,TXCSUM_IPV6>
> $ iperf3 -c <12.0 VM IP> -p 1234
> Connecting to host <12.0 VM IP>, port 1234
> [  4] local <Ubuntu VM IP> port 44718 connected to <12.0 VM IP> port 1234
> [ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
> [  4]   0.00-1.00   sec  1.14 GBytes  9.76 Gbits/sec  1871    338 KBytes
> [  4]   1.00-2.00   sec   483 MBytes  4.05 Gbits/sec  1307   1.41 KBytes
> [  4]   2.00-3.00   sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes
> [  4]   3.00-4.00   sec  0.00 Bytes  0.00 bits/sec    0   1.41 KBytes
> [  4]   4.00-5.00   sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes
> [  4]   5.00-6.00   sec  0.00 Bytes  0.00 bits/sec    0   1.41 KBytes
> [  4]   6.00-7.00   sec  0.00 Bytes  0.00 bits/sec    0   1.41 KBytes
> [  4]   7.00-8.00   sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes
> [  4]   8.00-9.00   sec  0.00 Bytes  0.00 bits/sec    0   1.41 KBytes
> [  4]   9.00-10.00  sec  0.00 Bytes  0.00 bits/sec    0   1.41 KBytes
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval           Transfer     Bandwidth       Retr
> [  4]   0.00-10.00  sec  1.61 GBytes  1.38 Gbits/sec  3181             sender
> [  4]   0.00-10.00  sec  1.60 GBytes  1.38 Gbits/sec                  receiver
>
>
> Test 4: 12.0-RELEASE, single TCP stream transmit, standard mtu, TSO
> enabled, LRO enabled
> ======
> vmx0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>         options=60079b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,TSO6,LRO,RXCSUM_IPV6,TXCSUM_IPV6>
> $ iperf3 -R -c <12.0 VM IP> -p 1234
> Connecting to host <12.0 VM IP>, port 1234
> Reverse mode, remote host <12.0 VM IP> is sending
> [  4] local <Ubuntu VM IP> port 44726 connected to <12.0 VM IP> port 1234
> [ ID] Interval           Transfer     Bandwidth
> [  4]   0.00-1.00   sec  4.28 GBytes  36.8 Gbits/sec
> [  4]   1.00-2.00   sec  3.31 GBytes  28.4 Gbits/sec
> [  4]   2.00-3.00   sec  3.85 GBytes  33.1 Gbits/sec
> [  4]   3.00-4.00   sec  4.24 GBytes  36.5 Gbits/sec
> [  4]   4.00-5.00   sec  3.16 GBytes  27.1 Gbits/sec
> [  4]   5.00-6.00   sec  3.54 GBytes  30.4 Gbits/sec
> [  4]   6.00-7.00   sec  4.03 GBytes  34.6 Gbits/sec
> [  4]   7.00-8.00   sec  2.93 GBytes  25.1 Gbits/sec
> [  4]   8.00-9.00   sec  3.42 GBytes  29.4 Gbits/sec
> [  4]   9.00-10.00  sec  3.93 GBytes  33.8 Gbits/sec
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval           Transfer     Bandwidth       Retr
> [  4]   0.00-10.00  sec  36.7 GBytes  31.5 Gbits/sec  280             sender
> [  4]   0.00-10.00  sec  36.7 GBytes  31.5 Gbits/sec                  receiver
>
>
> Test 5: 12.1-RELEASE with r356703 applied, single stream receive, standard
> mtu, TSO enabled, LRO disabled
> ======
> vmx0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>         options=e403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
> $ iperf3 -c <12.1 VM IP> -p 1234
> Connecting to host <12.1 VM IP>, port 1234
> [  4] local <Ubuntu VM IP> port 48392 connected to <12.1 VM IP> port 1234
> [ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
> [  4]   0.00-1.00   sec   828 MBytes  6.95 Gbits/sec  1247    335 KBytes
> [  4]   1.00-2.00   sec   901 MBytes  7.56 Gbits/sec  1841    345 KBytes
> [  4]   2.00-3.00   sec   909 MBytes  7.62 Gbits/sec  1805    356 KBytes
> [  4]   3.00-4.00   sec   909 MBytes  7.62 Gbits/sec  2337    322 KBytes
> [  4]   4.00-5.00   sec   907 MBytes  7.61 Gbits/sec  1834    354 KBytes
> [  4]   5.00-6.00   sec   907 MBytes  7.61 Gbits/sec  1984    352 KBytes
> [  4]   6.00-7.00   sec   909 MBytes  7.62 Gbits/sec  2189    329 KBytes
> [  4]   7.00-8.00   sec   908 MBytes  7.62 Gbits/sec  2000    338 KBytes
> [  4]   8.00-9.00   sec   907 MBytes  7.61 Gbits/sec  2006    315 KBytes
> [  4]   9.00-10.00  sec   908 MBytes  7.61 Gbits/sec  1764    332 KBytes
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval           Transfer     Bandwidth       Retr
> [  4]   0.00-10.00  sec  8.78 GBytes  7.54 Gbits/sec  19007             sender
> [  4]   0.00-10.00  sec  8.78 GBytes  7.54 Gbits/sec                  receiver
>
>
> Test 6: 12.1-RELEASE with r356703 applied, single stream receive, standard
> mtu, TSO enabled, LRO disabled, sysctl dev.vmx.0.iflib.tx_abdicate=1
> ======
> vmx0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>         options=e403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
> $ iperf3 -c <12.1 VM IP> -p 1234
> Connecting to host <12.1 VM IP>, port 1234
> [  4] local <Ubuntu VM IP> port 48416 connected to <12.1 VM IP> port 1234
> [ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
> [  4]   0.00-1.00   sec  1.29 GBytes  11.1 Gbits/sec  3016    290 KBytes
> [  4]   1.00-2.00   sec  1.33 GBytes  11.4 Gbits/sec  4133    322 KBytes
> [  4]   2.00-3.00   sec  1.34 GBytes  11.5 Gbits/sec  5409    335 KBytes
> [  4]   3.00-4.00   sec  1.35 GBytes  11.6 Gbits/sec  3899    376 KBytes
> [  4]   4.00-5.00   sec  1.35 GBytes  11.6 Gbits/sec  4609    300 KBytes
> [  4]   5.00-6.00   sec  1.35 GBytes  11.6 Gbits/sec  4603    303 KBytes
> [  4]   6.00-7.00   sec  1.36 GBytes  11.7 Gbits/sec  4417    293 KBytes
> [  4]   7.00-8.00   sec  1.34 GBytes  11.5 Gbits/sec  5680    290 KBytes
> [  4]   8.00-9.00   sec  1.33 GBytes  11.5 Gbits/sec  5461    359 KBytes
> [  4]   9.00-10.00  sec  1.03 GBytes  8.86 Gbits/sec  5060    329 KBytes
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval           Transfer     Bandwidth       Retr
> [  4]   0.00-10.00  sec  13.1 GBytes  11.2 Gbits/sec  46287             sender
> [  4]   0.00-10.00  sec  13.1 GBytes  11.2 Gbits/sec                  receiver
>
>
> Test 7: 12.1-RELEASE with r356703 applied, single stream receive, standard
> mtu, TSO enabled, LRO enabled
> ======
> vmx0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>         options=e407bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
> $ iperf3 -c <12.1 VM IP> -p 1234
> Connecting to host <12.1 VM IP>, port 1234
> [  4] local <Ubuntu VM IP> port 48396 connected to <12.1 VM IP> port 1234
> [ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
> [  4]   0.00-1.00   sec  98.5 MBytes   826 Mbits/sec  129   2.83 KBytes
> [  4]   1.00-2.00   sec  63.6 KBytes   521 Kbits/sec   25   2.83 KBytes
> [  4]   2.00-3.00   sec  0.00 Bytes  0.00 bits/sec   25   2.83 KBytes
> [  4]   3.00-4.00   sec  0.00 Bytes  0.00 bits/sec   16   2.83 KBytes
> [  4]   4.00-5.00   sec  0.00 Bytes  0.00 bits/sec   15   2.83 KBytes
> [  4]   5.00-6.00   sec  63.6 KBytes   521 Kbits/sec   15   2.83 KBytes
> [  4]   6.00-7.00   sec  0.00 Bytes  0.00 bits/sec   15   2.83 KBytes
> [  4]   7.00-8.00   sec  0.00 Bytes  0.00 bits/sec   12   2.83 KBytes
> [  4]   8.00-9.00   sec  0.00 Bytes  0.00 bits/sec   15   2.83 KBytes
> [  4]   9.00-10.00  sec  0.00 Bytes  0.00 bits/sec   11   1.41 KBytes
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval           Transfer     Bandwidth       Retr
> [  4]   0.00-10.00  sec  98.7 MBytes  82.8 Mbits/sec  278             sender
> [  4]   0.00-10.00  sec  97.8 MBytes  82.0 Mbits/sec                  receiver
>
>
> Test 8: 12.1-RELEASE with r356703 applied, single stream transmit,
> standard mtu, TSO enabled, LRO disabled
> ======
> vmx0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>         options=e403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
> $ iperf3 -R -c <12.1 VM IP> -p 1234
> Connecting to host <12.1 VM IP>, port 1234
> Reverse mode, remote host <12.1 VM IP> is sending
> [  4] local <Ubuntu VM IP> port 48400 connected to <12.1 VM IP> port 1234
> [ ID] Interval           Transfer     Bandwidth
> [  4]   0.00-1.00   sec  4.25 GBytes  36.5 Gbits/sec
> [  4]   1.00-2.00   sec  3.29 GBytes  28.3 Gbits/sec
> [  4]   2.00-3.00   sec  3.61 GBytes  31.0 Gbits/sec
> [  4]   3.00-4.00   sec  3.93 GBytes  33.8 Gbits/sec
> [  4]   4.00-5.00   sec  4.17 GBytes  35.8 Gbits/sec
> [  4]   5.00-6.00   sec  3.53 GBytes  30.3 Gbits/sec
> [  4]   6.00-7.00   sec  3.22 GBytes  27.7 Gbits/sec
> [  4]   7.00-8.00   sec  3.90 GBytes  33.5 Gbits/sec
> [  4]   8.00-9.00   sec  2.80 GBytes  24.1 Gbits/sec
> [  4]   9.00-10.00  sec  2.78 GBytes  23.9 Gbits/sec
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval           Transfer     Bandwidth       Retr
> [  4]   0.00-10.00  sec  35.5 GBytes  30.5 Gbits/sec  571             sender
> [  4]   0.00-10.00  sec  35.5 GBytes  30.5 Gbits/sec                  receiver
>
>
> Based on the above, it looks like:
>
> (1) The non-LRO single-stream TCP receive performance of the iflib vmx
> driver in 12.1 release lags behind the non-LRO single-stream TCP receive
> performance of the pre-iflib vmx driver in 12.0 (by about 20%, 7.54 Gbps
> [Test 5] vs 9.28 Gbps [Test 1]), unless tx_abdicate is enabled, in which
> case the vmx driver performs better (by about 20%, 11.2 Gbps [Test 6] vs
> 9.28 Gbps [Test 1]).
>
> (2) The TSO-enabled single-stream TCP send performance of the iflib vmx
> driver in 12.1 release (with TSO bug patch applied) is at parity with the
> pre-iflib vmx driver in 12.0 (30.5 Gbps [Test 8] and 31.5 Gbps [Test 4]).
>
> (3) There are LRO-related bugs in both the pre-iflib vmx driver in 12.0
> (see Test 3) and the iflib vmx driver in 12.1 (see Test 7), they just
> surface differently.
>
> The categories of root causes for bugs and performance issues are: bugs in
> the vmx driver, bugs in iflib, and behavioral variations across the many
> fielded versions of the VMXNET3 virtual device.  Indeed, all of these
> categories have been encountered in the past year.  Also, there is a rich
> set of driver configuration and operating environment parameters, which
> makes advancing the overall robustness of the driver (instead of just
> shifting issues into or out of one's own operating parameter space) an
> arduous task.
>
> I think the right way to approach this is to continue to fill out the test
> matrix and root cause and resolve all of the issues encountered, rather
> than argue for reverting to the old driver out of frustration based on a
> narrow set of (so far, rather poorly characterized) circumstances.  I'm in
> a position to do this, from the standpoint of substantial knowledge of the
> vmx driver and virtual device, as well as of iflib internals, and I will be
> doing this, as non-work cycles become available.
>
>
I spent a bit of time poking at this, and I believe I have root-caused all
of the reported issues and developed patches (to both iflib and the vmx
driver) that solve them.  My test system running 12.1 with these patches
applied (as well as the TSO patch) operates correctly with and without TSO
and/or LRO enabled, and with large MTU values.  Its throughput is at parity
with, or better than, that of the pre-iflib driver in the single-core /
single-stream tests that I am currently using to assess correctness.

The primary issue (which resulted in the reported free-list-related
assertion failures, use-after-free panics, trouble with jumbo frames, and
trouble with LRO) was that neither the vmx driver nor iflib correctly
handled the case where the vmx virtual device skips descriptors, so both
needed to be fixed.  It's not known why the virtual device sometimes skips
descriptors, but this seems to occur frequently, at least under ESXi, when
packets span multiple descriptors.
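
To make "skipped descriptors" concrete, here is a toy sketch of the ring
arithmetic involved.  It is not code from the vmx driver or iflib; the
function name and the idea of comparing an expected index against the index
reported in a completion are assumptions made purely for illustration.

```python
def skipped_descriptors(expected_idx, completed_idx, ring_size):
    """Return the ring indices the device passed over, in ring order.

    expected_idx:  index the driver expected the next completion to use
    completed_idx: index the completion actually reports (hypothetical field)
    ring_size:     number of descriptors in the receive ring
    """
    # Distance from expected to completed, wrapping around the ring.
    gap = (completed_idx - expected_idx) % ring_size
    return [(expected_idx + i) % ring_size for i in range(gap)]
```

On an 8-entry ring, a driver expecting index 6 that sees a completion for
index 1 has had descriptors 6, 7, and 0 skipped.  Those buffers are still
owned by the driver and have to be accounted for rather than assumed
consumed.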

A secondary issue (secondary in that it impacts performance but not
correctness) was also fixed: the vmx driver was only ever using
cluster-sized receive buffers regardless of the MTU, instead of switching
to page-sized buffers when the MTU is sufficiently large.
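
A rough sketch of why the buffer size matters, assuming 2048-byte clusters,
4096-byte pages, and 18 bytes of Ethernet plus VLAN header overhead.  The
threshold used here (switch to page-sized buffers once a frame no longer
fits in one cluster) is an assumption for illustration, not necessarily the
rule the patch applies.

```python
MCLBYTES = 2048       # standard mbuf cluster size
MJUMPAGESIZE = 4096   # page-sized cluster, assuming 4 KiB pages
ETHER_OVERHEAD = 18   # assumed: 14-byte Ethernet header + 4-byte VLAN tag

def rx_buf_size(mtu):
    """Pick a receive buffer size for the given MTU (illustrative rule)."""
    frame = mtu + ETHER_OVERHEAD
    return MCLBYTES if frame <= MCLBYTES else MJUMPAGESIZE

def bufs_per_frame(mtu, buf_size):
    """Number of receive buffers a maximum-sized frame occupies."""
    frame = mtu + ETHER_OVERHEAD
    return -(-frame // buf_size)  # ceiling division
```

Under these assumptions, a 9000-byte MTU frame occupies 5 cluster-sized
buffers but only 3 page-sized ones, so fewer descriptors are consumed per
packet.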

There remains an open question as to whether the vmx virtual device
consumes a buffer descriptor or not when the completion descriptor
indicates zero length.  So far I haven't been able to cause zero-length
completions to occur.

There also remains a design flaw in iflib concerning the refill of receive
descriptor rings.  It can be worked around, to a point, with a sysctl, but
at some point it needs to be fixed properly.  iflib limits the number of
received packets it will process during a receive interrupt according to a
budget value, and it then limits the number of receive descriptors it will
refill according to that same budget value (with a magic constant added to
it).  Packets can span multiple descriptors, so a refill limit that is
essentially the number of packets processed does not account for the
descriptors actually consumed.  The result is terrible performance
degradation when multi-segment packets are in heavy use (e.g., with LRO or
large MTUs).
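
The effect can be seen in a toy model.  This is not iflib's code: the
function, its parameters, and the slack value standing in for the magic
constant are all hypothetical, and the model simply assumes the device
always has traffic waiting.

```python
def simulate_refill(ring_size, budget, refill_slack, segs_per_packet, rounds):
    """Return the number of posted receive descriptors after each pass.

    Each pass processes up to `budget` packets, each consuming
    `segs_per_packet` descriptors, then refills at most
    `budget + refill_slack` descriptors.
    """
    posted = ring_size
    history = []
    for _ in range(rounds):
        # Packets are limited by the budget and by the buffers still posted.
        packets = min(budget, posted // segs_per_packet)
        posted -= packets * segs_per_packet
        # The refill is capped by the packet budget, not by what was consumed.
        empty = ring_size - posted
        posted += min(empty, budget + refill_slack)
        history.append(posted)
    return history
```

With a 512-entry ring, a budget of 16, and a slack of 8, single-descriptor
packets leave the ring full after every pass.  With 4 descriptors per
packet, each pass consumes 64 descriptors but refills only 24, so the ring
drains by 40 per pass and settles at 24 posted descriptors, enough for only
6 packets per pass instead of 16.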

It will take a bit more time to write up all the associated details, post
the patches for review, and update the bugs.  I think avg@ will recognize
in those details the completion of a number of thoughts that he had while
trying to debug this.

I also think the TSO patch, as well as the correctness fixes noted above,
should at some point wind up in an errata release for 12.1.

-Patrick