Date: Mon, 13 Jun 2022 14:25:36 -0400
From: Mike Jakubik <mike.jakubik@swiftsmsgateway.com>
To: "freebsd-net" <freebsd-net@FreeBSD.org>
Subject: Poor performance with stable/13 and Mellanox ConnectX-6 (mlx5)
Message-ID: <1815e506878.cf301a5a1195924.6506017618978817828@swiftsmsgateway.com>

Hello,

I have two new servers with a Mellanox ConnectX-6 card linked at 25Gb/s; however, I am unable to get much more than 6Gb/s when testing with iperf3.

The servers are Lenovo SR665 (2 x AMD EPYC 7443 24-Core Processor, 256 GB RAM, Mellanox ConnectX-6 Lx 10/25GbE SFP28 2-port OCP Ethernet Adapter).

They are connected to a Dell N3224PX-ON switch. Both servers are idle and not in use, with a fresh install of stable/13-ebea872f8 and nothing running on them except ssh, sendmail, etc.

When I test with iperf3 I am unable to get a higher average than about 6Gb/s. I have tried just about every knob listed in https://calomel.org/freebsd_network_tuning.html with little impact on the performance. The network cards have HW LRO enabled as per the driver documentation (though this only seems to lower IRQ usage, with no impact on actual throughput).

The same exact servers tested on Linux (Fedora 34) produced more than 2x the performance (see attached screenshots): I was able to get a steady 14.6Gb/s rate with nearly 0 retries shown in iperf3. The performance on FreeBSD seems to average around 6Gb/s, but it is very sporadic during the iperf3 run.

I have run out of ideas; any suggestions are welcome. Considering that Netflix uses very similar HW and pushes 400 Gb/s, either there is something really wrong here or Netflix isn't sharing all their secret sauce.
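For reference, here is a rough sketch of the arithmetic behind the figures quoted above. It only uses the numbers from this message (25 Gb/s link, roughly 6 Gb/s on FreeBSD, 14.6 Gb/s on Linux); the variable and function names are made up for illustration and are not part of any tool.

```python
# Figures taken from the message text; names are illustrative only.
LINK_GBPS = 25.0      # negotiated link speed
FREEBSD_GBPS = 6.0    # approximate average reported on FreeBSD
LINUX_GBPS = 14.6     # steady rate reported on Linux


def utilization(rate_gbps, link_gbps=LINK_GBPS):
    """Return the fraction of the link speed that a given rate represents."""
    return rate_gbps / link_gbps


# How many times faster the Linux run was than the FreeBSD run.
ratio = LINUX_GBPS / FREEBSD_GBPS

print(f"FreeBSD uses {utilization(FREEBSD_GBPS):.0%} of the link")
print(f"Linux uses   {utilization(LINUX_GBPS):.0%} of the link")
print(f"Linux / FreeBSD ratio: {ratio:.2f}x")
```

Neither system gets close to line rate with a single iperf3 stream, but the gap between the two is still large.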

# ifconfig mce0
mce0: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=ffed07bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWFILTER,VLAN_HWTSO,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6,TXRTLMT,HWRXTSTMP,NOMAP,TXTLS4,TXTLS6,VXLAN_HWCSUM,VXLAN_HWTSO,TXTLS_RTLMT>
        ether b8:ce:f6:81:df:6a
        inet 192.168.10.31 netmask 0xffffff00 broadcast 192.168.10.255
        media: Ethernet 25GBase-CR <full-duplex,rxpause,txpause>
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>

[root@db-02 ~]# iperf3 -i 1 -t 30 -c db-01
Connecting to host db-01, port 5201
[  5] local 192.168.10.31 port 64695 connected to 192.168.10.30 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   930 MBytes  7.80 Gbits/sec   62    789 KBytes
[  5]   1.00-2.00   sec   942 MBytes  7.90 Gbits/sec  164    824 KBytes
[  5]   2.00-3.00   sec  1.00 GBytes  8.61 Gbits/sec  402    879 KBytes
[  5]   3.00-4.00   sec   761 MBytes  6.39 Gbits/sec   61    588 KBytes
[  5]   4.00-5.00   sec   724 MBytes  6.07 Gbits/sec  220    497 KBytes
[  5]   5.00-6.00   sec   723 MBytes  6.07 Gbits/sec   54    364 KBytes
[  5]   6.00-7.00   sec   716 MBytes  6.01 Gbits/sec  187    682 KBytes
[  5]   7.00-8.00   sec   728 MBytes  6.11 Gbits/sec   86    568 KBytes
[  5]   8.00-9.00   sec   761 MBytes  6.39 Gbits/sec   37    418 KBytes
[  5]   9.00-10.00  sec   733 MBytes  6.15 Gbits/sec    8    617 KBytes
[  5]  10.00-11.00  sec   734 MBytes  6.16 Gbits/sec  238    474 KBytes
[  5]  11.00-12.00  sec   736 MBytes  6.17 Gbits/sec  164    757 KBytes
[  5]  12.00-13.00  sec   610 MBytes  5.12 Gbits/sec  118    579 KBytes
[  5]  13.00-14.00  sec  1.02 GBytes  8.75 Gbits/sec  447    449 KBytes
[  5]  14.00-15.00  sec   728 MBytes  6.11 Gbits/sec  132    719 KBytes
[  5]  15.00-16.00  sec   724 MBytes  6.07 Gbits/sec  185    649 KBytes
[  5]  16.00-17.00  sec   597 MBytes  5.01 Gbits/sec  142    570 KBytes
[  5]  17.00-18.00  sec   733 MBytes  6.15 Gbits/sec  102    484 KBytes
[  5]  18.00-19.00  sec   726 MBytes  6.09 Gbits/sec   15    569 KBytes
[  5]  19.00-20.00  sec   733 MBytes  6.15 Gbits/sec  181    527 KBytes
[  5]  20.00-21.00  sec   729 MBytes  6.12 Gbits/sec  118    430 KBytes
[  5]  21.00-22.00  sec   733 MBytes  6.15 Gbits/sec  116    641 KBytes
[  5]  22.00-23.00  sec   728 MBytes  6.10 Gbits/sec  182    598 KBytes
[  5]  23.00-24.00  sec   743 MBytes  6.24 Gbits/sec  209    614 KBytes
[  5]  24.00-25.00  sec   746 MBytes  6.26 Gbits/sec   72    758 KBytes
[  5]  25.00-26.00  sec   742 MBytes  6.23 Gbits/sec  199    675 KBytes
[  5]  26.00-27.00  sec   799 MBytes  6.70 Gbits/sec  183    542 KBytes
[  5]  27.00-28.00  sec   908 MBytes  7.61 Gbits/sec    7   1.19 MBytes
[  5]  28.00-29.00  sec  1.37 GBytes  11.7 Gbits/sec  606   1013 KBytes
[  5]  29.00-30.00  sec  1.31 GBytes  11.3 Gbits/sec   74   1.02 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-30.00  sec  23.7 GBytes  6.79 Gbits/sec  4771             sender
[  5]   0.00-30.00  sec  23.7 GBytes  6.79 Gbits/sec                  receiver

I have even tried changing to the RACK TCP stack, only to get slightly better results; however, with RACK the number of retries is nearly 0.

[root@db-02 ~]# sysctl net.inet.tcp.functions_default=rack
net.inet.tcp.functions_default: rack -> rack
[root@db-02 ~]# iperf3 -i 1 -t 30 -c db-01
Connecting to host db-01, port 5201
[  5] local 192.168.10.31 port 51894 connected to 192.168.10.30 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   761 MBytes  6.38 Gbits/sec    0    737 KBytes
[  5]   1.00-2.00   sec   859 MBytes  7.21 Gbits/sec    0    761 KBytes
[  5]   2.00-3.00   sec   880 MBytes  7.38 Gbits/sec    0    785 KBytes
[  5]   3.00-4.00   sec   734 MBytes  6.16 Gbits/sec    0    804 KBytes
[  5]   4.00-5.00   sec   777 MBytes  6.52 Gbits/sec    0    824 KBytes
[  5]   5.00-6.00   sec   719 MBytes  6.03 Gbits/sec    0    841 KBytes
[  5]   6.00-7.00   sec   865 MBytes  7.26 Gbits/sec    0    862 KBytes
[  5]   7.00-8.00   sec   880 MBytes  7.38 Gbits/sec    0    882 KBytes
[  5]   8.00-9.00   sec   906 MBytes  7.60 Gbits/sec    0    904 KBytes
[  5]   9.00-10.00  sec   749 MBytes  6.29 Gbits/sec    0    921 KBytes
[  5]  10.00-11.00  sec   798 MBytes  6.69 Gbits/sec    0    938 KBytes
[  5]  11.00-12.00  sec   746 MBytes  6.26 Gbits/sec  209    772 KBytes
[  5]  12.00-13.00  sec   768 MBytes  6.44 Gbits/sec   35    644 KBytes
[  5]  13.00-14.00  sec   948 MBytes  7.95 Gbits/sec    0    673 KBytes
[  5]  14.00-15.00  sec  1.23 GBytes  10.5 Gbits/sec    0    711 KBytes
[  5]  15.00-16.00  sec  1.32 GBytes  11.4 Gbits/sec    0    748 KBytes
[  5]  16.00-17.00  sec  1.31 GBytes  11.2 Gbits/sec    0    785 KBytes
[  5]  17.00-18.00  sec  1.29 GBytes  11.1 Gbits/sec    0    819 KBytes
[  5]  18.00-19.00  sec  1.30 GBytes  11.2 Gbits/sec    0    852 KBytes
[  5]  19.00-20.00  sec  1.34 GBytes  11.5 Gbits/sec    0    883 KBytes
[  5]  20.00-21.00  sec  1.29 GBytes  11.1 Gbits/sec    0    914 KBytes
[  5]  21.00-22.00  sec  1.36 GBytes  11.7 Gbits/sec    0    944 KBytes
[  5]  22.00-23.00  sec  1.33 GBytes  11.4 Gbits/sec    0    974 KBytes
[  5]  23.00-24.00  sec  1.31 GBytes  11.2 Gbits/sec    0   1003 KBytes
[  5]  24.00-25.00  sec  1.30 GBytes  11.2 Gbits/sec    0   1.00 MBytes
[  5]  25.00-26.00  sec  1.34 GBytes  11.5 Gbits/sec    0   1.03 MBytes
[  5]  26.00-27.00  sec  1.32 GBytes  11.3 Gbits/sec    0   1.06 MBytes
[  5]  27.00-28.00  sec   957 MBytes  8.03 Gbits/sec    0   1.07 MBytes
[  5]  28.00-29.00  sec   837 MBytes  7.02 Gbits/sec    0   1.09 MBytes
[  5]  29.00-30.00  sec   729 MBytes  6.11 Gbits/sec    0   1.10 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-30.00  sec  30.6 GBytes  8.77 Gbits/sec  244             sender
[  5]   0.00-30.00  sec  30.6 GBytes  8.77 Gbits/sec                  receiver

More data can be found @ https://forums.freebsd.org/threads/poor-performance-with-stable-13-and-mellanox-connectx-6-mlx5.85460/

Mike Jakubik
https://www.swiftsmsgateway.com/

Disclaimer: This e-mail and any attachments are intended only for the use of the addressee(s) and may contain information that is privileged or confidential. If you are not the intended recipient, or responsible for delivering the information to the intended recipient, you are hereby notified that any dissemination, distribution, printing or copying of this e-mail and any attachments is strictly prohibited. If this e-mail and any attachments were received in error, please notify the sender by reply e-mail and delete the original message.
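
To put a number on how "sporadic" a run is, the per-interval lines of the iperf3 output can be summarized with a small script. The following is a minimal sketch, not part of iperf3 itself: the regular expression, the `summarize` helper and the `SAMPLE` text are assumptions made for illustration, and the pattern only handles intervals reported in Gbits/sec, as in the runs above. The sample holds the first five intervals of the default-stack run plus its sender summary line.

```python
import re
import statistics

# Matches per-interval lines only: they end with a Retr count followed by a
# Cwnd value. The final sender/receiver summary lines have no Cwnd column,
# so they do not match and are skipped.
INTERVAL_RE = re.compile(
    r"\[\s*\d+\]\s+\d+\.\d+-\d+\.\d+\s+sec\s+"
    r"[\d.]+\s+[KMG]Bytes\s+([\d.]+)\s+Gbits/sec\s+(\d+)\s+[\d.]+\s+[KM]Bytes"
)

# First five intervals of the default-stack run, plus the sender summary.
SAMPLE = """\
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   930 MBytes  7.80 Gbits/sec   62    789 KBytes
[  5]   1.00-2.00   sec   942 MBytes  7.90 Gbits/sec  164    824 KBytes
[  5]   2.00-3.00   sec  1.00 GBytes  8.61 Gbits/sec  402    879 KBytes
[  5]   3.00-4.00   sec   761 MBytes  6.39 Gbits/sec   61    588 KBytes
[  5]   4.00-5.00   sec   724 MBytes  6.07 Gbits/sec  220    497 KBytes
[  5]   0.00-30.00  sec  23.7 GBytes  6.79 Gbits/sec  4771             sender
"""


def summarize(text):
    """Collect bitrate and retransmit statistics from iperf3 interval lines."""
    rates = []
    retries = 0
    for line in text.splitlines():
        match = INTERVAL_RE.search(line)
        if match is None:
            continue  # header, separator, or summary line
        rates.append(float(match.group(1)))
        retries += int(match.group(2))
    if not rates:
        raise ValueError("no iperf3 interval lines found")
    return {
        "intervals": len(rates),
        "mean_gbps": statistics.mean(rates),
        "min_gbps": min(rates),
        "max_gbps": max(rates),
        "stdev_gbps": statistics.pstdev(rates),
        "retries": retries,
    }


summary = summarize(SAMPLE)
for key, value in summary.items():
    print(f"{key}: {value}")
```

Feeding the full output of both runs through the same function would let the default and RACK stacks be compared on spread and retransmits, not only on the average.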