Date: Sun, 10 Jul 2022 10:30:09 +0200
From: Vincenzo Maffione <vmaffione@freebsd.org>
To: Nils Beyer <nbe@vkf-renzel.de>
Cc: FreeBSD virtualization <freebsd-virtualization@freebsd.org>
Subject: Re: bhyve: slow network throughput between guest VM and host (and vice versa)?
Message-ID: <CA+_eA9gYKeABjaBE8PvMstvx23-1O_MQFprvFUYLcLervHvW8A@mail.gmail.com>
In-Reply-To: <c68c6a34-61f7-48a8-19b8-43a253d3826d@vkf-renzel.de>
References: <c68c6a34-61f7-48a8-19b8-43a253d3826d@vkf-renzel.de>

Hi,

The trick to achieving higher TCP throughput between the VM and the host
(or another co-located VM) is to have TCP Segmentation Offload (TSO) and
TCP Checksum Offload enabled along the whole packet data path
(virtio-net, bhyve, if_tap, if_bridge, ...).
When these offloads are in place, your TCP endpoint can hand packets of up
to 64K down the path (irrespective of the virtio interface MTU), without
ever performing TCP segmentation or checksum computation in software. In
practice, the effect is the same as setting MTU=64K along the whole packet
datapath.

This is how Linux is able to achieve ~40 Gbps TCP throughput between a
virtio-net VM and the host if_tap (or if_bridge) interface.
FreeBSD is still missing such optimizations for bhyve, if_tap and
if_bridge. Partly, this is because the offload-related part of the
virtio-net specification was designed to match the Linux kernel packet
metadata (struct sk_buff). FreeBSD metadata is different, and conversion is
not cheap, since it requires some packet header inspection.

Cheers,
  Vincenzo

On Thu, 30 Jun 2022 at 12:46, Nils Beyer <nbe@vkf-renzel.de> wrote:

> Hi,
>
> I've set up a FreeBSD VM and am using a VirtIO network interface within.
> Then I've assigned the IP address 192.168.0.2/30 to that VirtIO NIC of
> the guest using:
>
>         ifconfig vtnet0 192.168.0.2/30 up
>
> and on the TAP interface of the host the IP address 192.168.0.1/30 using
>
>         ifconfig tap0 192.168.0.1/30 up
>
> Trying an iperf3 transfer (iperf3 server on the host, iperf3 client on
> the guest) with a TCP window size of 128k, I only get around 2.45Gbit/s:
>
> > # env LD_LIBRARY_PATH=. ./iperf3 -c 192.168.0.1 -w 128k
> > Connecting to host 192.168.0.1, port 5201
> > [  5] local 192.168.0.2 port 25651 connected to 192.168.0.1 port 5201
> > [ ID] Interval           Transfer     Bitrate         Retr  Cwnd
> > [  5]   0.00-1.00   sec   314 MBytes  2.63 Gbits/sec    0   1.43 MBytes
> > [  5]   1.00-2.00   sec   301 MBytes  2.52 Gbits/sec    0   1.43 MBytes
> > [  5]   2.00-3.00   sec   264 MBytes  2.21 Gbits/sec    0   1.43 MBytes
> > [  5]   3.00-4.00   sec   284 MBytes  2.38 Gbits/sec    0   1.43 MBytes
> > [  5]   4.00-5.00   sec   296 MBytes  2.48 Gbits/sec    0   1.43 MBytes
> > [  5]   5.00-6.00   sec   279 MBytes  2.34 Gbits/sec    0   1.43 MBytes
> > [  5]   6.00-7.00   sec   280 MBytes  2.35 Gbits/sec    0   1.43 MBytes
> > [  5]   7.00-8.00   sec   310 MBytes  2.60 Gbits/sec    0   1.43 MBytes
> > [  5]   8.00-9.00   sec   302 MBytes  2.53 Gbits/sec    0   1.43 MBytes
> > [  5]   9.00-10.00  sec   333 MBytes  2.79 Gbits/sec    0   1.43 MBytes
> > - - - - - - - - - - - - - - - - - - - - - - - - -
> > [ ID] Interval           Transfer     Bitrate         Retr
> > [  5]   0.00-10.00  sec  2.89 GBytes  2.49 Gbits/sec    0             sender
> > [  5]   0.00-10.00  sec  2.89 GBytes  2.49 Gbits/sec                  receiver
>
> Switching the roles (iperf3 server on the guest, iperf3 client on the
> host) with a TCP window size of 128k, I get 4.04Gbit/s:
>
> > # iperf3 -c 192.168.0.2 -w 128k
> > Connecting to host 192.168.0.2, port 5201
> > [  5] local 192.168.0.1 port 56892 connected to 192.168.0.2 port 5201
> > [ ID] Interval           Transfer     Bitrate         Retr  Cwnd
> > [  5]   0.00-1.00   sec   411 MBytes  3.44 Gbits/sec   40    973 KBytes
> > [  5]   1.00-2.00   sec   483 MBytes  4.05 Gbits/sec    5   1.03 MBytes
> > [  5]   2.00-3.00   sec   507 MBytes  4.26 Gbits/sec    1   1.21 MBytes
> > [  5]   3.00-4.00   sec   514 MBytes  4.31 Gbits/sec   15    561 KBytes
> > [  5]   4.00-5.00   sec   498 MBytes  4.18 Gbits/sec   10    966 KBytes
> > [  5]   5.00-6.00   sec   491 MBytes  4.12 Gbits/sec   19    841 KBytes
> > [  5]   6.00-7.00   sec   513 MBytes  4.31 Gbits/sec    0   1.43 MBytes
> > [  5]   7.00-8.00   sec   504 MBytes  4.23 Gbits/sec    0   1.43 MBytes
> > [  5]   8.00-9.00   sec   459 MBytes  3.85 Gbits/sec    0   1.43 MBytes
> > [  5]   9.00-10.00  sec   435 MBytes  3.65 Gbits/sec    0   1.43 MBytes
> > - - - - - - - - - - - - - - - - - - - - - - - - -
> > [ ID] Interval           Transfer     Bitrate         Retr
> > [  5]   0.00-10.00  sec  4.70 GBytes  4.04 Gbits/sec   90             sender
> > [  5]   0.00-10.00  sec  4.70 GBytes  4.04 Gbits/sec                  receiver
>
> Increasing the MTU on the VirtIO interface and on the TAP interface to
> 9000 helps a little bit: I get 8.38Gbit/s guest->host and 10.3Gbit/s
> host->guest.
>
> Increasing the TCP window size to 1024k only produces more retries and
> does nothing for the throughput.
>
> Is it expected that I'm not able to get more throughput within the bhyve
> network stack (guest <-> host)? I was expecting way more than 10Gbit/s...
>
> TIA and KR,
> Nils
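
As a rough illustration of the reply's point about 64K packets, the sketch below counts how many wire-sized TCP segments a single 64K send turns into at different MTUs, and how many packets per second the measured iperf3 rates imply. The helper names, the 40-byte IPv4+TCP header overhead (no TCP options), and the use of 65536 bytes for "64K" are assumptions made for this example; they are not taken from the thread.

```python
# Back-of-the-envelope sketch: why bigger packets along the datapath help.
# Assumptions (not from the thread): 40 bytes of IPv4+TCP headers per packet,
# no TCP options, and "64K" taken as 65536 bytes.

HEADER_BYTES = 40          # assumed IPv4 (20) + TCP (20) header size
SEND_BYTES = 64 * 1024     # one large send handed down by the TCP endpoint


def segments_needed(payload_bytes: int, bytes_per_segment: int) -> int:
    """Segments required to carry payload_bytes (integer ceiling division)."""
    return -(-payload_bytes // bytes_per_segment)


def packets_per_second(bitrate_gbps: float, bytes_per_packet: int) -> float:
    """Packets per second needed to sustain bitrate_gbps of payload."""
    return bitrate_gbps * 1e9 / 8 / bytes_per_packet


if __name__ == "__main__":
    # Without TSO, the sender's stack must cut each send into MTU-sized pieces.
    for mtu in (1500, 9000):
        mss = mtu - HEADER_BYTES
        count = segments_needed(SEND_BYTES, mss)
        print(f"MTU {mtu}: {count} segments per 64K send")

    # With TSO along the whole path, the same send travels as one unit.
    print(f"TSO: {segments_needed(SEND_BYTES, SEND_BYTES)} unit per 64K send")

    # Packet rates implied by the guest->host results quoted in the thread.
    for label, gbps, mtu in (("MTU 1500", 2.49, 1500), ("MTU 9000", 8.38, 9000)):
        pps = packets_per_second(gbps, mtu - HEADER_BYTES)
        print(f"{label}: {gbps} Gbit/s is roughly {pps:,.0f} packets/s")
```

Under these assumptions, the per-packet work (segmentation, checksumming, and one trip through virtio-net, bhyve and if_tap) is repeated dozens of times per 64K send at MTU 1500 and only once when the offloads are in place, which is consistent with the gain seen when raising the MTU to 9000.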
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CA%2B_eA9gYKeABjaBE8PvMstvx23-1O_MQFprvFUYLcLervHvW8A>