Date:      Sun, 10 Jul 2022 10:30:09 +0200
From:      Vincenzo Maffione <vmaffione@freebsd.org>
To:        Nils Beyer <nbe@vkf-renzel.de>
Cc:        FreeBSD virtualization <freebsd-virtualization@freebsd.org>
Subject:   Re: bhyve: slow network throughput between guest VM and host (and vice versa)?
Message-ID:  <CA+_eA9gYKeABjaBE8PvMstvx23-1O_MQFprvFUYLcLervHvW8A@mail.gmail.com>
In-Reply-To: <c68c6a34-61f7-48a8-19b8-43a253d3826d@vkf-renzel.de>
References:  <c68c6a34-61f7-48a8-19b8-43a253d3826d@vkf-renzel.de>


Hi,
  The trick to achieving larger TCP throughput between the VM and the host
(or another co-located VM) is to have TCP Segmentation Offload (TSO) and TCP
Checksum Offload enabled along the whole packet data path (virtio-net,
bhyve, if_tap, if_bridge, ...).
When these offloads are in place, your TCP endpoint can pass around
~64 KB packets (irrespective of the virtio interface MTU), never performing
TCP segmentation or checksum computation in software. In practice, the
effect is the same as setting MTU=64K along the packet data path.
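As a sketch of what that means in practice (the capability flag names below are the standard ifconfig(8) ones; whether if_tap and if_bridge actually honor them depends on driver support in your FreeBSD version):

```shell
# Inspect which offload capabilities an interface advertises:
# look for TXCSUM, RXCSUM, TSO4 in the "options=" line.
ifconfig vtnet0

# Enable checksum and TCP segmentation offload on the guest NIC ...
ifconfig vtnet0 txcsum rxcsum tso4

# ... and, where supported, on the host-side interfaces as well.
ifconfig tap0 txcsum tso4
```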

This is how Linux achieves ~40 Gbps TCP throughput between a
virtio-net VM and the host if_tap (or if_bridge) interface.
FreeBSD is still missing these optimizations for bhyve, if_tap and
if_bridge. Partly, this is because the virtio-net offload specification
was designed to match Linux kernel packet metadata (struct sk_buff).
FreeBSD's packet metadata (struct mbuf) is different, and the conversion
is not cheap, as it requires some packet header inspection.

Cheers,
  Vincenzo

On Thu, Jun 30, 2022 at 12:46 Nils Beyer <nbe@vkf-renzel.de>
wrote:

> Hi,
>
> I've set up a FreeBSD VM and am using a VirtIO network interface within it.
> Then I assigned the IP address 192.168.0.2/30 to the guest's VirtIO NIC
> using:
>
>         ifconfig vtnet0 192.168.0.2/30 up
>
> and on the TAP interface of the host the IP address 192.168.0.1/30 using:
>
>         ifconfig tap0 192.168.0.1/30 up
>
> Trying an iperf3 transfer (iperf3 server on the host, iperf3 client on the
> guest) with a TCP window size of 128k, I only get around 2.45 Gbit/s:
>
> > # env LD_LIBRARY_PATH=. ./iperf3 -c 192.168.0.1 -w 128k
> > Connecting to host 192.168.0.1, port 5201
> > [  5] local 192.168.0.2 port 25651 connected to 192.168.0.1 port 5201
> > [ ID] Interval           Transfer     Bitrate         Retr  Cwnd
> > [  5]   0.00-1.00   sec   314 MBytes  2.63 Gbits/sec    0   1.43 MBytes
> > [  5]   1.00-2.00   sec   301 MBytes  2.52 Gbits/sec    0   1.43 MBytes
> > [  5]   2.00-3.00   sec   264 MBytes  2.21 Gbits/sec    0   1.43 MBytes
> > [  5]   3.00-4.00   sec   284 MBytes  2.38 Gbits/sec    0   1.43 MBytes
> > [  5]   4.00-5.00   sec   296 MBytes  2.48 Gbits/sec    0   1.43 MBytes
> > [  5]   5.00-6.00   sec   279 MBytes  2.34 Gbits/sec    0   1.43 MBytes
> > [  5]   6.00-7.00   sec   280 MBytes  2.35 Gbits/sec    0   1.43 MBytes
> > [  5]   7.00-8.00   sec   310 MBytes  2.60 Gbits/sec    0   1.43 MBytes
> > [  5]   8.00-9.00   sec   302 MBytes  2.53 Gbits/sec    0   1.43 MBytes
> > [  5]   9.00-10.00  sec   333 MBytes  2.79 Gbits/sec    0   1.43 MBytes
> > - - - - - - - - - - - - - - - - - - - - - - - - -
> > [ ID] Interval           Transfer     Bitrate         Retr
> > [  5]   0.00-10.00  sec  2.89 GBytes  2.49 Gbits/sec    0             sender
> > [  5]   0.00-10.00  sec  2.89 GBytes  2.49 Gbits/sec                  receiver
>
>
> Switching the roles (iperf3 server on the guest, iperf3 client on the
> host) with a TCP window size of 128k, I get 4.04 Gbit/s:
>
> > # iperf3 -c 192.168.0.2 -w 128k
> > Connecting to host 192.168.0.2, port 5201
> > [  5] local 192.168.0.1 port 56892 connected to 192.168.0.2 port 5201
> > [ ID] Interval           Transfer     Bitrate         Retr  Cwnd
> > [  5]   0.00-1.00   sec   411 MBytes  3.44 Gbits/sec   40    973 KBytes
> > [  5]   1.00-2.00   sec   483 MBytes  4.05 Gbits/sec    5   1.03 MBytes
> > [  5]   2.00-3.00   sec   507 MBytes  4.26 Gbits/sec    1   1.21 MBytes
> > [  5]   3.00-4.00   sec   514 MBytes  4.31 Gbits/sec   15    561 KBytes
> > [  5]   4.00-5.00   sec   498 MBytes  4.18 Gbits/sec   10    966 KBytes
> > [  5]   5.00-6.00   sec   491 MBytes  4.12 Gbits/sec   19    841 KBytes
> > [  5]   6.00-7.00   sec   513 MBytes  4.31 Gbits/sec    0   1.43 MBytes
> > [  5]   7.00-8.00   sec   504 MBytes  4.23 Gbits/sec    0   1.43 MBytes
> > [  5]   8.00-9.00   sec   459 MBytes  3.85 Gbits/sec    0   1.43 MBytes
> > [  5]   9.00-10.00  sec   435 MBytes  3.65 Gbits/sec    0   1.43 MBytes
> > - - - - - - - - - - - - - - - - - - - - - - - - -
> > [ ID] Interval           Transfer     Bitrate         Retr
> > [  5]   0.00-10.00  sec  4.70 GBytes  4.04 Gbits/sec   90            sender
> > [  5]   0.00-10.00  sec  4.70 GBytes  4.04 Gbits/sec                 receiver
>
>
> Increasing the MTU on the VirtIO interface and on the TAP interface to 9000
> helps a little: I get 8.38 Gbit/s guest->host and 10.3 Gbit/s host->guest.
>
> Increasing the TCP window size to 1024k only produces more retries and does
> nothing for the throughput.
>
> Is it expected that I can't get more throughput within the bhyve network
> stack (guest <-> host)? I was expecting way more than 10 Gbit/s...
>
>
>
> TIA and KR,
> Nils
>
>
