Date:      Tue, 17 Sep 2024 11:25:07 +0800
From:      Zhenlei Huang <zlei@FreeBSD.org>
To:        Aleksandr Fedorov <wigneddoom@yandex.ru>
Cc:        Sad Clouds <cryintothebluesky@gmail.com>, Mark Saad <nonesuch@longcount.org>, FreeBSD Net <freebsd-net@freebsd.org>
Subject:   Re: Performance issues with vnet jails + epair + bridge
Message-ID:  <7E65BDDA-7105-4C07-9243-4FAF2B4D5515@FreeBSD.org>
In-Reply-To: <214411726497902@mail.yandex.ru>
References:  <20240913100938.3eac55c9fbd976fa72d58bb5@gmail.com> <39B2C95D-1E4F-4133-8923-AD305DFA9435@longcount.org> <20240913155439.1e171a88bd01ce9b97558a90@gmail.com> <A95066A8-F5FC-451B-85CE-C463952ABADE@FreeBSD.org> <214411726497902@mail.yandex.ru>



> On Sep 16, 2024, at 10:47 PM, Aleksandr Fedorov <wigneddoom@yandex.ru> wrote:
>
> If we are talking about local traffic between jails and/or the host,
> then in terms of TCP throughput we have room to improve, for example:

Without the RSS option enabled, if_epair will only use one thread to
move packets between the pair of interfaces. I reviewed the code and I
think it can be improved even without RSS.
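
To illustrate, a condensed sketch of the idea (not the verbatim
if_epair.c; the struct and field names below are invented for the
illustration): each pair owns an array of queues, each drained by its
own taskqueue task, and without the RSS option the queue count is
effectively one, so every packet crossing the pair funnels through a
single thread.

#include <sys/param.h>
#include <sys/mbuf.h>
#include <sys/buf_ring.h>
#include <sys/taskqueue.h>

/* Hypothetical, condensed per-queue state for the illustration. */
struct epair_queue_sk {
	struct buf_ring		*sk_ring;	/* packets waiting to cross */
	struct taskqueue	*sk_tq;		/* taskqueue draining the ring */
	struct task		 sk_task;	/* drain task */
};

struct epair_softc_sk {
	int			 sk_num_queues;	/* 1 unless RSS is enabled */
	struct epair_queue_sk	*sk_queues;
};

/* Pick a queue for an outbound mbuf and kick its drain task. */
static void
epair_enqueue_sk(struct epair_softc_sk *sc, struct mbuf *m)
{
	uint32_t qidx;

#ifdef RSS
	/* With RSS, flows spread across queues by the mbuf's flow id. */
	qidx = m->m_pkthdr.flowid % sc->sk_num_queues;
#else
	/* Without RSS there is one queue, hence one forwarding thread. */
	qidx = 0;
#endif
	(void)buf_ring_enqueue(sc->sk_queues[qidx].sk_ring, m);
	taskqueue_enqueue(sc->sk_queues[qidx].sk_tq,
	    &sc->sk_queues[qidx].sk_task);
}

One possible improvement, even without RSS, would be to size that
queue array from the CPU count and hash on m_pkthdr.flowid anyway, so
independent flows can cross the pair in parallel.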

>
> 1. Stop calculating checksums for packets between VNET jails and the host.

I have a local WIP for this, inspired by the introduction of
IFCAP_VLAN_MTU. It should give a bigger improvement, especially on
low-frequency CPUs.

>
> 2. Use large packets (TSO) up to 64k in size.
>
> Just as an example, a simple patch increases the throughput of
> if_epair(4) between the two ends from 10 Gbps to 30 Gbps.

That is impressive!

>
> diff --git a/sys/net/if_epair.c b/sys/net/if_epair.c
> index aeed993249f5..79c2dfcfc445 100644
> --- a/sys/net/if_epair.c
> +++ b/sys/net/if_epair.c
> @@ -164,6 +164,10 @@ epair_tx_start_deferred(void *arg, int pending)
>         while (m != NULL) {
>                 n = STAILQ_NEXT(m, m_stailqpkt);
>                 m->m_nextpkt = NULL;
> +
> +               m->m_pkthdr.csum_flags = CSUM_IP_CHECKED | CSUM_IP_VALID | CSUM_DATA_VALID | CSUM_PSEUDO_HDR;
> +               m->m_pkthdr.csum_data = 0xFFFF;
> +
>                 if_input(ifp, m);
>                 m = n;
>         }
> @@ -538,8 +542,9 @@ epair_setup_ifp(struct epair_softc *sc, char *name, int unit)
>         ifp->if_dunit = unit;
>         ifp->if_flags = IFF_BROADCAST | IFF_SIMPLEX | IFF_MULTICAST;
>         ifp->if_flags |= IFF_KNOWSEPOCH;
> -       ifp->if_capabilities = IFCAP_VLAN_MTU;
> -       ifp->if_capenable = IFCAP_VLAN_MTU;
> +       ifp->if_capabilities = IFCAP_VLAN_MTU | IFCAP_HWCSUM | IFCAP_HWCSUM_IPV6 | IFCAP_TSO;
> +       ifp->if_capenable = ifp->if_capabilities;
> +       ifp->if_hwassist = (CSUM_IP | CSUM_TCP | CSUM_UDP | CSUM_IP_TSO);

I've not tried TSO on if_epair yet. TSO needs special treatment in the
stack, so I suspect the above is not sufficient on its own; a rough
sketch of what else might be needed follows the quoted hunk below.

>         ifp->if_transmit = epair_transmit;
>         ifp->if_qflush = epair_qflush;
>         ifp->if_start = epair_start;
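
For what it's worth, my rough guess at the extra plumbing
epair_setup_ifp() would need (untested assumption on my side; the
if_hw_tsomax* fields are real struct ifnet members, but the values
below are placeholders in the range typical drivers advertise):

	/*
	 * Sketch only: tell the stack how large a TSO frame this
	 * pseudo device accepts; the stack consults these limits
	 * when it builds oversized TCP segments.
	 */
	ifp->if_hw_tsomax = IP_MAXPACKET -
	    (ETHER_HDR_LEN + ETHER_VLAN_ENCAP_LEN);
	ifp->if_hw_tsomaxsegcount = 35;		/* max S/G entries */
	ifp->if_hw_tsomaxsegsize = 2048;	/* max bytes per entry */

Since the frame never touches real hardware it never has to be split;
the catch is that the peer's if_input() path must then accept frames
well above the MTU, with csum_flags marked valid as in your first
hunk, so the receiving stack can consume the 64k segment whole. That
is the part I expect needs the special treatment.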
>
> 14.09.2024, 05:45, "Zhenlei Huang" <zlei@freebsd.org>:
>
>  On Sep 13, 2024, at 10:54 PM, Sad Clouds <cryintothebluesky@gmail.com> wrote:
>
>  On Fri, 13 Sep 2024 08:08:02 -0400
>  Mark Saad <nonesuch@longcount.org> wrote:
>
>  Sad
>    Can you go back a bit? You mentioned there is an RPi in the mix.
>  Some of the Raspberry Pis have their NIC USB-attached under the
>  covers, which will kill the total speed of things.
>
>  Can you cobble together a diagram of what you have on either end?
>  Hello, I'm not sending data across the network, only between the host
>  and the jails. I'm trying to evaluate how FreeBSD handles TCP data
>  locally within a single host.
>
> When you take vnet into account, **local** traffic should stay within
> a single vnet jail. If you want traffic across vnet jails, if_epair or
> netgraph hooks must be employed, and that of course introduces some
> overhead.
>
>
>  I understand that vnet jails will have more overhead, compared to a
>  shared TCP/IP stack via localhost. So I'm trying to measure it and see
>  where the bottlenecks are.
>
> The overhead of a vnet jail should be negligible compared to a legacy
> jail or no jail at all. Bear in mind that when the VIMAGE option is
> enabled, there is a default vnet 0. It is not visible via jls and
> cannot be destroyed. So when you see bottlenecks, as in this case, they
> are mostly caused by other components such as if_epair, not by the vnet
> jail itself.
>
>
>  The Raspberry Pi 4 host has a single vnet jail, exchanging data with
>  the host via epair(4) and if_bridge(4) interfaces. I don't really know
>  what topology FreeBSD is using to represent all this so can't draw any
>  diagrams, but I think all data flows through the kernel internally and
>  never leaves the physical network interface.
>
> For vnet jails, when you try to describe the network topology, you can
> treat them as VMs / physical boxes.
>
> I have one box with dozens of vnet jails. Each of them has a single
> responsibility, e.g. DHCP, LDAP, pf firewall, OOB access. The topology
> looks quite clear and it is easy to maintain. The only overhead is the
> extra hops between the vnet jail instances. For my use case performance
> is not critical and it has worked great for years.
>
> Best regards,
> Zhenlei





