Date: Sun, 15 Sep 2024 18:01:07 +0100
From: Doug Rabson <dfr@rabson.org>
To: Sad Clouds <cryintothebluesky@gmail.com>
Cc: Zhenlei Huang <zlei@freebsd.org>, Mark Saad <nonesuch@longcount.org>, FreeBSD Net <freebsd-net@freebsd.org>
Subject: Re: Performance issues with vnet jails + epair + bridge
Message-ID: <CACA0VUjE43FUTaqAtXTur-4akQybKytv-oc1rHxwoUUXM3VQ=Q@mail.gmail.com>
In-Reply-To: <20240914112516.cfb31bae68ab90b83ca7ad4b@gmail.com>
References: <20240913100938.3eac55c9fbd976fa72d58bb5@gmail.com> <39B2C95D-1E4F-4133-8923-AD305DFA9435@longcount.org> <20240913155439.1e171a88bd01ce9b97558a90@gmail.com> <A95066A8-F5FC-451B-85CE-C463952ABADE@FreeBSD.org> <20240914112516.cfb31bae68ab90b83ca7ad4b@gmail.com>
I just did a throughput test with an iperf3 client on a FreeBSD 14.1 host with an Intel 10Gb NIC, connecting to an iperf3 server running in a vnet jail on a TrueNAS host (13.something), also with an Intel 10Gb NIC, and I get full 10Gb throughput in this setup. In the past, I had to disable LRO on the TrueNAS host for this to work properly.

Doug.

On Sat, 14 Sept 2024 at 11:25, Sad Clouds <cryintothebluesky@gmail.com> wrote:
> On Sat, 14 Sep 2024 10:45:03 +0800
> Zhenlei Huang <zlei@FreeBSD.org> wrote:
>
> > The overhead of a vnet jail should be negligible compared to a legacy jail
> > or no jail. Bear in mind that when the VIMAGE option is enabled, there is a
> > default vnet 0. It is not visible via jls and cannot be destroyed. So when
> > you see bottlenecks, as in this case, they are mostly caused by other
> > components such as if_epair, not by the vnet jail itself.
>
> Perhaps this needs a correction - the vnet itself may be OK, but with
> only a single physical NIC on this appliance, I cannot use vnet jails
> without virtualised devices like if_epair(4) and if_bridge(4). I think
> there may be other scalability bottlenecks.
>
> I have a similar setup on Solaris.
>
> Here devel is a Solaris zone with exclusive IP configuration, which I
> think may be similar to a FreeBSD vnet. It has a virtual NIC devel/net0
> which operates over the physical NIC, also called net0, in the global
> zone:
>
> $ dladm
> LINK                CLASS     MTU    STATE    OVER
> net0                phys      1500   up       --
> net1                phys      1500   up       --
> net2                phys      1500   up       --
> net3                phys      1500   up       --
> pkgsrc/net0         vnic      1500   up       net0
> devel/net0          vnic      1500   up       net0
>
> If I run a TCP bulk data benchmark with 64 concurrent threads, 32
> threads for the server process in the global zone and 32 threads for
> the client process in the devel zone, the system spreads the load
> evenly across all CPU cores and none of them sit idle:
>
> $ mpstat -A core 1
>  COR minf mjf xcal  intr ithr  csw icsw migr  smtx srw  syscl usr sys  st idl sze
>    0    0   0 2262  2561    4 4744 2085  209  7271   0 747842 272 528   0   0   8
>    1    0   0 3187  4209    2 9102 3768  514 10605   0 597012 221 579   0   0   8
>    2    0   0 2091  3251    7 6768 2884  307  9557   0 658124 244 556   0   0   8
>    3    0   0 1745  1786   16 3494 1520  176  8847   0 746373 273 527   0   0   8
>    4    0   0 2797  2767    3 5908 2414  371  7849   0 692873 253 547   0   0   8
>    5    0   0 2782  2359    5 4857 2012  324  9431   0 684840 251 549   0   0   8
>    6    0   0 4324  4133    0 9138 3592  538 12525   0 516342 191 609   0   0   8
>    7    0   0 2180  3249    0 6960 2926  321  8825   0 697861 257 543   0   0   8
>
> With FreeBSD I tried "options RSS" and increasing "net.isr.maxthreads",
> however this resulted in some really flaky kernel behavior. So I'm
> thinking that if_epair(4) may be OK for some low-bandwidth use cases,
> i.e. testing firewall rules, etc., but is not suitable for things like
> file/object storage servers.
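
For reference, the "options RSS" and net.isr knobs mentioned above are normally set roughly as follows. This is only a sketch of the usual places they live, and the values are illustrative rather than a recommendation - the point above is precisely that they did not behave well in combination with if_epair:

# custom kernel configuration file, then rebuild and install the kernel
options RSS

# /boot/loader.conf (netisr tunables are read at boot)
# -1 requests one netisr worker thread per CPU; bindthreads pins each worker
net.isr.maxthreads="-1"
net.isr.bindthreads="1"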
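For anyone new to the thread, the single-NIC epair/bridge topology under discussion can be reproduced roughly as follows. The device name ix0, the jail name devel and the 192.0.2.10 address are placeholders for whatever the real setup uses:

# ifconfig bridge0 create
# ifconfig epair0 create
# ifconfig bridge0 addm ix0 addm epair0a up
# ifconfig epair0a up
# jail -c name=devel vnet persist
# ifconfig epair0b vnet devel
# jexec devel ifconfig epair0b inet 192.0.2.10/24 up

Every packet between the jail and the physical network then has to cross both if_epair(4) and if_bridge(4), which is where the suspected bottleneck sits.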
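The throughput test described at the top of this message boils down to something like the following. The device name and address are placeholders rather than the exact ones used:

(on the TrueNAS host, as root - disable LRO on the physical 10Gb NIC)
# ifconfig ix0 -lro

(inside the vnet jail - start the server)
$ iperf3 -s

(on the FreeBSD 14.1 host - run the client for 30 seconds)
$ iperf3 -c 192.0.2.10 -t 30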