Date: Sun, 13 Mar 2022 11:49:49 +0100
From: Michael Gmelin <grembo@freebsd.org>
To: Johan Hendriks <joh.hendriks@gmail.com>
Cc: Kristof Provost <kp@freebsd.org>, freeBSD-net <freebsd-net@freebsd.org>, "Patrick M. Hausen" <hausen@punkt.de>
Subject: Re: epair and vnet jail loose connection.
Message-ID: <144A3D43-F9CE-492D-85E6-D47D1A47400F@freebsd.org>
In-Reply-To: <CAOaKuAXze+CWy5MDmDSLZ-2Nt_Bfvww9MmWfuPTJT4HB7PSjdw@mail.gmail.com>
References: <CAOaKuAXze+CWy5MDmDSLZ-2Nt_Bfvww9MmWfuPTJT4HB7PSjdw@mail.gmail.com>

> On 13. Mar 2022, at 11:27, Johan Hendriks <joh.hendriks@gmail.com> wrote:
>
> On Sun, 13 Mar 2022 at 01:17, Michael Gmelin <grembo@freebsd.org> wrote:
>> I also gave it another go (this time with multiple CPUs assigned to the VM), still works just fine - so I think we would need more details about the setup.
>>
>> Would it make sense to share our test setups, so Johan can try to reproduce with them?
>>
>> -m
>>
>>> On 13. Mar 2022, at 00:48, Kristof Provost <kp@freebsd.org> wrote:
>>>
>>> I'm still failing to reproduce.
>>>
>>> Is pf absolutely required to trigger the issue? Is haproxy (i.e. can you trigger it with iperf)?
>>> Is the bridge strictly required?
>>>
>>> Kristof
>>>
>>> On 12 Mar 2022, at 8:18, Johan Hendriks wrote:
>>> For me, this minimal setup reproduces the network drop-off on the haproxy jail.
>>>
>>> 2 jails, one with haproxy, one with nginx serving the following HTML file.
>>>
>>> <!DOCTYPE html>
>>> <html>
>>> <head>
>>> <title>Page Title</title>
>>> </head>
>>> <body>
>>>
>>> <h1>My First Heading</h1>
>>> <p>My first paragraph.</p>
>>>
>>> </body>
>>> </html>
>>>
>>> From a remote machine I run: hey -h2 -n 10 -c 10 -z 300s https://wp.test.nl
>>> Then a ping from the jail host to the haproxy jail shows the following:
>>>
>>> [ /] > ping 10.233.185.20
>>> PING 10.233.185.20 (10.233.185.20): 56 data bytes
>>> 64 bytes from 10.233.185.20: icmp_seq=0 ttl=64 time=0.054 ms
>>> 64 bytes from 10.233.185.20: icmp_seq=1 ttl=64 time=0.050 ms
>>> 64 bytes from 10.233.185.20: icmp_seq=2 ttl=64 time=0.041 ms
>>> <SNIP>
>>> 64 bytes from 10.233.185.20: icmp_seq=169 ttl=64 time=0.050 ms
>>> 64 bytes from 10.233.185.20: icmp_seq=170 ttl=64 time=0.154 ms
>>> 64 bytes from 10.233.185.20: icmp_seq=171 ttl=64 time=0.054 ms
>>> 64 bytes from 10.233.185.20: icmp_seq=172 ttl=64 time=0.039 ms
>>> 64 bytes from 10.233.185.20: icmp_seq=173 ttl=64 time=0.160 ms
>>> 64 bytes from 10.233.185.20: icmp_seq=174 ttl=64 time=0.045 ms
>>> ^C
>>> --- 10.233.185.20 ping statistics ---
>>> 335 packets transmitted, 175 packets received, 47.8% packet loss
>>> round-trip min/avg/max/stddev = 0.037/0.070/0.251/0.040 ms
>>>
>>>
>>> ifconfig
>>> vtnet0: flags=8963<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
>>>         options=4c00bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWTSO,LINKSTATE,TXCSUM_IPV6>
>>>         ether 56:16:e9:80:5e:41
>>>         inet 87.233.191.146 netmask 0xfffffff0 broadcast 87.233.191.159
>>>         inet 87.233.191.156 netmask 0xffffffff broadcast 87.233.191.156
>>>         inet 87.233.191.155 netmask 0xffffffff broadcast 87.233.191.155
>>>         inet 87.233.191.154 netmask 0xffffffff broadcast 87.233.191.154
>>>         media: Ethernet autoselect (10Gbase-T <full-duplex>)
>>>         status: active
>>>         nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
>>> vtnet1: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>>>         options=4c07bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWTSO,LINKSTATE,TXCSUM_IPV6>
>>>         ether 56:16:2c:64:32:35
>>>         media: Ethernet autoselect (10Gbase-T <full-duplex>)
>>>         status: active
>>>         nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
>>> lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
>>>         options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
>>>         inet6 ::1 prefixlen 128
>>>         inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3
>>>         inet 127.0.0.1 netmask 0xff000000
>>>         groups: lo
>>>         nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
>>> bridge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>>>         ether 58:9c:fc:10:ff:82
>>>         inet 10.233.185.1 netmask 0xffffff00 broadcast 10.233.185.255
>>>         id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
>>>         maxage 20 holdcnt 6 proto rstp maxaddr 2000 timeout 1200
>>>         root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
>>>         member: epair20a flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
>>>                 ifmaxaddr 0 port 7 priority 128 path cost 2000
>>>         member: epair18a flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
>>>                 ifmaxaddr 0 port 15 priority 128 path cost 2000
>>>         groups: bridge
>>>         nd6 options=9<PERFORMNUD,IFDISABLED>
>>> bridge1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>>>         ether 58:9c:fc:10:d9:1a
>>>         id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
>>>         maxage 20 holdcnt 6 proto rstp maxaddr 2000 timeout 1200
>>>         root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
>>>         member: vtnet0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
>>>                 ifmaxaddr 0 port 1 priority 128 path cost 2000
>>>         groups: bridge
>>>         nd6 options=9<PERFORMNUD,IFDISABLED>
>>> pflog0: flags=141<UP,RUNNING,PROMISC> metric 0 mtu 33160
>>>         groups: pflog
>>> epair18a: flags=8963<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
>>>         description: jail_web01
>>>         options=8<VLAN_MTU>
>>>         ether 02:77:ea:19:c7:0a
>>>         groups: epair
>>>         media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
>>>         status: active
>>>         nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
>>> epair20a: flags=8963<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
>>>         description: jail_haproxy
>>>         options=8<VLAN_MTU>
>>>         ether 02:9b:93:8c:59:0a
>>>         groups: epair
>>>         media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
>>>         status: active
>>>         nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
>>>
>>> jail.conf
>>>
>>> # Global settings applied to all jails.
>>> $domain = "test.nl";
>>>
>>> exec.start = "/bin/sh /etc/rc";
>>> exec.stop = "/bin/sh /etc/rc.shutdown";
>>> exec.clean;
>>>
>>> mount.fstab = "/storage/jails/$name.fstab";
>>>
>>> exec.system_user = "root";
>>> exec.jail_user = "root";
>>> mount.devfs;
>>> sysvshm="new";
>>> sysvsem="new";
>>> allow.raw_sockets;
>>> allow.set_hostname = 0;
>>> allow.sysvipc;
>>> enforce_statfs = "2";
>>> devfs_ruleset = "11";
>>>
>>> path = "/storage/jails/${name}";
>>> host.hostname = "${name}.${domain}";
>>>
>>>
>>> # Networking
>>> vnet;
>>> vnet.interface = "vnet0";
>>>
>>> # Commands to run on host before jail is created
>>> exec.prestart = "ifconfig epair${ip} create up description jail_${name}";
>>> exec.prestart += "ifconfig epair${ip}a up";
>>> exec.prestart += "ifconfig bridge0 addm epair${ip}a up";
>>> exec.created = "ifconfig epair${ip}b name vnet0";
>>>
>>> # Commands to run in jail after it is created
>>> exec.start += "/bin/sh /etc/rc";
>>>
>>> # commands to run in jail when jail is stopped
>>> exec.stop = "/bin/sh /etc/rc.shutdown";
>>>
>>> # Commands to run on host when jail is stopped
>>> exec.poststop = "ifconfig bridge0 deletem epair${ip}a";
>>> exec.poststop += "ifconfig epair${ip}a destroy";
>>> persist;
>>>
>>> web01 {
>>>     $ip = 18;
>>> }
>>>
>>> haproxy {
>>>     $ip = 20;
>>>     mount.fstab = "";
>>>     path = "/storage/jails/${name}";
>>> }
>>>
>>> pf.conf
>>>
>>> #######################################################################
>>> ext_if="vtnet0"
>>> table <bruteforcers> persist
>>> table <torlist> persist
>>> table <ssh-trusted> persist file "/usr/local/etc/pf/ssh-trusted"
>>> table <custom-block> persist file "/usr/local/etc/pf/custom-block"
>>> table <jailnetworks> { 10.233.185.0/24, 192.168.10.0/24 }
>>>
>>> icmp_types = "echoreq"
>>> junk_ports="{ 135,137,138,139,445,68,67,3222,17500 }"
>>>
>>> # Log interface
>>> set loginterface $ext_if
>>>
>>> # Set limits
>>> set limit { states 40000, frags 20000, src-nodes 20000 }
>>>
>>> scrub on $ext_if all fragment reassemble no-df random-id
>>>
>>> # ---- Nat jails to the web
>>> binat on $ext_if from 10.233.185.15/32 to !10.233.185.0/24 -> 87.233.191.156/32 # saltmaste
>>> binat on $ext_if from 10.233.185.20/32 to !10.233.185.0/24 -> 87.233.191.155/32 # haproxy
>>> binat on $ext_if from 10.233.185.22/32 to !10.233.185.0/24 -> 87.233.191.154/32 # web-comb
>>>
>>> nat on $ext_if from <jailnetworks> to any -> ($ext_if:0)
>>>
>>> # ---- First rule obligatory "Pass all on loopback"
>>> pass quick on lo0 all
>>> pass quick on bridge0 all
>>> pass quick on bridge1 all
>>>
>>> # ---- Block TOR exit addresses
>>> block quick proto { tcp, udp } from <torlist> to $ext_if
>>>
>>> # ---- Second rule "Block all in and pass all out"
>>> block in log all
>>> pass out all keep state
>>>
>>> # IPv6 pass in/out all IPv6 ICMP traffic
>>> pass in quick proto icmp6 all
>>>
>>> # Pass all lo0
>>> set skip on lo0
>>>
>>> ############### FIREWALL ###############################################
>>> # ---- Block custom ip's and logs
>>> block quick proto { tcp, udp } from <custom-block> to $ext_if
>>>
>>> # ---- Jail ports
>>> pass in quick on { $ext_if } proto tcp from any to 10.233.185.22 port { smtp 80 443 993 995 1956 } keep state
>>> pass in quick on { $ext_if } proto tcp from any to 10.233.185.20 port { smtp 80 443 993 995 1956 } keep state
>>> pass in quick on { $ext_if } proto tcp from any to 10.233.185.15 port { 4505 4506 } keep state
>>>
>>> # ---- Allow ICMP
>>> pass in inet proto icmp all icmp-type $icmp_types keep state
>>> pass out inet proto icmp all icmp-type $icmp_types keep state
>>>
>>> pass in quick on $ext_if inet proto tcp from any to $ext_if port { 80, 443 } flags S/SA keep state
>>> pass in quick on $ext_if inet proto tcp from <ssh-trusted> to $ext_if port { 4505 4506 } flags S/SA keep state
>>> block log quick from <bruteforcers>
>>> pass quick proto tcp from <ssh-trusted> to $ext_if port ssh flags S/SA keep state
>>>
>>> This is as minimal as I can get it.
>>>
>>> Hope this helps.
>>> regards,
>>> Johan Hendriks
>>>
>>>
>>> On Sat, 12 Mar 2022 at 02:10, Kristof Provost <kp@freebsd.org> wrote:
>>>> On 11 Mar 2022, at 18:55, Michael Gmelin wrote:
>>>> >> On 12. Mar 2022, at 01:21, Kristof Provost <kp@freebsd.org> wrote:
>>>> >>
>>>> >> On 11 Mar 2022, at 17:44, Johan Hendriks wrote:
>>>> >>>> On 09/03/2022 20:55, Johan Hendriks wrote:
>>>> >>>> The problem:
>>>> >>>> I have a FreeBSD 14 machine and a FreeBSD 13-stable machine, both running the same jails just to test the workings.
>>>> >>>>
>>>> >>>> The jails that are running are a salt master, a haproxy jail, 2 webservers, 2 varnish servers, and 2 php jails, one for php8.0 and one with 8.1. All the jails are connected to bridge0 and all the jails use vnet.
>>>> >>>>
>>>> >>>> I believe this worked on an older 14-HEAD machine, but I did not do a lot with it back then, and when I started testing again after updating the OS I noticed that one of the varnish jails lost its network connection after running for a few hours. I thought it was just something on HEAD so never really looked at it. But later on, when I started using the jails again and testing a test wordpress site, I noticed that with a simple load test my haproxy jail loses its network connection within one minute. I see nothing in the logs, on the host or in the jail.
>>>> >>>> From the jail I cannot ping the other jails or the IP address of the bridge. I can however ping the jail's own IP address. From the host I also cannot ping the haproxy jail's IP address. If I start a tcpdump on the epair "a" interface of the haproxy jail I do see the packets arrive, but not in the jail.
>>>> >>>>
>>>> >>>> I used ZFS to send all the jails to a 13-STABLE machine and copied over the jail.conf file as well as the pf.conf file and I saw the same behavior.
>>>> >>>>
>>>> >>>> Then I tried 13.0-RELEASE-p7 and on that machine I do not see this happening. There I can stress test the machine for 10 minutes without a problem, but on 14-HEAD and 13-STABLE the jail's network connection fails within a minute. Only a restart of the jail brings it back online, and it then shows the same behavior again as soon as I start a simple load test that it should handle nicely.
>>>> >>>>
>>>> >>>> One of the jail hosts is running under VMWARE and the other is running under Ubuntu with KVM. The 13.0-RELEASE-p7 jail host is running under Ubuntu with KVM.
>>>> >>>>
>>>> >>>> Thank you for your time.
>>>> >>>> regards
>>>> >>>> Johan
>>>> >>>>
>>>> >>> I did some bisecting and the latest commit that works on FreeBSD 13-Stable is 009a56b2e.
>>>> >>> From commit 2e0bee4c7 ("if_epair: implement fanout") onwards, the symptoms described above appear.
>>>> >>>
>>>> >> Interestingly I cannot reproduce stalls in simple epair setups.
>>>> >> It would be useful if you could reduce the setup with the problem into a minimal configuration so we can figure out what other factors are involved.
>>>> >
>>>> > If there are clear instructions on how to reproduce, I'm happy to help experimenting (I'm relying heavily on epair at this point).
>>>> >
>>>> > @Kristof: Did you try on bare metal or on VMs?
>>>> >
>>>> Both.
>>>>
>>>> Kristof
> I also did a new install, this time based on 13.1-PRERELEASE.
> Copied my haproxy and web01 jails to this machine and have the same problem.
>
> Could it be a sysctl I use, or a /boot/loader.conf setting?
>
> This is my /boot/loader.conf
> # -- sysinstall generated deltas -- #
>
> autoboot_delay="2" #optional
>
> cryptodev_load="YES"
>
> vbe_max_resolution=1024x768
>
> # disable hyperthreading
> machdep.hyperthreading_allowed=0
>
> # filemon
> filemon_load="YES"
>
> # use gpt ids instead of gptids or disks idents
> kern.geom.label.disk_ident.enable="0"
> kern.geom.label.gpt.enable="1"
> kern.geom.label.gptid.enable="0"
>
> # ZFS
> zfs_load="YES"
>
> My /etc/sysctl.conf
>
> # $FreeBSD$
> #
> # This file is read when going to multi-user and its contents piped thru
> # ``sysctl'' to adjust kernel values. ``man 5 sysctl.conf'' for details.
> #
> kern.timecounter.hardware=HPET
> # accept queue
> kern.ipc.soacceptqueue=4096
>
> # PF vnet jail
> net.link.bridge.pfil_member=0
> net.link.bridge.pfil_bridge=0
> net.inet.ip.forwarding=1 # (default 0)
> net.inet.tcp.tso=0 # (default 1)
> vfs.zfs.min_auto_ashift=12
>
> If you want I can give you full root access on this machine.
>
> I do use a machine outside of the host machine to run the hey command. Its hosts file points to the alias address that is binat-ed to the haproxy jail.
>
> Thank you all for your time on this!
>
> regards
> Johan Hendriks
>

Hi Johan,

Two questions from one of my previous emails:

1. How is web01 configured? (I created a full jail for it like haproxy, as it was unclear to me.)
2. What is in devfs_ruleset 11?

> devfs_ruleset = "11";

It's not a standard one; I used "4" in my tests.

Root access might help as well, if we continue to not be able to reproduce.

Cheers
Michael
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?144A3D43-F9CE-492D-85E6-D47D1A47400F>