Date: Sun, 13 Mar 2022 01:17:07 +0100
From: Michael Gmelin <grembo@freebsd.org>
To: Kristof Provost <kp@freebsd.org>
Cc: Johan Hendriks <joh.hendriks@gmail.com>, freebsd-net@freebsd.org, "Patrick M. Hausen" <hausen@punkt.de>
Subject: Re: epair and vnet jail loose connection.
Message-ID: <95793CDF-6E72-4FAB-8BF5-F2E67D3F69CD@freebsd.org>
In-Reply-To: <94B8885D-F63F-40C3-9E7E-158CC252FF9A@FreeBSD.org>
References: <94B8885D-F63F-40C3-9E7E-158CC252FF9A@FreeBSD.org>
[-- Attachment #1 --]
I also gave it another go (this time with multiple CPUs assigned to the VM), still works just fine - so I think we would need more details about the setup.

Would it make sense to share our test setups, so Johan can try to reproduce with them?

-m

> On 13. Mar 2022, at 00:48, Kristof Provost <kp@freebsd.org> wrote:
>
> I’m still failing to reproduce.
>
> Is pf absolutely required to trigger the issue? Is haproxy (i.e. can you trigger it with iperf)?
> Is the bridge strictly required?
>
> Kristof
>
> On 12 Mar 2022, at 8:18, Johan Hendriks wrote:
> For me, this minimal setup shows the haproxy jail dropping off the network.
>
> Two jails, one with haproxy and one with nginx, which serves the following HTML file.
>
> <!DOCTYPE html>
> <html>
> <head>
> <title>Page Title</title>
> </head>
> <body>
>
> <h1>My First Heading</h1>
> <p>My first paragraph.</p>
>
> </body>
> </html>
>
> From a remote machine I run: hey -h2 -n 10 -c 10 -z 300s https://wp.test.nl
> Then a ping from the jail host to the haproxy jail shows the following:
>
> [ /] > ping 10.233.185.20
> PING 10.233.185.20 (10.233.185.20): 56 data bytes
> 64 bytes from 10.233.185.20: icmp_seq=0 ttl=64 time=0.054 ms
> 64 bytes from 10.233.185.20: icmp_seq=1 ttl=64 time=0.050 ms
> 64 bytes from 10.233.185.20: icmp_seq=2 ttl=64 time=0.041 ms
> <SNIP>
> 64 bytes from 10.233.185.20: icmp_seq=169 ttl=64 time=0.050 ms
> 64 bytes from 10.233.185.20: icmp_seq=170 ttl=64 time=0.154 ms
> 64 bytes from 10.233.185.20: icmp_seq=171 ttl=64 time=0.054 ms
> 64 bytes from 10.233.185.20: icmp_seq=172 ttl=64 time=0.039 ms
> 64 bytes from 10.233.185.20: icmp_seq=173 ttl=64 time=0.160 ms
> 64 bytes from 10.233.185.20: icmp_seq=174 ttl=64 time=0.045 ms
> ^C
> --- 10.233.185.20 ping statistics ---
> 335 packets transmitted, 175 packets received, 47.8% packet loss
> round-trip min/avg/max/stddev = 0.037/0.070/0.251/0.040 ms
>
>
> ifconfig
> vtnet0: flags=8963<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
>         options=4c00bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWTSO,LINKSTATE,TXCSUM_IPV6>
>         ether 56:16:e9:80:5e:41
>         inet 87.233.191.146 netmask 0xfffffff0 broadcast 87.233.191.159
>         inet 87.233.191.156 netmask 0xffffffff broadcast 87.233.191.156
>         inet 87.233.191.155 netmask 0xffffffff broadcast 87.233.191.155
>         inet 87.233.191.154 netmask 0xffffffff broadcast 87.233.191.154
>         media: Ethernet autoselect (10Gbase-T <full-duplex>)
>         status: active
>         nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
> vtnet1: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>         options=4c07bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWTSO,LINKSTATE,TXCSUM_IPV6>
>         ether 56:16:2c:64:32:35
>         media: Ethernet autoselect (10Gbase-T <full-duplex>)
>         status: active
>         nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
> lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
>         options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
>         inet6 ::1 prefixlen 128
>         inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3
>         inet 127.0.0.1 netmask 0xff000000
>         groups: lo
>         nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
> bridge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>         ether 58:9c:fc:10:ff:82
>         inet 10.233.185.1 netmask 0xffffff00 broadcast 10.233.185.255
>         id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
>         maxage 20 holdcnt 6 proto rstp maxaddr 2000 timeout 1200
>         root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
>         member: epair20a flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
>                 ifmaxaddr 0 port 7 priority 128 path cost 2000
>         member: epair18a flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
>                 ifmaxaddr 0 port 15 priority 128 path cost 2000
>         groups: bridge
>         nd6 options=9<PERFORMNUD,IFDISABLED>
> bridge1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>         ether 58:9c:fc:10:d9:1a
>         id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
>         maxage 20 holdcnt 6 proto rstp maxaddr 2000 timeout 1200
>         root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
>         member: vtnet0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
>                 ifmaxaddr 0 port 1 priority 128 path cost 2000
>         groups: bridge
>         nd6 options=9<PERFORMNUD,IFDISABLED>
> pflog0: flags=141<UP,RUNNING,PROMISC> metric 0 mtu 33160
>         groups: pflog
> epair18a: flags=8963<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
>         description: jail_web01
>         options=8<VLAN_MTU>
>         ether 02:77:ea:19:c7:0a
>         groups: epair
>         media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
>         status: active
>         nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
> epair20a: flags=8963<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
>         description: jail_haproxy
>         options=8<VLAN_MTU>
>         ether 02:9b:93:8c:59:0a
>         groups: epair
>         media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
>         status: active
>         nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
>
> jail.conf
>
> # Global settings applied to all jails.
> $domain = "test.nl";
>
> exec.start = "/bin/sh /etc/rc";
> exec.stop = "/bin/sh /etc/rc.shutdown";
> exec.clean;
>
> mount.fstab = "/storage/jails/$name.fstab";
>
> exec.system_user = "root";
> exec.jail_user = "root";
> mount.devfs;
> sysvshm="new";
> sysvsem="new";
> allow.raw_sockets;
> allow.set_hostname = 0;
> allow.sysvipc;
> enforce_statfs = "2";
> devfs_ruleset = "11";
>
> path = "/storage/jails/${name}";
> host.hostname = "${name}.${domain}";
>
>
> # Networking
> vnet;
> vnet.interface = "vnet0";
>
> # Commands to run on host before jail is created
> exec.prestart = "ifconfig epair${ip} create up description jail_${name}";
> exec.prestart += "ifconfig epair${ip}a up";
> exec.prestart += "ifconfig bridge0 addm epair${ip}a up";
> exec.created = "ifconfig epair${ip}b name vnet0";
>
> # Commands to run in jail after it is created
> exec.start += "/bin/sh /etc/rc";
>
> # commands to run in jail when jail is stopped
> exec.stop = "/bin/sh /etc/rc.shutdown";
>
> # Commands to run on host when jail is stopped
> exec.poststop = "ifconfig bridge0 deletem epair${ip}a";
> exec.poststop += "ifconfig epair${ip}a destroy";
> persist;
>
> web01 {
>     $ip = 18;
> }
>
> haproxy {
>     $ip = 20;
>     mount.fstab = "";
>     path = "/storage/jails/${name}";
> }
>
> pf.conf
>
> #######################################################################
> ext_if="vtnet0"
> table <bruteforcers> persist
> table <torlist> persist
> table <ssh-trusted> persist file "/usr/local/etc/pf/ssh-trusted"
> table <custom-block> persist file "/usr/local/etc/pf/custom-block"
> table <jailnetworks> { 10.233.185.0/24, 192.168.10.0/24 }
>
> icmp_types = "echoreq"
> junk_ports="{ 135,137,138,139,445,68,67,3222,17500 }"
>
> # Log interface
> set loginterface $ext_if
>
> # Set limits
> set limit { states 40000, frags 20000, src-nodes 20000 }
>
> scrub on $ext_if all fragment reassemble no-df random-id
>
> # ---- Nat jails to the web
> binat on $ext_if from 10.233.185.15/32 to !10.233.185.0/24 -> 87.233.191.156/32 # saltmaste
> binat on $ext_if from 10.233.185.20/32 to !10.233.185.0/24 -> 87.233.191.155/32 # haproxy
> binat on $ext_if from 10.233.185.22/32 to !10.233.185.0/24 -> 87.233.191.154/32 # web-comb
>
> nat on $ext_if from <jailnetworks> to any -> ($ext_if:0)
>
> # ---- First rule obligatory "Pass all on loopback"
> pass quick on lo0 all
> pass quick on bridge0 all
> pass quick on bridge1 all
>
> # ---- Block TOR exit addresses
> block quick proto { tcp, udp } from <torlist> to $ext_if
>
> # ---- Second rule "Block all in and pass all out"
> block in log all
> pass out all keep state
>
> # IPv6 pass in/out all IPv6 ICMP traffic
> pass in quick proto icmp6 all
>
> # Pass all lo0
> set skip on lo0
>
> ############### FIREWALL ###############################################
> # ---- Block custom ip's and logs
> block quick proto { tcp, udp } from <custom-block> to $ext_if
>
> # ---- Jail ports
> pass in quick on { $ext_if } proto tcp from any to 10.233.185.22 port { smtp 80 443 993 995 1956 } keep state
> pass in quick on { $ext_if } proto tcp from any to 10.233.185.20 port { smtp 80 443 993 995 1956 } keep state
> pass in quick on { $ext_if } proto tcp from any to 10.233.185.15 port { 4505 4506 } keep state
>
> # ---- Allow ICMP
> pass in inet proto icmp all icmp-type $icmp_types keep state
> pass out inet proto icmp all icmp-type $icmp_types keep state
>
> pass in quick on $ext_if inet proto tcp from any to $ext_if port { 80, 443 } flags S/SA keep state
> pass in quick on $ext_if inet proto tcp from <ssh-trusted> to $ext_if port { 4505 4506 } flags S/SA keep state
> block log quick from <bruteforcers>
> pass quick proto tcp from <ssh-trusted> to $ext_if port ssh flags S/SA keep state
>
> This is as minimal as I can get it.
>
> Hope this helps.
> regards,
> Johan Hendriks
>
>
> On Sat 12 Mar 2022 at 02:10, Kristof Provost <kp@freebsd.org> wrote:
>> On 11 Mar 2022, at 18:55, Michael Gmelin wrote:
>> >> On 12. Mar 2022, at 01:21, Kristof Provost <kp@freebsd.org> wrote:
>> >>
>> >> On 11 Mar 2022, at 17:44, Johan Hendriks wrote:
>> >>>> On 09/03/2022 20:55, Johan Hendriks wrote:
>> >>>> The problem:
>> >>>> I have a FreeBSD 14 machine and a FreeBSD 13-STABLE machine, both running the same jails just to test the workings.
>> >>>>
>> >>>> The jails that are running are a salt master, a haproxy jail, 2 webservers, 2 varnish servers, and 2 php jails, one with php8.0 and one with 8.1. All the jails are connected to bridge0 and all the jails use vnet.
>> >>>>
>> >>>> I believe this worked on an older 14-HEAD machine, but I did not do a lot with it back then. When I started testing again after updating the OS, I noticed that one of the varnish jails lost its network connection after running for a few hours. I thought it was just something on HEAD, so I never really looked at it. But later on, when I started using the jails again and was testing a test wordpress site, I noticed that with a simple load test my haproxy jail loses its network connection within one minute. I see nothing in the logs, neither on the host nor in the jail.
>> >>>> From the jail I cannot ping the other jails or the IP address of the bridge. I can however ping the jail's own IP address. From the host I also cannot ping the haproxy jail's IP address. If I start a tcpdump on the haproxy jail's epair "a" interface, I do see the packets arrive, but not in the jail.
>> >>>>
>> >>>> I used ZFS to send all the jails to a 13-STABLE machine and copied over the jail.conf file as well as the pf.conf file, and I saw the same behavior.
>> >>>>
>> >>>> Then I tried 13.0-RELEASE-p7, and on that machine I do not see this happening. There I can stress test the machine for 10 minutes without a problem, but on 14-HEAD and 13-STABLE the jail's network connection fails within a minute and only a restart of the jail brings it back online, only to exhibit the same behavior again as soon as I start a simple load test which it should handle nicely.
>> >>>>
>> >>>> One of the jail hosts is running under VMware and the other is running under Ubuntu with KVM. The 13.0-RELEASE-p7 jail host is running under Ubuntu with KVM.
>> >>>>
>> >>>> Thank you for your time.
>> >>>> regards
>> >>>> Johan
>> >>>>
>> >>> I did some bisecting and the latest commit that works on FreeBSD 13-STABLE is 009a56b2e.
>> >>> Commit 2e0bee4c7 ("if_epair: implement fanout") and everything after it shows the symptoms described above.
>> >>>
>> >> Interestingly I cannot reproduce stalls in simple epair setups.
>> >> It would be useful if you could reduce the setup with the problem into a minimal configuration so we can figure out what other factors are involved.
>> >
>> > If there are clear instructions on how to reproduce, I’m happy to help experimenting (I’m relying heavily on epair at this point).
>> >
>> > @Kristof: Did you try on bare metal or on VMs?
>> >
>> Both.
>>
>> Kristof
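
As a starting point for a shared test setup, and to help answer whether pf, haproxy or the bridge are needed at all, a stripped-down reproduction attempt could look roughly like the sketch below: a single epair with one end in a vnet jail, no bridge, no pf and no haproxy, with iperf3 generating the load. This is an illustration under assumptions, not the setup anyone in this thread actually ran: the jail name, epair unit 99, the 10.99.0.0/24 addresses and path=/ (reusing the host's userland) are all made up, and iperf3 is assumed to be installed on the host. With DRY_RUN=1 the functions only print the commands instead of running them.

```shell
#!/bin/sh
# Sketch: smallest epair + vnet jail load test (no bridge, no pf, no haproxy).
# Every name and address below is an illustrative assumption; adjust as needed.

JAIL="epairtest"       # hypothetical jail name
UNIT="99"              # hypothetical epair unit -> epair99a / epair99b
HOST_IP="10.99.0.1"    # assumed address for the host side (epair99a)
JAIL_IP="10.99.0.2"    # assumed address for the jail side (epair99b)
DURATION="300"         # seconds of load, same length as the 300s hey run

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Create the epair, give the "b" side to a vnet jail, start an iperf3 server.
setup() {
    run ifconfig "epair${UNIT}" create
    run ifconfig "epair${UNIT}a" inet "${HOST_IP}/24" up
    # path=/ reuses the host userland so no separate jail filesystem is needed.
    run jail -c name="${JAIL}" path=/ vnet persist vnet.interface="epair${UNIT}b"
    run jexec "${JAIL}" ifconfig "epair${UNIT}b" inet "${JAIL_IP}/24" up
    run jexec "${JAIL}" iperf3 -s -D
}

# Push traffic through the epair, then check whether the jail still answers.
load() {
    run iperf3 -c "${JAIL_IP}" -t "${DURATION}"
    run ping -c 5 "${JAIL_IP}"
}

# Remove the jail and destroy the epair (destroying one side removes both).
teardown() {
    run jail -r "${JAIL}"
    run ifconfig "epair${UNIT}a" destroy
}
```

Source the file as root on a test machine, then call setup, load and teardown in turn. If the final ping in load still gets replies, the next step would be to add the bridge, then pf, then haproxy back one at a time until the stall shows up.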
