From: Kristof Provost <kp@FreeBSD.org>
To: Johan Hendriks
Cc: Michael Gmelin, freebsd-net@freebsd.org, Patrick M. Hausen
Subject: Re: epair and vnet jail loose connection.
Date: Sat, 12 Mar 2022 17:47:15 -0600
Message-ID: <94B8885D-F63F-40C3-9E7E-158CC252FF9A@FreeBSD.org>
List-Id: Networking and TCP/IP with FreeBSD
List-Archive: https://lists.freebsd.org/archives/freebsd-net

I’m still failing to reproduce.

Is pf absolutely required to trigger the issue? Is haproxy (i.e. can you trigger it with iperf)?
Is the bridge strictly required?
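
These three questions amount to a small factor matrix. As a rough sketch only (the factor names are labels chosen here for illustration, and nothing below touches a real jail host), the setups worth retesting first are those that drop exactly one factor from the failing configuration:

```python
from itertools import product

# Factors named in the questions above; True means "present in the test setup".
FACTORS = ("pf", "haproxy", "bridge")


def configurations():
    """Yield every on/off combination of the factors, fullest setup first."""
    for flags in product((True, False), repeat=len(FACTORS)):
        yield dict(zip(FACTORS, flags))


def single_removals():
    """Return the setups that differ from the reported one by exactly one factor."""
    return [c for c in configurations() if list(c.values()).count(False) == 1]


if __name__ == "__main__":
    for cfg in single_removals():
        removed = next(name for name, present in cfg.items() if not present)
        print(f"retest without {removed}: {cfg}")
```

If the stall disappears in exactly one of these three runs, that factor is the likely trigger; if it survives all three, the remaining combinations narrow it down further.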

Kristof

On 12 Mar 2022, at 8:18, Johan Hendriks wrote:

For me, this minimal setup reproduces the network drop-off on the haproxy server.

Two jails: one with haproxy, and one with nginx, which serves the following HTML file.

<!DOCTYPE html>
<html>
<head>
<title>Page Title</title>
</head>
<body>

<h1>My First Heading</h1>
<p>My first paragraph.</p>

</body>
</html>

From a remote machine I run: hey -h2 -n 10 -c 10 -z 300s https://wp.test.nl
A ping from the jail host to the haproxy jail then shows the following:

[ /] > ping 10.233.185.20
PING 10.233.185.20 (10.233.185.20): 56 data bytes
64 bytes from 10.233.185.20: icmp_seq=0 ttl=64 time=0.054 ms
64 bytes from 10.233.185.20: icmp_seq=1 ttl=64 time=0.050 ms
64 bytes from 10.233.185.20: icmp_seq=2 ttl=64 time=0.041 ms
<SNIP>
64 bytes from 10.233.185.20: icmp_seq=169 ttl=64 time=0.050 ms
64 bytes from 10.233.185.20: icmp_seq=170 ttl=64 time=0.154 ms
64 bytes from 10.233.185.20: icmp_seq=171 ttl=64 time=0.054 ms
64 bytes from 10.233.185.20: icmp_seq=172 ttl=64 time=0.039 ms
64 bytes from 10.233.185.20: icmp_seq=173 ttl=64 time=0.160 ms
64 bytes from 10.233.185.20: icmp_seq=174 ttl=64 time=0.045 ms
^C
--- 10.233.185.20 ping statistics ---
335 packets transmitted, 175 packets received, 47.8% packet loss
round-trip min/avg/max/stddev = 0.037/0.070/0.251/0.040 ms
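
The loss figure is easier to interpret together with the last answered sequence number. Below is a small sketch that pulls both out of ping output; the function name and the shortened sample text are illustrative choices, not part of any tool. Assuming the snipped part of the output has no gaps, replies 0 through 174 account for all 175 received packets, so the link did not degrade gradually: it stopped answering after roughly 175 seconds and the remaining 160 probes were lost.

```python
import re

SUMMARY_RE = re.compile(r"(\d+) packets transmitted, (\d+) packets received")
SEQ_RE = re.compile(r"icmp_seq=(\d+)")


def summarize_ping(output: str) -> dict:
    """Extract loss figures and the last answered sequence number from ping output."""
    match = SUMMARY_RE.search(output)
    if match is None:
        raise ValueError("no ping summary line found")
    sent, received = int(match.group(1)), int(match.group(2))
    if sent == 0:
        raise ValueError("no packets were transmitted")
    seqs = [int(s) for s in SEQ_RE.findall(output)]
    return {
        "sent": sent,
        "received": received,
        "lost": sent - received,
        "loss_pct": round(100 * (sent - received) / sent, 1),
        "last_seq": max(seqs) if seqs else None,
    }


# Shortened copy of the output above: only the last two replies and the summary.
SAMPLE = """\
64 bytes from 10.233.185.20: icmp_seq=173 ttl=64 time=0.160 ms
64 bytes from 10.233.185.20: icmp_seq=174 ttl=64 time=0.045 ms
--- 10.233.185.20 ping statistics ---
335 packets transmitted, 175 packets received, 47.8% packet loss
"""

if __name__ == "__main__":
    print(summarize_ping(SAMPLE))
```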


ifconfig
vtnet0: flags=8963<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4c00bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWTSO,LINKSTATE,TXCSUM_IPV6>
        ether 56:16:e9:80:5e:41
        inet 87.233.191.146 netmask 0xfffffff0 broadcast 87.233.191.159
        inet 87.233.191.156 netmask 0xffffffff broadcast 87.233.191.156
        inet 87.233.191.155 netmask 0xffffffff broadcast 87.233.191.155
        inet 87.233.191.154 netmask 0xffffffff broadcast 87.233.191.154
        media: Ethernet autoselect (10Gbase-T <full-duplex>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
vtnet1: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4c07bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWTSO,LINKSTATE,TXCSUM_IPV6>
        ether 56:16:2c:64:32:35
        media: Ethernet autoselect (10Gbase-T <full-duplex>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3
        inet 127.0.0.1 netmask 0xff000000
        groups: lo
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
bridge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether 58:9c:fc:10:ff:82
        inet 10.233.185.1 netmask 0xffffff00 broadcast 10.233.185.255
        id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
        maxage 20 holdcnt 6 proto rstp maxaddr 2000 timeout 1200
        root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
        member: epair20a flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 7 priority 128 path cost 2000
        member: epair18a flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 15 priority 128 path cost 2000
        groups: bridge
        nd6 options=9<PERFORMNUD,IFDISABLED>
bridge1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether 58:9c:fc:10:d9:1a
        id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
        maxage 20 holdcnt 6 proto rstp maxaddr 2000 timeout 1200
        root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
        member: vtnet0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 1 priority 128 path cost 2000
        groups: bridge
        nd6 options=9<PERFORMNUD,IFDISABLED>
pflog0: flags=141<UP,RUNNING,PROMISC> metric 0 mtu 33160
        groups: pflog
epair18a: flags=8963<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        description: jail_web01
        options=8<VLAN_MTU>
        ether 02:77:ea:19:c7:0a
        groups: epair
        media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
epair20a: flags=8963<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        description: jail_haproxy
        options=8<VLAN_MTU>
        ether 02:9b:93:8c:59:0a
        groups: epair
        media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
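
ifconfig prints netmasks in hex, which makes it awkward to compare them with the CIDR notation used in pf.conf. The helper below is only a convenience sketch (the function name is made up here) that converts an address and hex netmask pair into its network and broadcast address using Python's standard ipaddress module:

```python
import ipaddress


def describe(addr: str, hex_mask: str) -> tuple:
    """Return (network in CIDR form, broadcast address) for an ifconfig-style pair."""
    # int(..., 16) accepts the leading "0x" that ifconfig prints.
    mask = ipaddress.IPv4Address(int(hex_mask, 16))
    iface = ipaddress.IPv4Interface(f"{addr}/{mask}")
    return str(iface.network), str(iface.network.broadcast_address)


if __name__ == "__main__":
    # Address/netmask pairs taken from the vtnet0 and bridge0 entries above.
    for addr, mask in [
        ("87.233.191.146", "0xfffffff0"),
        ("87.233.191.155", "0xffffffff"),
        ("10.233.185.1", "0xffffff00"),
    ]:
        print(addr, mask, describe(addr, mask))
```

For bridge0 this yields 10.233.185.0/24, the same range that pf.conf excludes in its binat rules.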

jail.conf

# Global settings applied to all jails.
$domain = "test.nl";

exec.start = "/bin/sh /etc/rc";
exec.stop = "/bin/sh /etc/rc.shutdown";
exec.clean;

mount.fstab = "/storage/jails/$name.fstab";

exec.system_user  = "root";
exec.jail_user    = "root";
mount.devfs;
sysvshm="new";
sysvsem="new";
allow.raw_sockets;
allow.set_hostname = 0;
allow.sysvipc;
enforce_statfs = "2";
devfs_ruleset     = "11";

path = "/storage/jails/${name}";
host.hostname = "${name}.${domain}";


# Networking
vnet;
vnet.interface    = "vnet0";

  # Commands to run on host before jail is created
  exec.prestart  = "ifconfig epair${ip} create up description jail_${name}";
  exec.prestart  += "ifconfig epair${ip}a up";
  exec.prestart  += "ifconfig bridge0 addm epair${ip}a up";
  exec.created   = "ifconfig epair${ip}b name vnet0";

  # Commands to run in jail after it is created
  exec.start  += "/bin/sh /etc/rc";

  # commands to run in jail when jail is stopped
  exec.stop  = "/bin/sh /etc/rc.shutdown";

  # Commands to run on host when jail is stopped
  exec.poststop  = "ifconfig bridge0 deletem epair${ip}a";
  exec.poststop  += "ifconfig epair${ip}a destroy";
  persist;

web01 {
    $ip = 18;
}

haproxy {
    $ip = 20;
    mount.fstab = "";
    path = "/storage/jails/${name}";
}
}
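
To make the per-jail interface plumbing explicit, here is a sketch that expands the host-side commands from the jail.conf for both jails. It uses Python's string.Template purely as a stand-in for jail(8)'s own variable substitution, which is an approximation made here; the list and function names are illustrative.

```python
from string import Template

# Host-side command templates copied from the jail.conf above.
PRESTART = [
    "ifconfig epair${ip} create up description jail_${name}",
    "ifconfig epair${ip}a up",
    "ifconfig bridge0 addm epair${ip}a up",
]
CREATED = ["ifconfig epair${ip}b name vnet0"]
POSTSTOP = [
    "ifconfig bridge0 deletem epair${ip}a",
    "ifconfig epair${ip}a destroy",
]

# Jail name -> value of $ip, as set in the per-jail blocks.
JAILS = {"web01": 18, "haproxy": 20}


def expand(templates, name, ip):
    """Substitute ${name} and ${ip} into each command template."""
    return [Template(t).substitute(name=name, ip=ip) for t in templates]


if __name__ == "__main__":
    for name, ip in JAILS.items():
        print(f"# {name}")
        for cmd in expand(PRESTART + CREATED + POSTSTOP, name, ip):
            print(cmd)
```

For haproxy this gives epair20a as the bridge0 member on the host and epair20b renamed to vnet0 for the jail, matching the epair20a and epair18a members in the ifconfig output.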

pf.conf

#######################################################################
ext_if="vtnet0"
table <bruteforcers> persist
table <torlist> persist
table <ssh-trusted> persist file "/usr/local/etc/pf/ssh-trusted"
table <custom-block> persist file "/usr/local/etc/pf/custom-block"
table <jailnetworks> { 10.233.185.0/24, 192.168.10.0/24 }

icmp_types = "echoreq"
junk_ports="{ 135,137,138,139,445,68,67,3222,17500 }"

# Log interface
set loginterface $ext_if

# Set limits
set limit { states 40000, frags 20000, src-nodes 20000 }

scrub on $ext_if all fragment reassemble no-df random-id

# ---- Nat jails to the web
binat on $ext_if from 10.233.185.15/32 to !10.233.185.0/24 -> 87.233.191.156/32 # saltmaste
binat on $ext_if from 10.233.185.20/32 to !10.233.185.0/24 -> 87.233.191.155/32 # haproxy
binat on $ext_if from 10.233.185.22/32 to !10.233.185.0/24 -> 87.233.191.154/32 # web-comb

nat on $ext_if from <jailnetworks> to any -> ($ext_if:0)

# ---- First rule obligatory "Pass all on loopback"
pass quick on lo0 all
pass quick on bridge0 all
pass quick on bridge1 all

# ---- Block TOR exit addresses
block quick proto { tcp, udp } from <torlist> to $ext_if

# ---- Second rule "Block all in and pass all out"
block in log all
pass out all keep state

# IPv6 pass in/out all IPv6 ICMP traffic
pass in quick proto icmp6 all

# Pass all lo0
set skip on lo0

############### FIREWALL ###############################################
# ---- Block custom ip's and logs
block quick proto { tcp, udp } from <custom-block> to $ext_if

# ---- Jail ports
pass in quick on { $ext_if } proto tcp from any to 10.233.185.22 port { smtp 80 443 993 995 1956 } keep state
pass in quick on { $ext_if } proto tcp from any to 10.233.185.20 port { smtp 80 443 993 995 1956 } keep state
pass in quick on { $ext_if } proto tcp from any to 10.233.185.15 port { 4505 4506 } keep state

# ---- Allow ICMP
pass in inet proto icmp all icmp-type $icmp_types keep state
pass out inet proto icmp all icmp-type $icmp_types keep state

pass in quick on $ext_if inet proto tcp from any to $ext_if port { 80, 443 } flags S/SA keep state
pass in quick on $ext_if inet proto tcp from <ssh-trusted> to $ext_if port { 4505 4506 } flags S/SA keep state
block log quick from <bruteforcers>
pass quick proto tcp from <ssh-trusted> to $ext_if port ssh flags S/SA keep state
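
For readers less familiar with pf, the order of these rules matters: pf evaluates filter rules top to bottom, the last matching rule decides, and a matching rule marked quick ends evaluation immediately. The toy model below illustrates only that evaluation order for four rules taken from the ruleset above. It is a deliberately simplified sketch: the class, function, and packet fields are invented for illustration, and state tracking, NAT, tables, and bridge member filtering are all ignored.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    action: str                      # "pass" or "block"
    quick: bool                      # stop evaluating as soon as this rule matches
    matches: Callable[[dict], bool]  # predicate over a toy packet dict
    label: str                       # the pf.conf line this rule stands for


def evaluate(rules, packet, skip_ifaces=("lo0",)):
    """Toy pf filter evaluation: last match wins, 'quick' short-circuits."""
    if packet["iface"] in skip_ifaces:           # models "set skip on lo0"
        return "pass", "skipped interface"
    verdict, label = "pass", "implicit default"  # pf passes if nothing matches
    for rule in rules:
        if rule.matches(packet):
            verdict, label = rule.action, rule.label
            if rule.quick:
                break
    return verdict, label


RULES = [
    Rule("pass", True, lambda p: p["iface"] == "bridge0", "pass quick on bridge0 all"),
    Rule("pass", True, lambda p: p["iface"] == "bridge1", "pass quick on bridge1 all"),
    Rule("block", False, lambda p: p["dir"] == "in", "block in log all"),
    Rule("pass", False, lambda p: p["dir"] == "out", "pass out all keep state"),
]

if __name__ == "__main__":
    for pkt in ({"iface": "bridge0", "dir": "in"}, {"iface": "vtnet0", "dir": "in"}):
        print(pkt, "->", evaluate(RULES, pkt))
```

In this model, traffic seen on bridge0 is passed by the quick rule before the default block is ever reached, while unsolicited inbound traffic on vtnet0 falls through to "block in log all" unless a later rule matches.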

This is as minimal as I can get it.

Hope this helps.
regards,
Johan Hendriks


On Sat, 12 Mar 2022 at 02:10, Kristof Provost <kp@freebsd.org> wrote:
On 11 Mar 2022, at 18:55, Michael Gmelin wrote:
>> On 12. Mar 2022, at 01:21, Kristof Provost <kp@freebsd.org> wrote:
>>
>> On 11 Mar 2022, at 17:44, Johan Hendriks wrote:
>>>> On 09/03/2022 20:55, Johan Hendriks wrote:
>>>> The problem:
>>>> I have a FreeBSD 14 machine and a FreeBSD 13-STABLE machine, both running the same jails, just to test how they work.
>>>>
>>>> The jails that are running are a salt master, a haproxy jail, 2 web servers, 2 varnish servers, and 2 php jails, one for php8.0 and one for 8.1. All the jails are connected to bridge0 and all the jails use vnet.
>>>>
>>>> I believe this worked on an older 14-HEAD machine, but I did not do a lot with it back then. When I started testing again after updating the OS, I noticed that one of the varnish jails lost its network connection after running for a few hours. I thought it was just something on HEAD, so I never really looked at it. But later on, when I started using the jails again and was testing a WordPress test site, I noticed that with a simple load test my haproxy jail loses its network connection within one minute. I see nothing in the logs, either on the host or in the jail.
>>>> From the jail I cannot ping the other jails or the IP address of the bridge. I can, however, ping the jail's own IP address. From the host I also cannot ping the haproxy jail's IP address. If I start a tcpdump on the epaira interface of the haproxy jail, I do see the packets arrive, but not in the jail.
>>>>
>>>> I used ZFS to send all the jails to a 13-STABLE machine and copied over the jail.conf file as well as the pf.conf file, and I saw the same behavior.
>>>>
>>>> Then I tried 13.0-RELEASE-p7, and on that machine I do not see this happening. There I can stress test the machine for 10 minutes without a problem, but on 14-HEAD and 13-STABLE the jail's network connection fails within a minute. Only a restart of the jail brings it back online, and it then exhibits the same behavior as soon as I start a simple load test, which it should handle easily.
>>>>
>>>> One of the jail hosts is running under VMware and the other is running under Ubuntu with KVM. The 13.0-RELEASE-p7 jail host is running under Ubuntu with KVM.
>>>>
>>>> Thank you for your time.
>>>> regards
>>>> Johan
>>>>
>>> I did some bisecting, and the latest commit that works on FreeBSD 13-STABLE is 009a56b2e.
>>> Commit 2e0bee4c7 ("if_epair: implement fanout") and everything after it show the symptoms described above.
>>>
>> Interestingly I cannot reproduce stalls in simple epair setups.
>> It would be useful if you could reduce the setup with the problem into a minimal configuration so we can figure out what other factors are involved.
>
> If there are clear instructions on how to reproduce, I’m happy to help experimenting (I’m relying heavily on epair at this point).
>
> @Kristof: Did you try on bare metal or on vms?
>
Both.

Kristof