Date:      Thu, 10 Mar 2022 01:40:39 +0100
From:      Johan Hendriks <joh.hendriks@gmail.com>
To:        freebsd-net@freebsd.org
Subject:   Re: epair and vnet jail loose connection.
Message-ID:  <CAOaKuAURAMmT5=grPsavVUeeBa3sJPrr884i3BFVKBjZawFcLQ@mail.gmail.com>
In-Reply-To: <051d51b6-2a07-fbc6-7b4d-13947e7fcdbb@gmail.com>
References:  <051d51b6-2a07-fbc6-7b4d-13947e7fcdbb@gmail.com>


I remembered that it used to work, so I thought: let's go back in time.
So I did a git reset --hard 375fdb6e161ea78a957314efeecd5ee0654a2793, which
is a commit from January 1, 2022.

[root]@[jhost001] -
[ ~ ] > uname -a
FreeBSD jhost001 13.0-STABLE FreeBSD 13.0-STABLE #0
stable/13-n248793-375fdb6e161: Thu Mar 10 00:11:19 CET 2022
root@jhost001:/usr/obj/usr/src/amd64.amd64/sys/GENERIC  amd64

With this version I do not see the jails go down, so the problem comes from
something that was committed after 2022-01-01.
I will try to rebuild it a couple more times at later commits and see when
it breaks.
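
The rebuild-and-test search described above could be narrowed with git bisect. What follows is only a minimal sketch, not what was actually run here. It assumes the source tree lives in /usr/src, that the tip of stable/13 shows the problem, that the GENERIC kernel config is used, and that the regression is in the kernel so world does not need rebuilding. The function names are hypothetical.

```shell
#!/bin/sh
# Sketch only: paths, branch name and function names are assumptions.
SRC=${SRC:-/usr/src}                            # assumed source tree location
GOOD=375fdb6e161ea78a957314efeecd5ee0654a2793   # commit reported as working
BAD=${BAD:-stable/13}                           # assumed: branch tip shows the problem

# Rough number of build/test rounds needed to bisect N commits (ceil(log2 N)).
bisect_steps() {
    n=$1
    steps=0
    while [ "$n" -gt 1 ]; do
        n=$(( (n + 1) / 2 ))
        steps=$(( steps + 1 ))
    done
    echo "$steps"
}

# Count the commits between the good and the bad revision.
commits_to_search() {
    git -C "$SRC" rev-list --count "$GOOD..$BAD"
}

# Start the bisection; git checks out a commit between good and bad.
start_bisect() {
    git -C "$SRC" bisect start "$BAD" "$GOOD"
}

# Build and install the kernel for the commit git has checked out.
# Assumes a kernel-only regression, so buildworld is skipped.
build_kernel() {
    make -C "$SRC" -j"$(sysctl -n hw.ncpu)" buildkernel KERNCONF=GENERIC &&
    make -C "$SRC" installkernel KERNCONF=GENERIC
}
```

After each reboot and load test, the result would be recorded with `git -C /usr/src bisect good` or `git -C /usr/src bisect bad`, and `git -C /usr/src bisect reset` returns to the original branch when done. As a rough guide, `bisect_steps "$(commits_to_search)"` estimates how many rebuilds are needed; about a thousand commits come out at ten rounds.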



On Wed, 9 Mar 2022 at 20:55, Johan Hendriks <joh.hendriks@gmail.com> wrote:

> The problem:
> I have a FreeBSD 14 machine and a FreeBSD 13-STABLE machine, both
> running the same jails just to test how they work.
>
> The jails that are running are a Salt master, an HAProxy jail, 2 web
> servers, 2 Varnish servers, and 2 PHP jails, one with PHP 8.0 and one
> with 8.1. All the jails are connected to bridge0 and all of them use
> vnet.
>
> I believe this worked on an older 14-HEAD machine, but I did not do a
> lot with it back then. When I started testing again after updating the
> OS, I noticed that one of the Varnish jails lost its network connection
> after running for a few hours. I thought it was just something on HEAD,
> so I never really looked at it. But later on, when I started using the
> jails again and testing a test WordPress site, I noticed that with a
> simple load test my HAProxy jail loses its network connection within
> one minute. I see nothing in the logs, either on the host or in the
> jail.
> From the jail I cannot ping the other jails or the IP address of the
> bridge. I can, however, ping the jail's own IP address. From the host I
> also cannot ping the haproxy jail's IP address. If I start a tcpdump on
> the epaira interface belonging to the haproxy jail, I do see the
> packets arrive there, but they do not show up in the jail.
>
> I used ZFS to send all the jails to a 13-STABLE machine and copied over
> the jail.conf file as well as the pf.conf file, and I saw the same
> behavior.
>
> Then I tried 13.0-RELEASE-p7, and on that machine I do not see this
> happening. There I can stress test the machine for 10 minutes without a
> problem, but on 14-HEAD and 13-STABLE the jail's network connection
> fails within a minute. Only a restart of the jail brings it back
> online, and it then shows the same behavior as soon as I start a simple
> load test, which it should handle nicely.
>
> One of the jail hosts is running under VMware and the other is running
> under Ubuntu with KVM. The 13.0-RELEASE-p7 jail host is running under
> Ubuntu with KVM.
>
> Thank you for your time.
> Regards,
> Johan
>
>
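
The check described in the quoted report (packets visible on the host-side epair end but not inside the jail) could be repeated roughly as follows. This is a sketch under stated assumptions: the jail name, the interface names and the IP address are placeholders, the host side is assumed to be the "a" end of the pair, and tcpdump is assumed to be available inside the jail. The function names are hypothetical.

```shell
#!/bin/sh
# Sketch only: jail name, interface names and address are assumptions.
JAIL=${JAIL:-haproxy}            # assumed jail name
HOST_IF=${HOST_IF:-epair0a}      # assumed host-side end, member of bridge0
JAIL_IP=${JAIL_IP:-192.0.2.10}   # placeholder address (documentation range)

# epair(4) interfaces are created in pairs named epairNa and epairNb;
# print the other end of the given one.
epair_peer() {
    case "$1" in
        *a) echo "${1%a}b" ;;
        *b) echo "${1%b}a" ;;
        *)  echo "not an epair end: $1" >&2; return 1 ;;
    esac
}

# Capture a few ICMP packets on the host-side end of the pair.
capture_host_side() {
    tcpdump -n -c 10 -i "$HOST_IF" icmp
}

# Capture on the jail-side end, from inside the jail's vnet.
capture_jail_side() {
    jexec "$JAIL" tcpdump -n -c 10 -i "$(epair_peer "$HOST_IF")" icmp
}

# Generate test traffic from the host towards the jail.
ping_jail() {
    ping -c 5 "$JAIL_IP"
}
```

Running `capture_host_side` and `capture_jail_side` in two terminals while `ping_jail` runs in a third would show whether echo requests reach one end of the pair without appearing on the other, which is the symptom reported above.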



