Date:      Fri, 16 Dec 2022 18:30:57 +0800
From:      Zhenlei Huang <zlei.huang@gmail.com>
To:        "Bjoern A. Zeeb" <bz@FreeBSD.org>
Cc:        "freebsd-jail@freebsd.org" <freebsd-jail@FreeBSD.org>, Gleb Smirnoff <glebius@FreeBSD.org>
Subject:   Re: What's going on with vnets and epairs w/ addresses?
Message-ID:  <150A60D6-6757-46DD-988F-05A9FFA36821@FreeBSD.org>
In-Reply-To: <B6C70A88-11F8-40D7-85E4-63BBA0F7931D@FreeBSD.org>
References:  <5r22os7n-ro15-27q-r356-rps331o06so5@mnoonqbm.arg> <B6C70A88-11F8-40D7-85E4-63BBA0F7931D@FreeBSD.org>


Hi,

I managed to reproduce this issue on CURRENT/14 with this small snippet:

-------------------------------------------
#!/bin/sh

# test jail name
n="test_ref_leak"

jail -c name=$n path=/ vnet persist
# The following line triggers the jail pr_ref leak
jexec $n ifconfig lo0 inet 127.0.0.1/8

jail -R $n

# wait a moment
sleep 1

jls -j $n
-------------------------------------------
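
A fixed `sleep 1` may be too short or too long, so the final check can be turned into a bounded polling loop. The following is only a sketch: `wait_gone` is a hypothetical helper name, and the retry count and delay are arbitrary. It runs a probe command repeatedly and reports whether the probe eventually starts failing.

```sh
#!/bin/sh

# wait_gone TRIES DELAY CMD [ARG...]   (hypothetical helper, not a system tool)
# Run CMD up to TRIES times, sleeping DELAY seconds between attempts.
# Returns 0 as soon as CMD fails (the object is gone),
# returns 1 if CMD still succeeds on the last attempt (the object lingers).
wait_gone() {
    tries=$1
    delay=$2
    shift 2
    while [ "$tries" -gt 0 ]; do
        if ! "$@" >/dev/null 2>&1; then
            return 0
        fi
        tries=$((tries - 1))
        if [ "$tries" -gt 0 ]; then
            sleep "$delay"
        fi
    done
    return 1
}
```

For the script above one might call it as `wait_gone 10 1 jls -d -j "$n" || echo "jail $n still present"`, assuming `jls -d` (which also lists dying jails) exits non-zero once the jail is really gone.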


After debugging and tracing with DDB, it seems that this is triggered by a
combination of [1] and [2]:

[1] https://reviews.freebsd.org/rGfec8a8c7cbe4384c7e61d376f3aa5be5ac895915
[2] https://reviews.freebsd.org/rGeb93b99d698674e3b1cc7139fda98e2b175b8c5b


In [1], the per-VNET UMA zone is shared with the global one:
`pcbinfo->ipi_zone = pcbstor->ips_zone;`

In [2], the unref of `inp->inp_cred` is deferred: it is now done in
inpcb_dtor(), which is reached through uma_zfree_smr().

Unfortunately, inps freed by uma_zfree_smr() are cached and inpcb_dtor()
is not called immediately, thus leaking the `inp->inp_cred` ref and hence
`prison->pr_ref`.

It is also not possible to free up the cache from the per-VNET SYSUNINITs
tcp_destroy / udp_destroy / rip_destroy.
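
To make the sequence concrete, here is a toy model in sh. It is not kernel code, and every name in it (`inp_alloc`, `inp_free_deferred`, `cache_drain`, `pr_ref`, `cache`) is a hypothetical stand-in. It only shows how a reference that is dropped in a destructor stays held for as long as the freed item sits in a cache.

```sh
#!/bin/sh
# Toy model only: none of these names are kernel APIs.

pr_ref=0   # stands in for prison->pr_ref
cache=0    # freed-but-cached items whose destructor has not run yet

# Allocating an inp takes a reference (via the credential) on the prison.
inp_alloc() { pr_ref=$((pr_ref + 1)); }

# Freeing only parks the item in the cache; the destructor is deferred.
inp_free_deferred() { cache=$((cache + 1)); }

# Only when the cache is drained does the destructor drop the reference.
cache_drain() {
    while [ "$cache" -gt 0 ]; do
        pr_ref=$((pr_ref - 1))
        cache=$((cache - 1))
    done
}

inp_alloc
inp_free_deferred
echo "after free:  pr_ref=$pr_ref cached=$cache"
cache_drain
echo "after drain: pr_ref=$pr_ref cached=$cache"
```

In this model the reference count stays at 1 after the free and only drops to 0 once something drains the cache, which is the step that the per-VNET teardown cannot force.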



Best regards,
Zhenlei

> On Dec 14, 2022, at 9:56 AM, Zhenlei Huang <zlei@FreeBSD.org> wrote:
>
> Hi,
>
> I also encountered this problem while testing a gif tunnel between jails.
>
> My script is similar but with additional gif tunnels.
>
> There are reports on the mailing lists [1], [2], and another one in the forum [3].
>
> This seems to be a long-standing issue.
>
> [1] https://lists.freebsd.org/pipermail/freebsd-stable/2016-October/086126.html
> [2] https://lists.freebsd.org/pipermail/freebsd-jail/2017-March/003357.html
> [3] https://forums.freebsd.org/threads/jails-stopping-prolonged-deaths-starting-networking-et-cetera.84200/
>
> Best regards,
> Zhenlei
>
>> On Dec 14, 2022, at 7:03 AM, Bjoern A. Zeeb <bz@FreeBSD.org> wrote:
>>
>> Hi,
>>
>> I have used scripts like the below for almost a decade and a half
>> (obviously doing more than that in the middle).  I haven't used them
>> much lately but given other questions I just wanted to fire up a test.
>>
>> I have an end-November kernel; doing the below, my epairs do not come
>> back to be destroyed (immediately).
>> I have to start polling for the jid to be no longer alive and not in
>> dying state (hence added the jls/ifconfig -l lines and removed the
>> error checking from ifconfig destroy).  That seems sometimes rather
>> unreasonably long (to the point I give up).
>>
>> If I don't configure the addresses below this isn't a problem.
>>
>> Sorry I am confused by too many incarnations of the code; I know I once
>> had a version with an async shutdown path but I believe that never made
>> it into mainline, so why are we holding onto the epairs now and not
>> nuking the addresses, returning them, and being clean?
>>
>> It's a bit more funny; I added a twiddle loop at the end and nothing
>> happened.  So I stop the script and start it again and suddenly another
>> jail or two have cleaned up and their epairs are back.  Something feels
>> very very wonky.  Play around with this and see ... and let me know if
>> you can reproduce this...  I quite wonder why some test cases haven't
>> gone crazy ...
>>
>> /bz
>>
>> ------------------------------------------------------------------------
>> #!/bin/sh
>>
>> set -e
>> set -x
>>
>> js=`jail -i -c -n jl host.hostname=left.example.net vnet persist`
>> jb=`jail -i -c -n jr host.hostname=right.example.net vnet persist`
>>
>> # Create an epair connecting the two machines (vnet jails).
>> ep=`ifconfig epair create | sed -e 's/a$//'`
>>
>> # Add one end to each vnet jail.
>> ifconfig ${ep}a vnet ${js}
>> ifconfig ${ep}b vnet ${jb}
>>
>> # Add an IP address on the epairs in each vnet jail.
>> # XXX Leave these out and the cleanup seems to work fine.
>> jexec ${js}  ifconfig ${ep}a inet  192.0.2.1/24
>> jexec ${jb}  ifconfig ${ep}b inet  192.0.2.2/24
>>
>> # Clean up.
>> jail -r ${jb}
>> jail -r ${js}
>>
>> # You want to be able to remove this line ...
>> set +e
>>
>> # No epairs to destroy with addresses configured; fine otherwise.
>> ifconfig ${ep}a destroy
>> # echo $?
>>
>> # Add this is here only as things are funny ...
>> # jls -av jid dying
>> # ifconfig -l
>>
>> # end
>> ------------------------------------------------------------------------
>>
>> -- 
>> Bjoern A. Zeeb                                                     r15:7
>>
>




