Date: Fri, 16 Dec 2022 18:30:57 +0800
From: Zhenlei Huang <zlei.huang@gmail.com>
To: "Bjoern A. Zeeb" <bz@FreeBSD.org>
Cc: "freebsd-jail@freebsd.org" <freebsd-jail@FreeBSD.org>, Gleb Smirnoff <glebius@FreeBSD.org>
Subject: Re: What's going on with vnets and epairs w/ addresses?
Message-ID: <150A60D6-6757-46DD-988F-05A9FFA36821@FreeBSD.org>
In-Reply-To: <B6C70A88-11F8-40D7-85E4-63BBA0F7931D@FreeBSD.org>
References: <5r22os7n-ro15-27q-r356-rps331o06so5@mnoonqbm.arg> <B6C70A88-11F8-40D7-85E4-63BBA0F7931D@FreeBSD.org>

Hi,

I managed to reproduce this issue on CURRENT/14 with this small snippet:

-------------------------------------------
#!/bin/sh

# test jail name
n="test_ref_leak"

jail -c name=$n path=/ vnet persist
# The following line triggers the jail pr_ref leak
jexec $n ifconfig lo0 inet 127.0.0.1/8

jail -R $n

# wait a moment
sleep 1

jls -j $n
-------------------------------------------

After DDB debugging and tracing, it seems that this is triggered by a
combination of [1] and [2].

[1] https://reviews.freebsd.org/rGfec8a8c7cbe4384c7e61d376f3aa5be5ac895915
[2] https://reviews.freebsd.org/rGeb93b99d698674e3b1cc7139fda98e2b175b8c5b

In [1] the per-VNET uma zone is shared with the global one.
`pcbinfo->ipi_zone = pcbstor->ips_zone;`

In [2] the unref of `inp->inp_cred` is deferred: it is called in
inpcb_dtor() via uma_zfree_smr().

Unfortunately inps freed by uma_zfree_smr() are cached and inpcb_dtor()
is not called immediately, thus leaking the `inp->inp_cred` ref and hence
`prison->pr_ref`.

And it is also not possible to free up the cache from the per-VNET
SYSUNINITs tcp_destroy / udp_destroy / rip_destroy.


Best regards,
Zhenlei

> On Dec 14, 2022, at 9:56 AM, Zhenlei Huang <zlei@FreeBSD.org> wrote:
>
> Hi,
>
> I also encountered this problem while testing gif tunnels between jails.
>
> My script is similar but with additional gif tunnels.
>
> There are reports on the mailing lists [1], [2], and another one in the
> forum [3].
>
> Seems to be a long-standing issue.
>
> [1] https://lists.freebsd.org/pipermail/freebsd-stable/2016-October/086126.html
> [2] https://lists.freebsd.org/pipermail/freebsd-jail/2017-March/003357.html
> [3] https://forums.freebsd.org/threads/jails-stopping-prolonged-deaths-starting-networking-et-cetera.84200/
>
> Best regards,
> Zhenlei
>
>> On Dec 14, 2022, at 7:03 AM, Bjoern A. Zeeb <bz@FreeBSD.org> wrote:
>>
>> Hi,
>>
>> I have used scripts like the below for almost a decade and a half
>> (obviously doing more than that in the middle).  I haven't used them
>> much lately but given other questions I just wanted to fire up a test.
>>
>> I have an end-November kernel; doing the below, my epairs do not come back
>> to be destroyed (immediately).
>> I have to start polling for the jid to be no longer alive and not in
>> dying state (hence added the jls/ifconfig -l lines and removed the
>> error checking from ifconfig destroy).  That seems sometimes rather
>> unreasonably long (to the point I give up).
>>
>> If I don't configure the addresses below this isn't a problem.
>>
>> Sorry I am confused by too many incarnations of the code; I know I once
>> had a version with an async shutdown path but I believe that never made
>> it into mainline, so why are we holding onto the epairs now and not
>> nuking the addresses and returning them and are clean?
>>
>> It's a bit more funny; I added a twiddle loop at the end and nothing
>> happened.  So I stop the script and start it again and suddenly another
>> jail or two have cleaned up and their epairs are back.  Something feels
>> very very wonky.  Play around with this and see ... and let me know if
>> you can reproduce this... I quite wonder why some test cases haven't
>> gone crazy ...
>>
>> /bz
>>
>> ------------------------------------------------------------------------
>> #!/bin/sh
>>
>> set -e
>> set -x
>>
>> js=`jail -i -c -n jl host.hostname=left.example.net vnet persist`
>> jb=`jail -i -c -n jr host.hostname=right.example.net vnet persist`
>>
>> # Create an epair connecting the two machines (vnet jails).
>> ep=`ifconfig epair create | sed -e 's/a$//'`
>>
>> # Add one end to each vnet jail.
>> ifconfig ${ep}a vnet ${js}
>> ifconfig ${ep}b vnet ${jb}
>>
>> # Add an IP address on the epairs in each vnet jail.
>> # XXX Leave these out and the cleanup seems to work fine.
>> jexec ${js} ifconfig ${ep}a inet 192.0.2.1/24
>> jexec ${jb} ifconfig ${ep}b inet 192.0.2.2/24
>>
>> # Clean up.
>> jail -r ${jb}
>> jail -r ${js}
>>
>> # You want to be able to remove this line ...
>> set +e
>>
>> # No epairs to destroy with addresses configured; fine otherwise.
>> ifconfig ${ep}a destroy
>> # echo $?
>>
>> # Add this is here only as things are funny ...
>> # jls -av jid dying
>> # ifconfig -l
>>
>> # end
>> ------------------------------------------------------------------------
>>
>> -- 
>> Bjoern A. Zeeb                                                     r15:7
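
The polling Bjoern describes (waiting until the jid is neither alive nor
dying) could be sketched roughly as below. This is only an illustration, not
something taken from either script in the thread: the function name
wait_jail_gone, the JLS override variable and the retry/interval defaults are
all made up for the example, and it assumes that `jls -d -j <name>` exits
non-zero once the jail is gone, including from the dying state.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): poll until a jail is no
# longer listed, not even as dying.
# JLS can be overridden so the loop can be exercised without a jail host.
: "${JLS:=jls}"

# usage: wait_jail_gone <name-or-jid> [tries] [interval-seconds]
wait_jail_gone() {
    wjg_name=$1
    wjg_tries=${2:-30}
    wjg_interval=${3:-1}
    while [ "$wjg_tries" -gt 0 ]; do
        # Assumption: "jls -d" also lists dying jails and "-j" selects one;
        # a non-zero exit status is taken to mean the jail is fully gone.
        if ! $JLS -d -j "$wjg_name" >/dev/null 2>&1; then
            return 0
        fi
        wjg_tries=$((wjg_tries - 1))
        sleep "$wjg_interval"
    done
    # Still present after all retries; with the pr_ref leak described
    # above this may well be the normal outcome.
    return 1
}
```

In the reproduction snippet one might call `wait_jail_gone "$n" 60` after
`jail -R $n` instead of the fixed `sleep 1`. The bounded retry count matters
here, since with the leaked reference the jail may never leave the dying
state.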