Date: Fri, 16 Dec 2022 22:55:41 +0000 (UTC)
From: "Bjoern A. Zeeb" <bz@FreeBSD.org>
To: Zhenlei Huang <zlei.huang@gmail.com>
Cc: "freebsd-jail@freebsd.org" <freebsd-jail@FreeBSD.org>, Gleb Smirnoff <glebius@FreeBSD.org>
Subject: Re: What's going on with vnets and epairs w/ addresses?
Message-ID: <1348s3p2-783s-sno2-pp6-rs9oq0s921n@SerrOFQ.bet>
In-Reply-To: <150A60D6-6757-46DD-988F-05A9FFA36821@FreeBSD.org>
References: <5r22os7n-ro15-27q-r356-rps331o06so5@mnoonqbm.arg> <B6C70A88-11F8-40D7-85E4-63BBA0F7931D@FreeBSD.org> <150A60D6-6757-46DD-988F-05A9FFA36821@FreeBSD.org>
On Fri, 16 Dec 2022, Zhenlei Huang wrote:

Hi,

> I managed to repeat this issue on CURRENT/14 with this small snip:
>
> -------------------------------------------
> #!/bin/sh
>
> # test jail name
> n="test_ref_leak"
>
> jail -c name=$n path=/ vnet persist
> # The following line triggers the jail pr_ref leak
> jexec $n ifconfig lo0 inet 127.0.0.1/8
>
> jail -R $n
>
> # wait a moment
> sleep 1
>
> jls -j $n
>
> -------------------------------------------
>
> After DDB debugging and tracing, it seems that it is triggered by a combination of [1] and [2].
>
> [1] https://reviews.freebsd.org/rGfec8a8c7cbe4384c7e61d376f3aa5be5ac895915
> [2] https://reviews.freebsd.org/rGeb93b99d698674e3b1cc7139fda98e2b175b8c5b
>
> In [1] the per-VNET uma zone is shared with the global one.
> `pcbinfo->ipi_zone = pcbstor->ips_zone;`
>
> In [2] the unref of `inp->inp_cred` is deferred and called in inpcb_dtor() by uma_zfree_smr().
>
> Unfortunately inps freed by uma_zfree_smr() are cached and inpcb_dtor() is not called immediately,
> thus leaking the `inp->inp_cred` ref and hence `prison->pr_ref`.
>
> And it is also not possible to free up the cache by per-VNET SYSUNINIT tcp_destroy / udp_destroy / rip_destroy.

Thanks a lot for tracking it down. That seems to be a regression then
that needs to be fixed before 14.0-RELEASE happens, as it will break
people's management utilities.

Could you open a bug report and flag it as such?

/bz

>
>
> Best regards,
> Zhenlei
>
>> On Dec 14, 2022, at 9:56 AM, Zhenlei Huang <zlei@FreeBSD.org> wrote:
>>
>>
>> Hi,
>>
>> I also encountered this problem while testing a gif tunnel between jails.
>>
>> My script is similar but with additional gif tunnels.
>>
>>
>> There are reports on the mailing lists [1], [2], and another one in the forum [3].
>>
>> Seems to be a long-standing issue.
>>
>> [1] https://lists.freebsd.org/pipermail/freebsd-stable/2016-October/086126.html
>> [2] https://lists.freebsd.org/pipermail/freebsd-jail/2017-March/003357.html
>> [3] https://forums.freebsd.org/threads/jails-stopping-prolonged-deaths-starting-networking-et-cetera.84200/
>>
>>
>> Best regards,
>> Zhenlei
>>
>>> On Dec 14, 2022, at 7:03 AM, Bjoern A. Zeeb <bz@FreeBSD.org> wrote:
>>>
>>> Hi,
>>>
>>> I have used scripts like the below for almost a decade and a half
>>> (obviously doing more than that in the middle). I haven't used them
>>> much lately but given other questions I just wanted to fire up a test.
>>>
>>> I have an end-November kernel; doing the below, my epairs do not come back
>>> to be destroyed (immediately).
>>> I have to start polling for the jid to be no longer alive and not in
>>> dying state (hence added the jls/ifconfig -l lines and removed the
>>> error checking from ifconfig destroy). That seems sometimes rather
>>> unreasonably long (to the point I give up).
>>>
>>> If I don't configure the addresses below this isn't a problem.
>>>
>>> Sorry I am confused by too many incarnations of the code; I know I once
>>> had a version with an async shutdown path but I believe that never made
>>> it into mainline, so why are we holding onto the epairs now and not
>>> nuking the addresses and returning them and are clean?
>>>
>>> It's a bit more funny; I added a twiddle loop at the end and nothing
>>> happened. So I stop the script and start it again and suddenly another
>>> jail or two have cleaned up and their epairs are back. Something feels
>>> very very wonky. Play around with this and see ... and let me know if
>>> you can reproduce this...
>>> I quite wonder why some test cases haven't
>>> gone crazy ...
>>>
>>> /bz
>>>
>>> ------------------------------------------------------------------------
>>> #!/bin/sh
>>>
>>> set -e
>>> set -x
>>>
>>> js=`jail -i -c -n jl host.hostname=left.example.net vnet persist`
>>> jb=`jail -i -c -n jr host.hostname=right.example.net vnet persist`
>>>
>>> # Create an epair connecting the two machines (vnet jails).
>>> ep=`ifconfig epair create | sed -e 's/a$//'`
>>>
>>> # Add one end to each vnet jail.
>>> ifconfig ${ep}a vnet ${js}
>>> ifconfig ${ep}b vnet ${jb}
>>>
>>> # Add an IP address on the epairs in each vnet jail.
>>> # XXX Leave these out and the cleanup seems to work fine.
>>> jexec ${js} ifconfig ${ep}a inet 192.0.2.1/24
>>> jexec ${jb} ifconfig ${ep}b inet 192.0.2.2/24
>>>
>>> # Clean up.
>>> jail -r ${jb}
>>> jail -r ${js}
>>>
>>> # You want to be able to remove this line ...
>>> set +e
>>>
>>> # No epairs to destroy with addresses configured; fine otherwise.
>>> ifconfig ${ep}a destroy
>>> # echo $?
>>>
>>> # Add this is here only as things are funny ...
>>> # jls -av jid dying
>>> # ifconfig -l
>>>
>>> # end
>>> ------------------------------------------------------------------------
>>>
>>> --
>>> Bjoern A. Zeeb                                                     r15:7
>>>
>>
>
>

--
Bjoern A. Zeeb                                                     r15:7
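
The polling workaround described in the thread (waiting until the jid is neither alive nor dying before destroying the epair) could be sketched roughly as below. This is only an illustration, not code from either poster: the helper name `wait_gone` and its arguments are made up here, and the commented-out invocation assumes that `jls -d -j <name>` exits non-zero once the jail is no longer listed even among dying jails.

```shell
#!/bin/sh

# wait_gone (hypothetical helper): run a probe command repeatedly until it
# fails, which is taken to mean "the jail is no longer listed".
# usage: wait_gone <max_tries> <delay_seconds> <probe command...>
# Returns 0 once the probe fails, 1 if it still succeeds after max_tries.
wait_gone() {
    wg_tries=$1
    wg_delay=$2
    shift 2
    while [ "$wg_tries" -gt 0 ]; do
        if ! "$@" >/dev/null 2>&1; then
            return 0        # probe failed: jail is gone
        fi
        wg_tries=$((wg_tries - 1))
        sleep "$wg_delay"
    done
    return 1                # gave up: jail still present (possibly dying)
}

# Assumed usage on a FreeBSD host, with the jail name from the small
# reproduction script quoted above:
# wait_gone 30 1 jls -d -j test_ref_leak || echo "jail still dying"
```

With the leak described above the probe may never fail, so the bounded retry count is what keeps such a script from hanging.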