Date: Wed, 02 Nov 2016 08:16:04 -0600
From: James Gritton <jamie@gritton.org>
To: jail@freebsd.org
Subject: Re: Debugging jails in dying state
Message-ID: <1c2ce2d106246aa2b0d00c4c7387489c@gritton.org>
In-Reply-To: <20161102124521.i57bpmp3w3ql333h@ivaldir.etoilebsd.net>
References: <20161102124521.i57bpmp3w3ql333h@ivaldir.etoilebsd.net>

On 2016-11-02 06:45, Baptiste Daroussin wrote:
> Is there a way to debug/trace what a jail is doing in dying state.
>
> I have a couple of jails that takes very long in dying state even after
> all processes and tcp connections are dead.
>
> I can't find a way to figure what it is waiting for.
>
> Any clue?
>
> By very long I mean up to 20min!

A dead prison has a nonzero pr_ref (and a zero pr_uref), so that's the
field you want to keep an eye on. The functions that change it are
prison_hold[_locked] and prison_free[_locked]. If you're actually
running a kernel debugger (which I've never done outside of a crash
dump), you should be able to catch a stack trace on prison_free to see
who's finally letting the last reference go.

It turns out that there are very few places that call these functions
on anything besides prison0 (and nothing outside of kern_jail.c twiddles
the field directly): moving interfaces around between vimage jails,
setting up a zfs zone, and crcopy/crfree. It's that last one that
you'll need to trace, because of course creds are everywhere, and
anything that holds on to a jail until some future point is doing that
by holding on to a cred.

So the good news is that you can use whatever tools you already have in
your possession to trace creds. And the bad news is that creds are
everywhere :-).

Aside from TCP connections, NFS seems to be a common cause of a jail
lingering in the dying state. And perhaps ZFS - I don't recall.

- Jamie
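
The reference-counting rule described above can be illustrated with a
small toy model. This is only a sketch in Python, not kernel code: the
Prison and Cred classes, the "holders" bookkeeping, and the owner names
"process" and "nfs-mount" are all hypothetical, and the model assumes
that each cred takes exactly one pr_ref and that only live processes
contribute to pr_uref.

```python
from collections import Counter


class Prison:
    """Toy stand-in for struct prison: only the two counters from the text."""

    def __init__(self, name):
        self.name = name
        self.pr_ref = 0    # every reference, including kernel-internal ones
        self.pr_uref = 0   # user-visible references (here: live processes)
        self.holders = Counter()  # hypothetical bookkeeping, not in the kernel

    def hold(self, who):
        # Models prison_hold: take one reference and remember who took it.
        self.pr_ref += 1
        self.holders[who] += 1

    def free(self, who):
        # Models prison_free: drop one reference previously taken by `who`.
        if self.holders[who] <= 0:
            raise ValueError(f"{who} holds no reference on {self.name}")
        self.pr_ref -= 1
        self.holders[who] -= 1
        if self.holders[who] == 0:
            del self.holders[who]

    def state(self):
        if self.pr_uref > 0:
            return "alive"
        # Zero pr_uref but nonzero pr_ref is the dying state.
        return "dying" if self.pr_ref > 0 else "gone"


class Cred:
    """Toy credential: holding a cred means holding a prison reference."""

    def __init__(self, prison, owner):
        self.prison = prison
        self.owner = owner
        prison.hold(owner)

    def release(self):
        self.prison.free(self.owner)


def run_demo():
    jail = Prison("demo")
    jail.pr_uref += 1                     # one running process in the jail
    proc_cred = Cred(jail, "process")
    nfs_cred = Cred(jail, "nfs-mount")    # hypothetical long-lived holder
    states = [jail.state()]

    jail.pr_uref -= 1                     # the process exits...
    proc_cred.release()                   # ...and drops its cred
    states.append(jail.state())
    lingering = dict(jail.holders)        # who still pins the jail

    nfs_cred.release()                    # the last reference goes away
    states.append(jail.state())
    return states, lingering


if __name__ == "__main__":
    print(run_demo())
```

In this model the jail stays in the dying state for exactly as long as
some cred is still held, which is why tracing creds is the way to find
out what a dying jail is waiting for.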