Date:      Sat, 20 Aug 2011 13:02:25 +0300
From:      Andriy Gapon <avg@FreeBSD.org>
To:        Steven Hartland <killing@multiplay.co.uk>
Cc:        freebsd-jail@FreeBSD.org, freebsd-stable@FreeBSD.org
Subject:   Re: debugging frequent kernel panics on 8.2-RELEASE
Message-ID:  <4E4F8631.1070300@FreeBSD.org>
In-Reply-To: <4019027648B5493AAC4B654BD821DE88@multiplay.co.uk>
References:  <47F0D04ADF034695BC8B0AC166553371@multiplay.co.uk><A71C3ACF01EC4D36871E49805C1A5321@multiplay.co.uk><4E4380C0.7070908@FreeBSD.org><EBC06A239BAB4B3293C28D793329F9CA@multiplay.co.uk><4E43E272.1060204@FreeBSD.org><62BF25D0ED914876BEE75E2ADF28DDF7@multiplay.co.uk><4E440865.1040500@FreeBSD.org><6F08A8DE780545ADB9FA93B0A8AA4DA1@multiplay.co.uk><4E441314.6060606@FreeBSD.org><2C4B0D05C8924F24A73B56EA652FA4B0@multiplay.co.uk><4E48D967.9060804@FreeBSD.org><9D034F992B064E8092E5D1D249B3E959@multiplay.co.uk><4E490DAF.1080009@FreeBSD.org><796FD5A096DE4558B57338A8FA1E125B@multiplay.co.uk><4E491D01.1090902@FreeBSD.org><570C5495A5E242F7946E806CA7AC5D68@multiplay.co.uk><4E4AD35C.7020504@FreeBSD.org><6A7238AED44542A880B082A40304D940@multiplay.co.uk><4E4BA21F.6010805@FreeBSD.org><581C95046B0948FC82D6F2E86948F87B@multiplay.co.uk><4E4BBA7F.30907@FreeBSD.org><88A6CE3E8B174E0694A3A9A5283479B4@multiplay.co.uk> <4E4C22D6.6070407@FreeBSD.org> <4019027648B5493AAC4B654BD821DE88@multiplay.co.uk>

on 18/08/2011 02:15 Steven Hartland said the following:
> In a nutshell the jail manager we're using will attempt to resurrect the jail
> from a dying state in a few specific scenarios.
> 
> Here's an example:-
> 1. jail restart requested
> 2. jail is stopped, so the java process is killed off, but active tcp sessions
> may prevent the timely full shutdown of the jail.
> 3. if an existing jail is detected, i.e. a dying jail from #2, instead of
> starting a new jail we attach to the old one and exec the new java process.
> 4. if an existing jail isn't detected, i.e. where there were no hanging tcp
> sessions and #2 cleanly shut down the jail, a new jail is created, attached to
> and the java exec'ed.
> 
> The system uses static jailids, so it's possible to determine if an existing
> jail for this "service" exists or not. This prevents duplicate services as
> well as making services easy to identify by their jailid.
> 
> So what we could be seeing is a race between the jail shutdown and the attach
> of the new process?

Not a jail expert at all, but a few suggestions...

First, wouldn't the 'persist' jail option simplify your life a little bit?

Second, you may want to monitor the value of the prison0.pr_uref variable (e.g.
via kgdb) while executing the various scenarios you run now.  If, after
finishing a certain scenario, you end up with a value lower than at the start of
that scenario, then this is the troublesome one.
Please note that prison0.pr_uref is composed of the number of non-jailed
processes plus the number of top-level jails.  So take this into account when
comparing prison0.pr_uref values: it's better to record the initial value when
no jails are started, and it's important to keep the number of non-jailed
processes the same (or to account for its changes).
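
To make the bookkeeping concrete, here is a minimal sketch of the comparison
described above.  It only does the arithmetic: the pr_uref values, process
counts and jail counts are assumed to be collected by hand (for instance by
printing prison0.pr_uref in kgdb), and the names Sample, uref_drift and
is_troublesome are hypothetical helpers, not part of any FreeBSD tool.  It
compares changes rather than absolute values, so it does not rely on pr_uref
matching the process and jail counts exactly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One manual observation taken before or after a scenario."""
    pr_uref: int          # prison0.pr_uref as read by hand (e.g. via kgdb)
    host_procs: int       # non-jailed processes at that moment
    top_level_jails: int  # top-level jails at that moment


def uref_drift(before: Sample, after: Sample) -> int:
    """Change in pr_uref that is NOT explained by the change in the
    number of non-jailed processes and top-level jails."""
    explained = ((after.host_procs - before.host_procs)
                 + (after.top_level_jails - before.top_level_jails))
    return (after.pr_uref - before.pr_uref) - explained


def is_troublesome(before: Sample, after: Sample) -> bool:
    """A scenario is suspect if pr_uref ended up lower than accounted for."""
    return uref_drift(before, after) < 0


if __name__ == "__main__":
    # Made-up numbers, purely for illustration.
    baseline = Sample(pr_uref=60, host_procs=60, top_level_jails=0)
    after_restart = Sample(pr_uref=59, host_procs=60, top_level_jails=0)
    print(uref_drift(baseline, after_restart))
    print(is_troublesome(baseline, after_restart))
```

Taking the baseline sample with no jails running, as suggested above, keeps
the top_level_jails term at zero and leaves only the process count to track.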

-- 
Andriy Gapon


