Skip site navigation (1)Skip section navigation (2)
Date:      Sun, 21 Aug 2011 12:01:34 +0100
From:      "Steven Hartland" <killing@multiplay.co.uk>
To:        "Jamie Gritton" <jamie@FreeBSD.org>
Cc:        "Bjoern A. Zeeb" <bz@FreeBSD.org>, freebsd-jail@FreeBSD.org, freebsd-stable@FreeBSD.org, Andriy Gapon <avg@FreeBSD.org>
Subject:   Re: debugging frequent kernel panics on 8.2-RELEASE
Message-ID:  <A83D4882F48A432591DC9C05DEF226A0@multiplay.co.uk>
References:  eBSD.org><82E865FBA30747078AF6EE3C1701F973@multiplay.co.uk><4E4FE55A.9000101@FreeBSD.org> <B367F6CE151B4B6A8B0E048022D7F8AB@multiplay.co.uk><D34B2BC140E14E8A8D450474CCC39CB8@multiplay.co.uk> <4E501A6A.3030801@FreeBSD.org> <3D52B19B71CA49A4BB3BCCDF25961F46@multiplay.co.uk> <4E50ADC5.6050403@FreeBSD.org>

next in thread | previous in thread | raw e-mail | index | archive | help
----- Original Message ----- 
From: "Jamie Gritton" <jamie@FreeBSD.org>
>>>>> In essence I think we can get the following flow where 1# = process1
>>>>> and 2# = process2
>>>>> 1#1. prison1.pr_uref = 1 (single process jail)
>>>>> 1#2. prison_deref( prison1,...
>>>>> 1#3. prison1.pr_uref-- (prison1.pr_uref = 0)
>>>>> 1#3. prison1.mtx_unlock <-- this now allows others to change
>>>>> prison1.pr_uref
>>>>> 1#3. prison0.pr_uref--
>>>>> 2#1. process1.attach( prison1 ) (prison1.pr_uref = 1)
>>>>> 2#2. process1.exit
>>>>> 2#3. prison_deref( prison1,...
>>>>> 2#4. prison1.pr_uref-- (prison1.pr_uref = 0)
>>>>> 2#5. prison1.mtx_unlock <-- this now allows others to change
>>>>> prison1.pr_uref
>>>>> 2#5. prison0.pr_uref-- (prison1.pr_ref has now been decremented
>>>>> twice by prison1)

First off thanks for the feedback Jamie most appreciated :)

> The problem isn't with the conditional locking of tpr in prison_deref.
> That locking is actually correct, and there's no race condition.

Are you sure? I do think that unlocking the mtx half way through the
call allows the above scenario to create a race condition, all be it
very briefly, when ignoring the overriding issue.

In addition if the code where changed to so that the pr_uref++ also
maintained the parents uref this would definitely lead to a potential
problems in my mind, especially if you had more than one child prison,
of a given parent, entering the dying state at any one time.

In this case I believe you would have to acquire the locks of all
the parent prisons before it would be safe to precede.

> The trouble lies in the resurrection of dead jails, as Andriy has noted
> (though not just attaching, but also by setting its persist flag causes
> the same problem).

I not sure that persistent prisons actually suffer from this in any
different way tbh, as they have an additional uref increment so would
never hit this case unless they have been actively removed and hence
unpersisted first.


> There are two possible fixes to this. One is the patch you've given,
> which only decrements a parent jail's pr_uref when the child jail
> completely goes away (as opposed to when it loses its last uref). This
> provides symmetry with the current way pr_uref is incremented on the
> parent, which is only when a jail is created.
> 
> The other fix is to increment a parent's pr_uref when a jail is
> resurrected, which will match the current logic in prison_deref. I like
> the external semantics of this solution: a jail isn't visible if it is
> not persistent and has no processes and no *visible* sub-jails, as
> opposed to having no sub-jails at all. But this solution ends up pretty
> complicated - there are a few places where pr_uref is incremented, where
> I might need to increment parent jails' pr_uref as well, much like the
> current tpr loop in prison_deref decrements them.

Ahh yes in the hierarchical case my patch would indeed mean that none
persistent parent jails would remain visible even when its last child
jail is in a dying state.

As you say making this not the case would likely require replacing all
instances of pr_uref++ with a prison_uref method that implements the
opposite of the loop in prison_dref should the prisons pr_uref be 0 when
called.

> Your solution removes code instead of adding it, which is generally a
> good thing. While it does change the semantics of pr_uref in the
> hierarchical case at least from what I thought it was, those semantics
> haven't been working properly anyway.

Good to know my interpretation was correct, even if I was missing the
visibility factor in the hierarchical case :)

> Bjoern, I'm adding you to the CC list for this because the whole pr_uref
> thing was your idea (though it was pr_nprocs at the time), so you might
> care about the hierarchical semantics of it - or you may not. Also, this
> is a panic-inducing bug in current and may interest you for that reason.

>From an admin perspective the current jail dying state does cause
confusion when your not aware of its existence. You ask a jail to stop it
appears to have completed that request, but really hasn't, an generally
due to just a lingering tcp connection.

With the introduction of hierarchical jails that gets a little worse
where a whole series of jails could disappear from normal view only to
be resurrected shortly after. Something to bear in mind when deciding
which solution of the two presented to use.

    Regards
    Steve

================================================
This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it. 

In the event of misdirection, illegible or incomplete transmission please telephone +44 845 868 1337
or return the E.mail to postmaster@multiplay.co.uk.




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?A83D4882F48A432591DC9C05DEF226A0>