Date: Sat, 20 Aug 2011 23:34:50 +0300 From: Andriy Gapon <avg@FreeBSD.org> To: Steven Hartland <killing@multiplay.co.uk> Cc: freebsd-jail@FreeBSD.org, freebsd-stable@FreeBSD.org Subject: Re: debugging frequent kernel panics on 8.2-RELEASE Message-ID: <4E501A6A.3030801@FreeBSD.org> In-Reply-To: <D34B2BC140E14E8A8D450474CCC39CB8@multiplay.co.uk> References: eBSD.org><82E865FBA30747078AF6EE3C1701F973@multiplay.co.uk><4E4FE55A.9000101@ FreeBSD.org> <B367F6CE151B4B6A8B0E048022D7F8AB@multiplay.co.uk> <D34B2BC140E14E8A8D450474CCC39CB8@multiplay.co.uk>
next in thread | previous in thread | raw e-mail | index | archive | help
on 20/08/2011 23:24 Steven Hartland said the following: > ----- Original Message ----- From: "Steven Hartland" <killing@multiplay.co.uk> >> Looking through the code I believe I may have noticed a scenario which could >> trigger the problem. >> >> Given the following code:- >> >> static void >> prison_deref(struct prison *pr, int flags) >> { >> struct prison *ppr, *tpr; >> int vfslocked; >> >> if (!(flags & PD_LOCKED)) >> mtx_lock(&pr->pr_mtx); >> /* Decrement the user references in a separate loop. */ >> if (flags & PD_DEUREF) { >> for (tpr = pr;; tpr = tpr->pr_parent) { >> if (tpr != pr) >> mtx_lock(&tpr->pr_mtx); >> if (--tpr->pr_uref > 0) >> break; >> KASSERT(tpr != &prison0, ("prison0 pr_uref=0")); >> mtx_unlock(&tpr->pr_mtx); >> } >> /* Done if there were only user references to remove. */ >> if (!(flags & PD_DEREF)) { >> mtx_unlock(&tpr->pr_mtx); >> if (flags & PD_LIST_SLOCKED) >> sx_sunlock(&allprison_lock); >> else if (flags & PD_LIST_XLOCKED) >> sx_xunlock(&allprison_lock); >> return; >> } >> if (tpr != pr) { >> mtx_unlock(&tpr->pr_mtx); >> mtx_lock(&pr->pr_mtx); >> } >> } >> >> If you take a scenario of a simple one level prison setup running a single >> process >> where a prison has just been stopped. >> >> In the above code pr_uref of the processes prison is decremented. As this is the >> last process then pr_uref will hit 0 and the loop continues instead of breaking >> early. >> >> Now at the end of the loop iteration the mtx is unlocked so other process can >> now manipulate the jail, this is where I think the problem may be. >> >> If we now have another process come in and attach to the jail but then instantly >> exit, this process may allow another kernel thread to hit this same bit of code >> and so two process for the same prison get into the section which decrements >> prison0's pr_uref, instead of only one. >> >> In essence I think we can get the following flow where 1# = process1 >> and 2# = process2 >> 1#1. prison1.pr_uref = 1 (single process jail) >> 1#2. prison_deref( prison1,... >> 1#3. prison1.pr_uref-- (prison1.pr_uref = 0) >> 1#3. prison1.mtx_unlock <-- this now allows others to change prison1.pr_uref >> 1#3. prison0.pr_uref-- >> 2#1. process1.attach( prison1 ) (prison1.pr_uref = 1) >> 2#2. process1.exit >> 2#3. prison_deref( prison1,... >> 2#4. prison1.pr_uref-- (prison1.pr_uref = 0) >> 2#5. prison1.mtx_unlock <-- this now allows others to change prison1.pr_uref >> 2#5. prison0.pr_uref-- (prison1.pr_ref has now been decremented twice by prison1) >> >> It seems like the action on the parent prison to decrement the pr_uref is >> happening too early, while the jail can still be used and without the lock on >> the child jails mtx, so causing a race condition. >> >> I think the fix is to the move the decrement of parent prison pr_uref's down >> so it only takes place if the jail is "really" being removed. Either that or >> to change the locking semantics so that once the lock is aquired in this >> prison_deref its not unlocked until the function completes. >> >> What do people think? > > After reviewing the changes to prison_deref in commit which added hierarchical > jails, the removal of the lock by the inital loop on the passed in prison may > be unintentional. > http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/kern/kern_jail.c.diff?r1=1.101;r2=1.102;f=h > > > If so the following may be all that's needed to fix this issue:- > > diff -u sys/kern/kern_jail.c.orig sys/kern/kern_jail.c > --- sys/kern/kern_jail.c.orig 2011-08-20 21:17:14.856618854 +0100 > +++ sys/kern/kern_jail.c 2011-08-20 21:18:35.307201425 +0100 > @@ -2455,7 +2455,8 @@ > if (--tpr->pr_uref > 0) > break; > KASSERT(tpr != &prison0, ("prison0 pr_uref=0")); > - mtx_unlock(&tpr->pr_mtx); > + if (tpr != pr) > + mtx_unlock(&tpr->pr_mtx); > } > /* Done if there were only user references to remove. */ > if (!(flags & PD_DEREF)) { Not sure if this would fly as is - please double check the later block where pr->pr_mtx is re-locked. -- Andriy Gapon
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?4E501A6A.3030801>