Date: Thu, 18 Aug 2011 17:31:10 +0300 From: Andriy Gapon <avg@FreeBSD.org> To: Steven Hartland <killing@multiplay.co.uk>, freebsd-jail@FreeBSD.org Cc: freebsd-hackers <freebsd-hackers@FreeBSD.org>, freebsd-stable@FreeBSD.org Subject: Re: debugging frequent kernel panics on 8.2-RELEASE Message-ID: <4E4D222E.2090802@FreeBSD.org> In-Reply-To: <4E4CF347.6030908@FreeBSD.org> References: uk> <4E4CD98C.1000301@FreeBSD.org> <F4663E06BEED4401916C0AEAA16DD40E@multiplay.co.uk> <4E4CF347.6030908@FreeBSD.org>
next in thread | previous in thread | raw e-mail | index | archive | help
on 18/08/2011 14:11 Andriy Gapon said the following: > Probably I have mistakenly assumed that the 'prison' in prison_derefer() has > something to do with an actual jail, while it could have been just prison0 where > all non-jailed processes belong. So, indeed: (kgdb) p $2->p_ucred->cr_prison $10 = (struct prison *) 0xffffffff807d5080 (kgdb) p &prison0 $11 = (struct prison *) 0xffffffff807d5080 (kgdb) p *$2->p_ucred->cr_prison $12 = {pr_list = {tqe_next = 0x0, tqe_prev = 0x0}, pr_id = 0, pr_ref = 398, pr_uref = 0, pr_flags = 386, pr_children = {lh_first = 0x0}, pr_sibling = {le_next = 0x0, le_prev = 0x0}, pr_parent = 0x0, pr_mtx = {lock_object = {lo_name = 0xffffffff8063007c "jail mutex", lo_flags = 16973824, lo_data = 0, lo_witness = 0x0}, mtx_lock = 4}, pr_task = {ta_link = {stqe_next = 0x0}, ta_pending = 0, ta_priority = 0, ta_func = 0, ta_context = 0x0}, pr_osd = {osd_nslots = 0, osd_slots = 0x0, osd_next = {le_next = 0x0, le_prev = 0x0}}, pr_cpuset = 0xffffff0012d65dc8, pr_vnet = 0x0, pr_root = 0xffffff00166ebce8, pr_ip4s = 0, pr_ip6s = 0, pr_ip4 = 0x0, pr_ip6 = 0x0, pr_sparep = {0x0, 0x0, 0x0, 0x0}, pr_childcount = 0, pr_childmax = 999999, pr_allow = 127, pr_securelevel = -1, pr_enforce_statfs = 0, pr_spare = {0, 0, 0, 0, 0}, pr_hostid = 3251597242, pr_name = "0", '\0' <repeats 254 times>, pr_path = "/", '\0' <repeats 1022 times>, pr_hostname = "censored", '\0' <repeats 231 times>, pr_domainname = '\0' <repeats 255 times>, pr_hostuuid = "54443842-0054-2500-902c-0025902c3cb0", '\0' <repeats 27 times>} Also, let's consider this code: if (flags & PD_DEUREF) { for (tpr = pr;; tpr = tpr->pr_parent) { if (tpr != pr) mtx_lock(&tpr->pr_mtx); if (--tpr->pr_uref > 0) break; KASSERT(tpr != &prison0, ("prison0 pr_uref=0")); mtx_unlock(&tpr->pr_mtx); } /* Done if there were only user references to remove. */ if (!(flags & PD_DEREF)) { mtx_unlock(&tpr->pr_mtx); if (flags & PD_LIST_SLOCKED) sx_sunlock(&allprison_lock); else if (flags & PD_LIST_XLOCKED) sx_xunlock(&allprison_lock); return; } if (tpr != pr) { mtx_unlock(&tpr->pr_mtx); mtx_lock(&pr->pr_mtx); } } The most suspicious thing is that pr_uref is zero in the debug data. With INVARIANTS we would hit the "prison0 pr_uref=0" KASSERT. Then, because this is prison0 and because pr_uref reached zero, tpr gets assigned to NULL. And then because tpr != pr we try to execute mtx_unlock(&tpr->pr_mtx). That's where the NULL pointer deref happens. So, now the big question is how/why we reached pr_uref == 0. -- Andriy Gapon
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?4E4D222E.2090802>