From owner-freebsd-hackers@FreeBSD.ORG Thu Aug 18 14:31:28 2011 Return-Path: Delivered-To: freebsd-hackers@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E2C4F106566B; Thu, 18 Aug 2011 14:31:28 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id B50648FC1F; Thu, 18 Aug 2011 14:31:27 +0000 (UTC) Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id RAA15396; Thu, 18 Aug 2011 17:31:23 +0300 (EEST) (envelope-from avg@FreeBSD.org) Message-ID: <4E4D222E.2090802@FreeBSD.org> Date: Thu, 18 Aug 2011 17:31:10 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:5.0) Gecko/20110705 Thunderbird/5.0 MIME-Version: 1.0 To: Steven Hartland , freebsd-jail@FreeBSD.org References: uk> <4E4CD98C.1000301@FreeBSD.org> <4E4CF347.6030908@FreeBSD.org> In-Reply-To: <4E4CF347.6030908@FreeBSD.org> X-Enigmail-Version: 1.2pre Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Cc: freebsd-hackers , freebsd-stable@FreeBSD.org Subject: Re: debugging frequent kernel panics on 8.2-RELEASE X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 18 Aug 2011 14:31:29 -0000 on 18/08/2011 14:11 Andriy Gapon said the following: > Probably I have mistakenly assumed that the 'prison' in prison_derefer() has > something to do with an actual jail, while it could have been just prison0 where > all non-jailed processes belong. So, indeed: (kgdb) p $2->p_ucred->cr_prison $10 = (struct prison *) 0xffffffff807d5080 (kgdb) p &prison0 $11 = (struct prison *) 0xffffffff807d5080 (kgdb) p *$2->p_ucred->cr_prison $12 = {pr_list = {tqe_next = 0x0, tqe_prev = 0x0}, pr_id = 0, pr_ref = 398, pr_uref = 0, pr_flags = 386, pr_children = {lh_first = 0x0}, pr_sibling = {le_next = 0x0, le_prev = 0x0}, pr_parent = 0x0, pr_mtx = {lock_object = {lo_name = 0xffffffff8063007c "jail mutex", lo_flags = 16973824, lo_data = 0, lo_witness = 0x0}, mtx_lock = 4}, pr_task = {ta_link = {stqe_next = 0x0}, ta_pending = 0, ta_priority = 0, ta_func = 0, ta_context = 0x0}, pr_osd = {osd_nslots = 0, osd_slots = 0x0, osd_next = {le_next = 0x0, le_prev = 0x0}}, pr_cpuset = 0xffffff0012d65dc8, pr_vnet = 0x0, pr_root = 0xffffff00166ebce8, pr_ip4s = 0, pr_ip6s = 0, pr_ip4 = 0x0, pr_ip6 = 0x0, pr_sparep = {0x0, 0x0, 0x0, 0x0}, pr_childcount = 0, pr_childmax = 999999, pr_allow = 127, pr_securelevel = -1, pr_enforce_statfs = 0, pr_spare = {0, 0, 0, 0, 0}, pr_hostid = 3251597242, pr_name = "0", '\0' , pr_path = "/", '\0' , pr_hostname = "censored", '\0' , pr_domainname = '\0' , pr_hostuuid = "54443842-0054-2500-902c-0025902c3cb0", '\0' } Also, let's consider this code: if (flags & PD_DEUREF) { for (tpr = pr;; tpr = tpr->pr_parent) { if (tpr != pr) mtx_lock(&tpr->pr_mtx); if (--tpr->pr_uref > 0) break; KASSERT(tpr != &prison0, ("prison0 pr_uref=0")); mtx_unlock(&tpr->pr_mtx); } /* Done if there were only user references to remove. */ if (!(flags & PD_DEREF)) { mtx_unlock(&tpr->pr_mtx); if (flags & PD_LIST_SLOCKED) sx_sunlock(&allprison_lock); else if (flags & PD_LIST_XLOCKED) sx_xunlock(&allprison_lock); return; } if (tpr != pr) { mtx_unlock(&tpr->pr_mtx); mtx_lock(&pr->pr_mtx); } } The most suspicious thing is that pr_uref is zero in the debug data. With INVARIANTS we would hit the "prison0 pr_uref=0" KASSERT. Then, because this is prison0 and because pr_uref reached zero, tpr gets assigned to NULL. And then because tpr != pr we try to execute mtx_unlock(&tpr->pr_mtx). That's where the NULL pointer deref happens. So, now the big question is how/why we reached pr_uref == 0. -- Andriy Gapon