Message-ID: <4E4D717F.3090802@FreeBSD.org>
Date: Thu, 18 Aug 2011 23:09:35 +0300
From: Andriy Gapon
To: freebsd-hackers@FreeBSD.org
In-Reply-To: <4E4C22D6.6070407@FreeBSD.org>
Cc: freebsd-stable@FreeBSD.org
Subject: Re: debugging frequent kernel panics on 8.2-RELEASE
List-Id: Technical Discussions relating to FreeBSD

on 17/08/2011 23:21 Andriy Gapon said the following:
> It seems like everything starts with some kind of a race between terminating
> processes in a jail and termination of the jail itself. This is where the
> details are very thin so far. What we see is that a process (http) is in
> exit(2) syscall, in exit1() function actually, and past the place where P_WEXIT
> flag is set and even past the place where p_limit is freed and reset to NULL.
> At that place the thread calls prison_proc_free(), which calls prison_deref().
> Then, we see that in prison_deref() the thread gets a page fault because of what
> seems like a NULL pointer dereference. That's just the start of the problem and
> its root cause.
>
> Then, trap_pfault() gets invoked and, because addresses close to NULL look like
> userspace addresses, vm_fault/vm_fault_hold gets called, which in its turn goes
> on to call vm_map_growstack. First thing that vm_map_growstack does is a call
> to lim_cur(), but because p_limit is already NULL, that call results in a NULL
> pointer dereference and a page fault. Go to the beginning of this paragraph.
>
> So we get this recursion of sorts, which only ends when a stack is exhausted and
> a CPU generates a double-fault.

BTW, does anyone have an idea why the thread in question would "disappear" from
kgdb's point of view?

(kgdb) p cpuid_to_pcpu[2]->pc_curthread->td_tid
$3 = 102057
(kgdb) tid 102057
invalid tid

info threads also doesn't list the thread.

Is it because the panic happened while the thread was somewhere in exit1()?
Is there an easy way to examine its stack in this case?

-- 
Andriy Gapon