From owner-freebsd-stable@FreeBSD.ORG Wed Aug 17 20:21:48 2011 Return-Path: Delivered-To: freebsd-stable@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id DA4E8106567A; Wed, 17 Aug 2011 20:21:47 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 339038FC20; Wed, 17 Aug 2011 20:21:45 +0000 (UTC) Received: from porto.starpoint.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id XAA00935; Wed, 17 Aug 2011 23:21:43 +0300 (EEST) (envelope-from avg@FreeBSD.org) Received: from localhost ([127.0.0.1]) by porto.starpoint.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1QtmcR-000Du9-7E; Wed, 17 Aug 2011 23:21:43 +0300 Message-ID: <4E4C22D6.6070407@FreeBSD.org> Date: Wed, 17 Aug 2011 23:21:42 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:6.0) Gecko/20110817 Thunderbird/6.0 MIME-Version: 1.0 To: freebsd-jail@FreeBSD.org, freebsd-hackers@FreeBSD.org References: <47F0D04ADF034695BC8B0AC166553371@multiplay.co.uk><4E4380C0.7070908@FreeBSD.org> <4E43E272.1060204@FreeBSD.org> <62BF25D0ED914876BEE75E2ADF28DDF7@multiplay.co.uk> <4E440865.1040500@FreeBSD.org> <6F08A8DE780545ADB9FA93B0A8AA4DA1@multiplay.co.uk> <4E441314.6060606@FreeBSD.org> <2C4B0D05C8924F24A73B56EA652FA4B0@multiplay.co.uk> <4E48D967.9060804@FreeBSD.org> <9D034F992B064E8092E5D1D249B3E959@multiplay.co.uk> <4E490DAF.1080009@FreeBSD.org> <796FD5A096DE4558B57338A8FA1E125B@multiplay.co.uk> <4E491D01.1090902@FreeBSD.org> <570C5495A5E242F7946E806CA7AC5D68@multiplay.co.uk> <4E4AD35C.7020504@FreeBSD.org> <6A7238AED44542A880B082A40304D940@multiplay.co.uk> <4E4BA21F.6010805@FreeBSD.org> <581C95046B0948FC82D6F2E86948F87B@multiplay.co.uk> <4E4BBA7F.30907@FreeBSD.org> <88A6CE3E8B174E0694A3A9A5283479B4@multiplay.co.uk> In-Reply-To: <88A6CE3E8B174E0694A3A9A5283479B4@multiplay.co.uk> X-Enigmail-Version: 1.2.1 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-stable@FreeBSD.org, Steven Hartland Subject: Re: debugging frequent kernel panics on 8.2-RELEASE X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 17 Aug 2011 20:21:48 -0000 Thanks to the debug that Steven provided and to the help that I received from Kostik, I think that now I understand the basic mechanics of this panic, but, unfortunately, not the details of its root cause. It seems like everything starts with some kind of a race between terminating processes in a jail and termination of the jail itself. This is where the details are very thin so far. What we see is that a process (http) is in exit(2) syscall, in exit1() function actually, and past the place where P_WEXIT flag is set and even past the place where p_limit is freed and reset to NULL. At that place the thread calls prison_proc_free(), which calls prison_deref(). Then, we see that in prison_deref() the thread gets a page fault because of what seems like a NULL pointer dereference. That's just the start of the problem and its root cause. Then, trap_pfault() gets invoked and, because addresses close to NULL look like userspace addresses, vm_fault/vm_fault_hold gets called, which in its turn goes on to call vm_map_growstack. First thing that vm_map_growstack does is a call to lim_cur(), but because p_limit is already NULL, that call results in a NULL pointer dereference and a page fault. Goto the beginning of this paragraph. So we get this recursion of sorts, which only ends when a stack is exhausted and a CPU generates a double-fault. So, of course, Steven is interested in finding and fixing the root cause. I hope we will get to that with some help from the "prison guards" :-) But I also would like to use this opportunity to discuss how we can make it easier to debug such issue as this. I think that this problem demonstrates that when we treat certain junk in kernel address value as a userland address value, we throw additional heaps of irrelevant stuff on top of an actual problem. One solution could be to use a special flag that would mark all actual attempts to access userland address (e.g. setting the flag on entrance to copyin and clearing it upon return), so that in the page fault handler we could distinguish actual faults on userland addresses from faults on garbage kernel addresses. I am sure that there could be other clever techniques to catch such garbage addresses early. -- Andriy Gapon