From: Attilio Rao
To: Andriy Gapon
Cc: freebsd-hackers@freebsd.org, freebsd-stable@freebsd.org
Date: Thu, 18 Aug 2011 22:11:44 +0200
Subject: Re: debugging frequent kernel panics on 8.2-RELEASE

2011/8/18 Andriy Gapon:
> on 17/08/2011 23:21 Andriy Gapon said the following:
>>
>> It seems like everything starts with some kind of a race between terminating
>> processes in a jail and termination of the jail itself.  This is where the
>> details are very thin so far.  What we see is that a process (http) is in
>> exit(2) syscall, in exit1() function actually, and past the place where P_WEXIT
>> flag is set and even past the place where p_limit is freed and reset to NULL.
>> At that place the thread calls prison_proc_free(), which calls prison_deref().
>> Then, we see that in prison_deref() the thread gets a page fault because of what
>> seems like a NULL pointer dereference.  That's just the start of the problem and
>> its root cause.
>>
>> Then, trap_pfault() gets invoked and, because addresses close to NULL look like
>> userspace addresses, vm_fault/vm_fault_hold gets called, which in its turn goes
>> on to call vm_map_growstack.
>> First thing that vm_map_growstack does is a call to lim_cur(), but because
>> p_limit is already NULL, that call results in a NULL pointer dereference and
>> a page fault.  Goto the beginning of this paragraph.
>>
>> So we get this recursion of sorts, which only ends when a stack is exhausted and
>> a CPU generates a double-fault.
>
> BTW, does anyone have an idea why the thread in question would "disappear" from
> the kgdb's point of view?
>
> (kgdb) p cpuid_to_pcpu[2]->pc_curthread->td_tid
> $3 = 102057
> (kgdb) tid 102057
> invalid tid
>
> info threads also doesn't list the thread.
>
> Is it because the panic happened while the thread was somewhere in exit1()?
> Is there an easy way to examine its stack in this case?

Yes, that is likely the reason. The 'tid' command should look the tid up in the
tid_to_thread() table (or similar name), which returns NULL. That means the
thread is already past the point where it was still in the lookup table.

Attilio

-- 
Peace can only be achieved by understanding - A. Einstein
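
P.S. To illustrate the explanation above, here is a toy userland sketch in C.
It is not FreeBSD kernel or kgdb code, and every name in it (toy_thread,
toy_tid_insert, toy_tid_remove, toy_tid_lookup) is hypothetical. It only
models the general idea: a thread structure can still exist and still be
reachable through a direct pointer (as pc_curthread is in the session above),
while a lookup by tid fails because the thread was already unlinked from the
lookup table.

```c
/* Toy userland model, NOT FreeBSD code: all names here are hypothetical. */
#include <assert.h>
#include <stddef.h>

#define TOY_TID_BUCKETS 64

struct toy_thread {
    int tid;
    struct toy_thread *hash_next;   /* next thread in the same bucket */
};

static struct toy_thread *toy_tid_hash[TOY_TID_BUCKETS];

static unsigned toy_bucket(int tid)
{
    return (unsigned)tid % TOY_TID_BUCKETS;
}

/* Register a thread so that lookups by tid can find it. */
static void toy_tid_insert(struct toy_thread *td)
{
    assert(td != NULL);
    unsigned b = toy_bucket(td->tid);
    td->hash_next = toy_tid_hash[b];
    toy_tid_hash[b] = td;
}

/* Unlink a thread from the table; the structure itself stays allocated. */
static void toy_tid_remove(struct toy_thread *td)
{
    assert(td != NULL);
    struct toy_thread **pp = &toy_tid_hash[toy_bucket(td->tid)];
    while (*pp != NULL) {
        if (*pp == td) {
            *pp = td->hash_next;
            td->hash_next = NULL;
            return;
        }
        pp = &(*pp)->hash_next;
    }
}

/* Return the thread with this tid, or NULL if it is not (or no longer) listed. */
static struct toy_thread *toy_tid_lookup(int tid)
{
    struct toy_thread *td = toy_tid_hash[toy_bucket(tid)];
    while (td != NULL) {
        if (td->tid == tid)
            return td;
        td = td->hash_next;
    }
    return NULL;
}
```

In this model, a caller that still holds a pointer to the structure can read
its tid after toy_tid_remove(), but toy_tid_lookup() on that tid returns NULL,
which is the kind of situation that would produce an "invalid tid" style
answer from a debugger command built on such a lookup.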