From owner-freebsd-stable@FreeBSD.ORG  Thu Jul 24 16:24:54 2008
Return-Path: <owner-freebsd-stable@FreeBSD.ORG>
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 34B391065678
	for <freebsd-stable@freebsd.org>; Thu, 24 Jul 2008 16:24:54 +0000 (UTC)
	(envelope-from kris@FreeBSD.org)
Received: from weak.local (freefall.freebsd.org [IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id 594E98FC17;
	Thu, 24 Jul 2008 16:24:53 +0000 (UTC)
	(envelope-from kris@FreeBSD.org)
Message-ID: <4888ACD7.6010803@FreeBSD.org>
Date: Thu, 24 Jul 2008 18:24:55 +0200
From: Kris Kennaway <kris@FreeBSD.org>
User-Agent: Thunderbird 2.0.0.16 (Macintosh/20080707)
MIME-Version: 1.0
To: John Sullivan <john@basicnets.co.uk>
References: <854CADB9D95147CAB10BC35887A8E5DC@emea.hubersuhner.net><20080716031640.7DC744500E@ptavv.es.net><A6F1ACCEE35A4BC49FC9DFA561ED1131@emea.hubersuhner.net><62b856460807160743v3fce951eg1b2bd9e50a35ba1d@mail.gmail.com><BF6724CD748744908D602889CCF119F1@emea.hubersuhner.net><487E0D1B.2060902@FreeBSD.org>
	<20080716203900.5jt4qce17gg0og0o@mail.basicnets.co.uk>
	<A403B8D27BE048E79A94B09C0C520854@emea.hubersuhner.net>
In-Reply-To: <A403B8D27BE048E79A94B09C0C520854@emea.hubersuhner.net>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-stable@freebsd.org
Subject: Re: Fresh 7.0 Install: Fatal Trap 12 panic when put under load
List-Id: Production branch of FreeBSD source code <freebsd-stable.freebsd.org>

John Sullivan wrote:
>  
>>> Removing KDB_UNATTENDED from your kernel will allow you 
>>> to interact with the debugger and obtain backtraces etc, 
>>> which is useful when dumps are not being saved.
>> Easier said than done. This caused a few panics - no dumps 
>> though ...grrrr!!
>>
>> Still the same result ... the system seems to panic twice 
>> then hang.  I will keep trying unless you have some other ideas??
> 
> Right, after several days of trying, the system still just hung without letting me either get a dump or debug interactively in
> the failed state. I reverted to the GENERIC kernel, removed half the memory (2 of the 4 1GB sticks) and the system became
> stable.  I inserted 1 of the 2 removed sticks and all was fine.  I swapped that stick with the remaining stick and all was fine.  I
> put them both back in and started to see the crashes again, the first of which gave me this dump -->
> 
> server251# kgdb /boot/kernel/kernel /var/crash/vmcore.1
> [GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB.  Type "show warranty" for details.
> This GDB was configured as "amd64-marcel-freebsd".
> 
> Unread portion of the kernel message buffer:
> 
> 
> Fatal trap 12: page fault while in kernel mode
> cpuid = 1; apic id = 01
> fault virtual address    = 0xb0
> fault code        = supervisor read data, page not present
> instruction pointer    = 0x8:0xffffffff8068d4bd
> stack pointer            = 0x10:0xffffffffb20738e0
> frame pointer            = 0x10:0x0
> code segment        = base 0x0, limit 0xfffff, type 0x1b
>             = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags    = interrupt enabled, resume, IOPL = 0
> current process        = 72836 (objdump)
> trap number        = 12
> panic: page fault
> cpuid = 1
> Uptime: 28m4s
> Physical memory: 4082 MB
> Dumping 518 MB: 503 487 471 455 439 423 407 391 375 359 343 327 311 295 279 263 247 231 215 199 183 167 151 135 119 103 87 71 55 39 23 7
> 
> #0  doadump () at pcpu.h:194
> 194    pcpu.h: No such file or directory.
>     in pcpu.h
> (kgdb) backtrace
> #0  doadump () at pcpu.h:194
> #1  0x0000000000000004 in ?? ()
> #2  0xffffffff80477699 in boot (howto=260)
>     at /usr/src/sys/kern/kern_shutdown.c:409
> #3  0xffffffff80477a9d in panic (fmt=0x104 <Address 0x104 out of bounds>)
>     at /usr/src/sys/kern/kern_shutdown.c:563
> #4  0xffffffff8072ed44 in trap_fatal (frame=0xffffff003c39c000, 
>     eva=18446742974629017808) at /usr/src/sys/amd64/amd64/trap.c:724
> #5  0xffffffff8072f115 in trap_pfault (frame=0xffffffffb2073830, usermode=0)
>     at /usr/src/sys/amd64/amd64/trap.c:641
> #6  0xffffffff8072fa58 in trap (frame=0xffffffffb2073830)
>     at /usr/src/sys/amd64/amd64/trap.c:410
> #7  0xffffffff807156be in calltrap ()
>     at /usr/src/sys/amd64/amd64/exception.S:169
> #8  0xffffffff8068d4bd in vm_page_cache_remove (m=0xffffff00da9ec3b8)
>     at /usr/src/sys/vm/vm_page.c:896
> #9  0xffffffff8068e1b5 in vm_page_alloc (object=0xffffff00374ffc30, pindex=14, 
>     req=64) at /usr/src/sys/vm/vm_page.c:1080
> #10 0xffffffff8067fa77 in vm_fault (map=0xffffff0005f23d00, vaddr=34365804544, 
>     fault_type=1 '\001', fault_flags=0) at /usr/src/sys/vm/vm_fault.c:432
> #11 0xffffffff8072efaf in trap_pfault (frame=0xffffffffb2073c70, usermode=1)
>     at /usr/src/sys/amd64/amd64/trap.c:618
> #12 0xffffffff8072fbf8 in trap (frame=0xffffffffb2073c70)
>     at /usr/src/sys/amd64/amd64/trap.c:309
> #13 0xffffffff807156be in calltrap ()
>     at /usr/src/sys/amd64/amd64/exception.S:169
> #14 0x000000080059c54f in ?? ()
> Previous frame inner to this frame (corrupt stack?)
> 
> So, to answer your question of whether the backtraces are always the same: no, they are not.  But I am still confused as to what this means.
> 
> I would appreciate any further insight anyone can give.

That's another corrupted backtrace that doesn't point to an actual 
software problem.  Still sounds like bad RAM or other faulty hardware.
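
For anyone who wants to compare backtraces across several dumps, the
check can be scripted.  What follows is only a minimal sketch, not an
established tool: it assumes each kgdb session has been saved as plain
text, and the function names frame_functions and trap_site are made up
for illustration.  It pulls the function name out of every "#N" frame
line and reports the frame just below calltrap, i.e. the code that was
running when the trap was taken.

```python
import re

# Matches kgdb frame lines such as
#   "#8  0xffffffff8068d4bd in vm_page_cache_remove (m=...)"
#   "#0  doadump () at pcpu.h:194"
FRAME_RE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-fA-F]+ in )?(\S+) \(")


def frame_functions(kgdb_text):
    """Return the function names of a kgdb backtrace, innermost first."""
    names = []
    for line in kgdb_text.splitlines():
        # Drop mail quote markers and indentation so quoted output parses too.
        line = line.lstrip("> ").rstrip()
        match = FRAME_RE.match(line)
        if not match:
            continue
        if match.group(1) == "0":
            # kgdb prints frame #0 once at startup and again for "backtrace";
            # restart so the list holds a single copy of the trace.
            names = []
        names.append(match.group(2))
    return names


def trap_site(names):
    """Return the function just below calltrap, or None if there is none."""
    if "calltrap" in names:
        index = names.index("calltrap")
        if index + 1 < len(names):
            return names[index + 1]
    return None
```

Running trap_site(frame_functions(text)) over each saved backtrace and
comparing the results shows at a glance whether the crashes land in
the same function.  Names that differ from dump to dump are consistent
with a hardware fault, although that alone does not prove it.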

Kris