From owner-freebsd-stable@FreeBSD.ORG  Thu Jul 24 16:16:16 2008
Return-Path: <owner-freebsd-stable@FreeBSD.ORG>
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 8ED191065684;
	Thu, 24 Jul 2008 16:16:16 +0000 (UTC)
	(envelope-from john@basicnets.co.uk)
Received: from server252.basicnets.co.uk (server8.basicnets.co.uk [81.6.221.8])
	by mx1.freebsd.org (Postfix) with ESMTP id 3E22D8FC1D;
	Thu, 24 Jul 2008 16:16:16 +0000 (UTC)
	(envelope-from john@basicnets.co.uk)
Received: from [195.224.14.210] (helo=UKBIM1344)
	by server252.basicnets.co.uk with esmtpa (Exim 4.69 (FreeBSD))
	(envelope-from <john@basicnets.co.uk>)
	id 1KM3Tv-0004yy-MU; Thu, 24 Jul 2008 17:15:56 +0100
From: "John Sullivan" <john@basicnets.co.uk>
To: "'Kris Kennaway'" <kris@FreeBSD.org>,
	<freebsd-stable@freebsd.org>
References: <854CADB9D95147CAB10BC35887A8E5DC@emea.hubersuhner.net><20080716031640.7DC744500E@ptavv.es.net><A6F1ACCEE35A4BC49FC9DFA561ED1131@emea.hubersuhner.net><62b856460807160743v3fce951eg1b2bd9e50a35ba1d@mail.gmail.com><BF6724CD748744908D602889CCF119F1@emea.hubersuhner.net><487E0D1B.2060902@FreeBSD.org>
	<20080716203900.5jt4qce17gg0og0o@mail.basicnets.co.uk>
Date: Thu, 24 Jul 2008 17:15:56 +0100
Message-ID: <A403B8D27BE048E79A94B09C0C520854@emea.hubersuhner.net>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
X-Mailer: Microsoft Office Outlook 11
In-Reply-To: <20080716203900.5jt4qce17gg0og0o@mail.basicnets.co.uk>
X-MIMEOLE: Produced By Microsoft MimeOLE V6.00.2900.5512
Thread-Index: Acjne7VfokRPFkvRRdS4MNKSKMJfmQGK1DXQ
Cc: 
Subject: RE: Fresh 7.0 Install: Fatal Trap 12 panic when put under load
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code <freebsd-stable.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>, 
	<mailto:freebsd-stable-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-stable>
List-Post: <mailto:freebsd-stable@freebsd.org>
List-Help: <mailto:freebsd-stable-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>,
	<mailto:freebsd-stable-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 24 Jul 2008 16:16:16 -0000

>> Removing KDB_UNATTENDED from your kernel will allow you
>> to interact with the debugger and obtain backtraces etc,
>> which is useful when dumps are not being saved.
>
> Easier said than done, this caused a few panics - no dumps
> though ...grrrr!!
>
> Still the same result ... the system seems to panic twice
> then hang.  I will keep trying unless you have some other ideas??

Right, after several days of trying, the system still just hung without
letting me either get a dump or debug interactively in the failed state.
So I reverted to the GENERIC kernel, removed half the memory (2 of the
4 1GB sticks), and the system became stable.  I inserted 1 of the 2 removed
sticks and all was fine.  I swapped that stick with the remaining stick and
all was fine.  I put them both back in and started to see the crashes again,
the first of which gave me this dump -->

server251# kgdb /boot/kernel/kernel /var/crash/vmcore.1
[GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd".

Unread portion of the kernel message buffer:


Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address   = 0xb0
fault code              = supervisor read data, page not present
instruction pointer     = 0x8:0xffffffff8068d4bd
stack pointer           = 0x10:0xffffffffb20738e0
frame pointer           = 0x10:0x0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 72836 (objdump)
trap number             = 12
panic: page fault
cpuid = 1
Uptime: 28m4s
Physical memory: 4082 MB
Dumping 518 MB: 503 487 471 455 439 423 407 391 375 359 343 327 311 295 279 263 247 231 215 199 183 167 151 135 119 103 87 71 55 39 23 7

#0  doadump () at pcpu.h:194
194    pcpu.h: No such file or directory.
    in pcpu.h
(kgdb) backtrace
#0  doadump () at pcpu.h:194
#1  0x0000000000000004 in ?? ()
#2  0xffffffff80477699 in boot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:409
#3  0xffffffff80477a9d in panic (fmt=0x104 <Address 0x104 out of bounds>)
    at /usr/src/sys/kern/kern_shutdown.c:563
#4  0xffffffff8072ed44 in trap_fatal (frame=0xffffff003c39c000,
    eva=18446742974629017808) at /usr/src/sys/amd64/amd64/trap.c:724
#5  0xffffffff8072f115 in trap_pfault (frame=0xffffffffb2073830, usermode=0)
    at /usr/src/sys/amd64/amd64/trap.c:641
#6  0xffffffff8072fa58 in trap (frame=0xffffffffb2073830)
    at /usr/src/sys/amd64/amd64/trap.c:410
#7  0xffffffff807156be in calltrap ()
    at /usr/src/sys/amd64/amd64/exception.S:169
#8  0xffffffff8068d4bd in vm_page_cache_remove (m=0xffffff00da9ec3b8)
    at /usr/src/sys/vm/vm_page.c:896
#9  0xffffffff8068e1b5 in vm_page_alloc (object=0xffffff00374ffc30, pindex=14,
    req=64) at /usr/src/sys/vm/vm_page.c:1080
#10 0xffffffff8067fa77 in vm_fault (map=0xffffff0005f23d00, vaddr=34365804544,
    fault_type=1 '\001', fault_flags=0) at /usr/src/sys/vm/vm_fault.c:432
#11 0xffffffff8072efaf in trap_pfault (frame=0xffffffffb2073c70, usermode=1)
    at /usr/src/sys/amd64/amd64/trap.c:618
#12 0xffffffff8072fbf8 in trap (frame=0xffffffffb2073c70)
    at /usr/src/sys/amd64/amd64/trap.c:309
#13 0xffffffff807156be in calltrap ()
    at /usr/src/sys/amd64/amd64/exception.S:169
#14 0x000000080059c54f in ?? ()
Previous frame inner to this frame (corrupt stack?)

So, to answer your question of whether the backtraces are always the same:
no, they are not.  But I am still confused as to what this means.
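
For comparing backtraces between dumps, here is a minimal sketch. It assumes
each kgdb backtrace has been saved as plain text with the mail line wraps
removed. The names frame_functions and same_path are hypothetical helpers,
not part of kgdb or FreeBSD.

```python
import re

# Matches kgdb frame lines such as
#   "#8  0xffffffff8068d4bd in vm_page_cache_remove (m=0x...)"
#   "#0  doadump () at pcpu.h:194"
# The address part is optional because frame #0 is printed without one.
FRAME_RE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-fA-F]+ in )?(\S+) \(")


def frame_functions(backtrace_text):
    """Return the function name of each frame, in frame order."""
    names = []
    for line in backtrace_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            names.append(match.group(2))
    return names


def same_path(trace_a, trace_b):
    """True if two backtraces pass through the same functions in order."""
    return frame_functions(trace_a) == frame_functions(trace_b)
```

Only function names are compared, so differences in argument values or
addresses between two dumps are ignored.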

I would appreciate any further insight anyone can give.

Thanks

John