From owner-freebsd-stable@FreeBSD.ORG  Thu Aug 17 18:25:32 2006
Return-Path: <owner-freebsd-stable@FreeBSD.ORG>
X-Original-To: freebsd-stable@freebsd.org
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id C79C416A5F9
	for <freebsd-stable@freebsd.org>; Thu, 17 Aug 2006 18:25:32 +0000 (UTC)
	(envelope-from jhb@freebsd.org)
Received: from server.baldwin.cx (66-23-211-162.clients.speedfactory.net
	[66.23.211.162])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 4F70643F5F
	for <freebsd-stable@freebsd.org>; Thu, 17 Aug 2006 18:24:27 +0000 (GMT)
	(envelope-from jhb@freebsd.org)
Received: from localhost.corp.yahoo.com (john@localhost [127.0.0.1])
	(authenticated bits=0)
	by server.baldwin.cx (8.13.6/8.13.6) with ESMTP id k7HIO7xP059754;
	Thu, 17 Aug 2006 14:24:13 -0400 (EDT) (envelope-from jhb@freebsd.org)
From: John Baldwin <jhb@freebsd.org>
To: Peter van Heusden <pvh@wfeet.za.net>
Date: Thu, 17 Aug 2006 13:24:48 -0400
User-Agent: KMail/1.9.1
References: <44DED670.9050601@wfeet.za.net>
	<200608141430.38015.jhb@freebsd.org>
	<44E426E5.2000703@wfeet.za.net>
In-Reply-To: <44E426E5.2000703@wfeet.za.net>
MIME-Version: 1.0
Content-Type: text/plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Message-Id: <200608171324.48815.jhb@freebsd.org>
X-Greylist: Sender succeeded SMTP AUTH authentication, not delayed by
	milter-greylist-2.0.2 (server.baldwin.cx [127.0.0.1]);
	Thu, 17 Aug 2006 14:24:13 -0400 (EDT)
X-Virus-Scanned: ClamAV 0.88.3/1677/Thu Aug 17 09:56:09 2006 on
	server.baldwin.cx
X-Virus-Status: Clean
X-Spam-Status: No, score=-4.4 required=4.2 tests=ALL_TRUSTED,AWL,BAYES_00 
	autolearn=ham version=3.1.3
X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on server.baldwin.cx
Cc: freebsd-stable@freebsd.org
Subject: Re: Unexplained kernel panic on 5-STABLE (now in 6-STABLE)
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code <freebsd-stable.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>, 
	<mailto:freebsd-stable-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-stable>
List-Post: <mailto:freebsd-stable@freebsd.org>
List-Help: <mailto:freebsd-stable-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>,
	<mailto:freebsd-stable-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 17 Aug 2006 18:25:32 -0000

On Thursday 17 August 2006 04:20, Peter van Heusden wrote:
> Thanks for the advice John. I upgraded to 6-STABLE and just got a kernel
> panic again. Before I list the dump, I'd like to mention two messages I
> see in my syslog. Firstly, often I get something like this:
> 
> kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 151698,
> size: 28672
> 
> (though the last message like this was at 9:18 this morning and the
> kernel panicked at about 10:04)
> 
> and secondly, during boot, I get messages like this:
> 
> kernel: acpi: bad read from port 0xcfc (32)
> kernel: acpi: bad write to port 0xcf8 (32), val 0x80002084
> 
> Anyway, on with the kgdb output:
> 
> leftside# kgdb kernel.debug /var/crash/vmcore.27
> [GDB will not be able to debug user-mode threads:
> /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain
> conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB.  Type "show warranty" for details.
> This GDB was configured as "i386-marcel-freebsd".
> 
> Unread portion of the kernel message buffer:
> 
> 
> Fatal trap 12: page fault while in kernel mode
> fault virtual address   = 0x1b
> fault code              = supervisor write, page not present
> instruction pointer     = 0x20:0xc08a98ca
> stack pointer           = 0x28:0xcbdd4cc4
> frame pointer           = 0x28:0xcbdd4ce0
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, def32 1, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 34 (pagedaemon)
> trap number             = 12
> panic: page fault
> Uptime: 19h6m20s
> Dumping 251 MB (2 chunks)
>   chunk 0: 1MB (160 pages) ... ok
>   chunk 1: 251MB (64252 pages) 236 220 204 188 172 156 140 124 108 92 76
> 60 44 28 12
> 
> #0  doadump () at pcpu.h:165
> 165             __asm __volatile("movl %%fs:0,%0" : "=r" (td));
> (kgdb) list *0xc08a98ca
> 0xc08a98ca is in pmap_ts_referenced (atomic.h:149).
> 144     static __inline int
> 145     atomic_cmpset_int(volatile u_int *dst, u_int exp, u_int src)
> 146     {
> 147             int res = exp;
> 148    
> 149             __asm __volatile (
> 150             "       " __XSTRING(MPLOCKED) " "
> 151             "       cmpxchgl %2,%1 ;        "
> 152             "       setz    %%al ;          "
> 153             "       movzbl  %%al,%0 ;       "
> (kgdb) backtrace
> #0  doadump () at pcpu.h:165
> #1  0xc069cee6 in boot (howto=260) at
> /usr/freebsd6/src/sys/kern/kern_shutdown.c:409
> #2  0xc069d17c in panic (fmt=0xc0903b6f "%s") at
> /usr/freebsd6/src/sys/kern/kern_shutdown.c:565
> #3  0xc08ac6d4 in trap_fatal (frame=0xcbdd4c84, eva=27) at
> /usr/freebsd6/src/sys/i386/i386/trap.c:836
> #4  0xc08ac43b in trap_pfault (frame=0xcbdd4c84, usermode=0, eva=27) at
> /usr/freebsd6/src/sys/i386/i386/trap.c:744
> #5  0xc08ac079 in trap (frame=
>       {tf_fs = -874708984, tf_es = -1064697816, tf_ds = -1063452632,
> tf_edi = -1054911176, tf_esi = 4, tf_ebp = -874689312, tf_isp =
> -874689360, tf_ebx = -1050447872, tf_edx = -1, tf_ecx = -1038927232,
> tf_eax = 4, tf_trapno = 12, tf_err = 2, tf_eip = -1064658742, tf_cs =
> 32, tf_eflags = 66050, tf_esp = -874689320, tf_ss = 0}) at
> /usr/freebsd6/src/sys/i386/i386/trap.c:434
> #6  0xc089a7fa in calltrap () at
> /usr/freebsd6/src/sys/i386/i386/exception.s:139
> #7  0xc08a98ca in pmap_ts_referenced (m=0xc11f5538) at atomic.h:149

I think for this one you will want to ask alc@ if he has any ideas.
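
As a reading aid for frame #5 above: kgdb prints the trap frame registers
there as signed decimals, which makes them hard to match against the hex
values in the panic message. Below is a minimal Python sketch of the
conversion, plus a decode of the low three bits of the i386 page-fault
error code. The helper names u32 and decode_pf_err are made up for
illustration and are not part of kgdb or the FreeBSD sources; the input
values are copied from the quoted backtrace.

```python
# Reading aid only: u32 and decode_pf_err are hypothetical helper names,
# not kgdb commands or FreeBSD kernel functions.

def u32(value):
    """Reinterpret a signed 32-bit integer as an unsigned one."""
    return value & 0xFFFFFFFF


def decode_pf_err(err):
    """Decode the low three bits of an i386 page-fault error code."""
    return {
        "present": bool(err & 0x1),  # False means the page was not present
        "write": bool(err & 0x2),    # True means the access was a write
        "user": bool(err & 0x4),     # False means supervisor (kernel) mode
    }


# Values copied from frame #5 of the quoted backtrace.
frame = {"tf_eip": -1064658742, "tf_ebp": -874689312, "tf_err": 2}

print(hex(u32(frame["tf_eip"])))
print(hex(u32(frame["tf_ebp"])))
print(decode_pf_err(frame["tf_err"]))
```

With these inputs tf_eip comes out as 0xc08a98ca and tf_ebp as 0xcbdd4ce0,
the same instruction pointer and frame pointer reported in the panic
message, and tf_err = 2 decodes to a supervisor write to a page that is
not present.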

> #8  0xc0815e59 in vm_pageout_page_stats () at
> /usr/freebsd6/src/sys/vm/vm_pageout.c:1401
> #9  0xc0816192 in vm_pageout () at
> /usr/freebsd6/src/sys/vm/vm_pageout.c:1546
> #10 0xc0687434 in fork_exit (callout=0xc0815ef8 <vm_pageout>, arg=0x0,
> frame=0xcbdd4d38) at /usr/freebsd6/src/sys/kern/kern_fork.c:805
> #11 0xc089a85c in fork_trampoline () at
> /usr/freebsd6/src/sys/i386/i386/exception.s:208
> 
> Thanks for all the help,
> Peter
> 
> John Baldwin wrote:
> > On Monday 14 August 2006 14:16, Peter van Heusden wrote:
> >   
> >> Thanks. That gives the following output:
> >>
> >> #9  0xc0801295 in trap_pfault (frame=0xd1231b68, usermode=0x0, eva=0x3)
> >> at /usr/src/sys/i386/i386/trap.c:714
> >> #10 0xc0800fa5 in trap (frame=
> >>       {tf_fs = 0xc1e20018, tf_es = 0x10, tf_ds = 0x10, tf_edi = 0x0,
> >> tf_esi = 0xc1045420, tf_ebp = 0xd1231bb8, tf_isp = 0xd1231b94, tf_ebx =
> >> 0xc1045458, tf_edx = 0xffffffff, tf_ecx = 0xc28cf000, tf_eax =
> >> 0xc1045434, tf_trapno = 0xc, tf_err = 0x2, tf_eip = 0xc07b6bbe, tf_cs =
> >> 0x8, tf_eflags = 0x10286, tf_esp = 0xc102dd08, tf_ss = 0xc1edf570})
> >>     at /usr/src/sys/i386/i386/trap.c:427
> >> #11 0xc07f0eea in calltrap () at /usr/src/sys/i386/i386/exception.s:140
> >> ...
> >> #25 0xc07b6bbe in uma_zalloc_arg (zone=0xc1045420, udata=0x0, flags=0x1)
> >> at /usr/src/sys/vm/uma_core.c:1895
> >> #26 0xc07fc97f in get_pv_entry () at uma.h:276
> >>     
> >
> > So it got a nested page fault inside the VM, basically.
> >
> >   
> >> Does this help? It seems that fork() was called, and then something went
> >> wrong from there. One common feature of these panics seems to be that
> >> they happen when my server (an aging P3 700 MHz with 256 MB of RAM that
> >> is put to use for all sorts of network services) is under quite heavy
> >> load.
> >>     
> >
> > To be honest, I'd update it to 6.1 or even 6-stable as this is likely
> > already fixed in 6.x.
> >
> >   
> 
> 

-- 
John Baldwin