From owner-freebsd-current@FreeBSD.ORG  Mon Aug  4 21:57:34 2008
Return-Path: <owner-freebsd-current@FreeBSD.ORG>
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 0B7D61065682
	for <freebsd-current@freebsd.org>; Mon,  4 Aug 2008 21:57:34 +0000 (UTC)
	(envelope-from jhb@freebsd.org)
Received: from server.baldwin.cx (bigknife-pt.tunnel.tserv9.chi1.ipv6.he.net
	[IPv6:2001:470:1f10:75::2])
	by mx1.freebsd.org (Postfix) with ESMTP id A1C008FC21
	for <freebsd-current@freebsd.org>; Mon,  4 Aug 2008 21:57:33 +0000 (UTC)
	(envelope-from jhb@freebsd.org)
Received: from localhost.corp.yahoo.com (john@localhost [IPv6:::1])
	(authenticated bits=0)
	by server.baldwin.cx (8.14.2/8.14.2) with ESMTP id m74LvQaM084511;
	Mon, 4 Aug 2008 17:57:27 -0400 (EDT) (envelope-from jhb@freebsd.org)
From: John Baldwin <jhb@freebsd.org>
To: freebsd-current@freebsd.org
Date: Mon, 4 Aug 2008 16:07:55 -0400
User-Agent: KMail/1.9.7
References: <20080730113449.GD407@cdnetworks.co.kr>
	<20080804010205.GA21401@cdnetworks.co.kr>
	<20080804182919.GB1480@roadrunner.spoerlein.net>
In-Reply-To: <20080804182919.GB1480@roadrunner.spoerlein.net>
MIME-Version: 1.0
Content-Type: text/plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Message-Id: <200808041607.56160.jhb@freebsd.org>
Cc: Pyun YongHyeon <pyunyh@gmail.com>
Subject: Re: Call for bfe(4) testers.
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
	<freebsd-current.freebsd.org>
X-List-Received-Date: Mon, 04 Aug 2008 21:57:34 -0000

On Monday 04 August 2008 02:29:19 pm Ulrich Spoerlein wrote:
> On Mon, 04.08.2008 at 10:02:05 +0900, Pyun YongHyeon wrote:
> > On Sun, Aug 03, 2008 at 12:56:27PM +0200, Ulrich Spoerlein wrote:
> > > no toe capability on 0xc40abc00
> > > 
> > > messages, but they don't seem to be the culprit. The stats sysctl also works
> > 
> > I think kmacy@ fixed this. Please update again.
> 
> I will, as I still get the panics with your patches backed out.
> 
> > > Fatal trap 12: page fault while in kernel mode
> > > cpuid = 0; apic id = 00
> > > fault virtual address   = 0x38
> > > fault code              = supervisor read, page not present
> > > instruction pointer     = 0x20:0xc058ec16
> > > stack pointer           = 0x28:0xfb7b6ac8
> > > frame pointer           = 0x28:0xfb7b6ac8
> > > code segment            = base 0x0, limit 0xfffff, type 0x1b
> > >                         = DPL 0, pres 1, def32 1, gran 1
> > > processor eflags        = interrupt enabled, resume, IOPL = 0
> > > current process         = 1327 (powerd)
> > > 
> > 
> > This and the fault address 0x38 above suggest that cpufreq(4)
> > dereferenced a NULL pointer. It seems powerd(8) tried to set the CPU
> > frequency and hit a page fault. A full backtrace would be a
> > great help.
> 
> The kdb.enter.panic script is not called when panicking due to a page
> fault. Knowing this, I do have a backtrace handy:
> 
> Fatal trap 12: page fault while in kernel mode
> cpuid = 0; apic id = 00
> fault virtual address   = 0x38
> fault code              = supervisor read, page not present
> instruction pointer     = 0x20:0xc058ec16
> stack pointer           = 0x28:0xfb8b8ac8
> frame pointer           = 0x28:0xfb8b8ac8
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, def32 1, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 1176 (powerd)
> db:0:kdb.enter.default>  show pcpu
> cpuid        = 0
> curthread    = 0xc4ec0aa0: pid 1176 "powerd"
> curpcb       = 0xfb8b8d90
> fpcurthread  = none
> idlethread   = 0xc3f80cc0: pid 10 "idle: cpu0"
> APIC ID      = 0
> currentldt   = 0x50
> db:0:kdb.enter.default>  bt
> Tracing pid 1176 tid 100103 td 0xc4ec0aa0
> device_is_attached(0,c87e6b40,fb8b8afc,0,101,...) at device_is_attached+0x6
> cf_set_method(c420b600,c87e6b40,64,fb8b8ba4,c87e33b4,...) at cf_set_method+0x6a3
> cpufreq_curr_sysctl(c420d840,c4207000,0,fb8b8ba4,fb8b8ba4,...) at cpufreq_curr_sysctl+0x232
> sysctl_root(fb8b8ba4,4,1,c4ec0aa0,c4501d38,...) at sysctl_root+0x137
> userland_sysctl(c4ec0aa0,fb8b8c14,4,0,0,...) at userland_sysctl+0x151
> __sysctl(c4ec0aa0,fb8b8cfc,18,fb8b8ca0,46,...) at __sysctl+0xec
> syscall(fb8b8d38) at syscall+0x345
> Xint0x80_syscall() at Xint0x80_syscall+0x20
> --- syscall (202, FreeBSD ELF32, __sysctl), eip = 0x28161bd3, esp = 0xbfbfe8cc, ebp = 0xbfbfe8f8 ---
> db:0:kdb.enter.default>  capture off
> 
> Seems like I caught RELENG_7 during a bad time. Will update again.

What cpufreq drivers do you have loaded and attached?  The backtrace shows
device_is_attached() being called with a NULL device from cf_set_method().
This patch might work around the issue, but I suspect there is a bug in one
of the cpufreq drivers.

Index: kern_cpu.c
===================================================================
RCS file: /usr/cvs/src/sys/kern/kern_cpu.c,v
retrieving revision 1.27.2.2
diff -u -r1.27.2.2 kern_cpu.c
--- kern_cpu.c  9 May 2008 19:02:10 -0000       1.27.2.2
+++ kern_cpu.c  4 Aug 2008 20:07:41 -0000
@@ -329,6 +329,8 @@
        /* Next, set any/all relative frequencies via their drivers. */
        for (i = 0; i < level->rel_count; i++) {
                set = &level->rel_set[i];
+               if (set->dev == NULL)
+                       continue;
                if (!device_is_attached(set->dev)) {
                        error = ENXIO;
                        goto out;
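
To see what the added guard does without booting a kernel, here is a minimal
userland sketch of the same loop.  Everything named fake_* is a hypothetical
stand-in invented for illustration, not the real kern_cpu.c or bus code.  It
assumes, as the backtrace suggests, that the attached check dereferences the
device pointer without testing it for NULL.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures (illustration only). */
struct fake_device {
	int attached;			/* non-zero once the driver attached */
};

struct fake_setting {
	struct fake_device *dev;	/* NULL if no driver filled it in */
	int freq;
};

#define FAKE_MAX_SETTINGS 4

struct fake_level {
	int rel_count;
	struct fake_setting rel_set[FAKE_MAX_SETTINGS];
};

/* Assumed to behave like the kernel check: no NULL test on dev. */
static int
fake_device_is_attached(const struct fake_device *dev)
{
	return (dev->attached);
}

/*
 * Walk the relative settings the way the patched loop does.
 * Returns 0 on success or ENXIO if a driver is present but not attached;
 * *applied counts the settings that passed the attached check.
 */
static int
fake_set_relative(const struct fake_level *level, int *applied)
{
	const struct fake_setting *set;
	int i;

	assert(level != NULL && applied != NULL);
	*applied = 0;
	for (i = 0; i < level->rel_count; i++) {
		set = &level->rel_set[i];
		/* The guard from the patch: skip entries with no device. */
		if (set->dev == NULL)
			continue;
		if (!fake_device_is_attached(set->dev))
			return (ENXIO);
		(*applied)++;
	}
	return (0);
}
```

Without the NULL test, the first entry with an empty dev would fault inside
fake_device_is_attached(), which mirrors the panic above.  Note that the
guard only skips the bad entry; it does not explain how the entry ended up
with a NULL device in the first place.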

-- 
John Baldwin