From owner-freebsd-current@FreeBSD.ORG Mon Aug 4 21:57:34 2008
From: John Baldwin
To: freebsd-current@freebsd.org
Cc: Pyun YongHyeon
Date: Mon, 4 Aug 2008 16:07:55 -0400
Subject: Re: Call for bfe(4) testers.
Message-Id: <200808041607.56160.jhb@freebsd.org>
In-Reply-To: <20080804182919.GB1480@roadrunner.spoerlein.net>
References: <20080730113449.GD407@cdnetworks.co.kr> <20080804010205.GA21401@cdnetworks.co.kr> <20080804182919.GB1480@roadrunner.spoerlein.net>
List-Id: Discussions about the use of FreeBSD-current

On Monday 04 August 2008 02:29:19 pm Ulrich Spoerlein wrote:
> On Mon, 04.08.2008 at 10:02:05 +0900, Pyun YongHyeon wrote:
> > On Sun, Aug 03, 2008 at 12:56:27PM +0200, Ulrich Spoerlein wrote:
> > > no toe capability on 0xc40abc00
> > >
> > > messages, but they don't seem the culprit. The stats sysctl also works
> >
> > I think kmacy@ fixed this. Please update again.
>
> I will, as I still get the panics with your patches backed out.
>
> > > Fatal trap 12: page fault while in kernel mode
> > > cpuid = 0; apic id = 00
> > > fault virtual address = 0x38
> > > fault code = supervisor read, page not present
> > > instruction pointer = 0x20:0xc058ec16
> > > stack pointer = 0x28:0xfb7b6ac8
> > > frame pointer = 0x28:0xfb7b6ac8
> > > code segment = base 0x0, limit 0xfffff, type 0x1b
> > > = DPL 0, pres 1, def32 1, gran 1
> > > processor eflags = interrupt enabled, resume, IOPL = 0
> > > current process = 1327 (powerd)
> > >
> >
> > From this and the fault address 0x38 above suggests cpufreq(4)
> > dereferenced a NULL pointer. It seems powered(4) tried to set CPU
> > frequency and encountered page fault. Full backtrace would be
> > great help.
>
> The kdb.enter.panic script is not called when panicking due to a page
> fault.
> Knowing this, I do have a backtrace handy:
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 0; apic id = 00
> fault virtual address = 0x38
> fault code = supervisor read, page not present
> instruction pointer = 0x20:0xc058ec16
> stack pointer = 0x28:0xfb8b8ac8
> frame pointer = 0x28:0xfb8b8ac8
> code segment = base 0x0, limit 0xfffff, type 0x1b
> = DPL 0, pres 1, def32 1, gran 1
> processor eflags = interrupt enabled, resume, IOPL = 0
> current process = 1176 (powerd)
> db:0:kdb.enter.default> show pcpu
> cpuid = 0
> curthread = 0xc4ec0aa0: pid 1176 "powerd"
> curpcb = 0xfb8b8d90
> fpcurthread = none
> idlethread = 0xc3f80cc0: pid 10 "idle: cpu0"
> APIC ID = 0
> currentldt = 0x50
> db:0:kdb.enter.default> bt
> Tracing pid 1176 tid 100103 td 0xc4ec0aa0
> device_is_attached(0,c87e6b40,fb8b8afc,0,101,...) at device_is_attached+0x6
> cf_set_method(c420b600,c87e6b40,64,fb8b8ba4,c87e33b4,...) at cf_set_method+0x6a3
> cpufreq_curr_sysctl(c420d840,c4207000,0,fb8b8ba4,fb8b8ba4,...) at cpufreq_curr_sysctl+0x232
> sysctl_root(fb8b8ba4,4,1,c4ec0aa0,c4501d38,...) at sysctl_root+0x137
> userland_sysctl(c4ec0aa0,fb8b8c14,4,0,0,...) at userland_sysctl+0x151
> __sysctl(c4ec0aa0,fb8b8cfc,18,fb8b8ca0,46,...) at __sysctl+0xec
> syscall(fb8b8d38) at syscall+0x345
> Xint0x80_syscall() at Xint0x80_syscall+0x20
> --- syscall (202, FreeBSD ELF32, __sysctl), eip = 0x28161bd3, esp = 0xbfbfe8cc, ebp = 0xbfbfe8f8 ---
> db:0:kdb.enter.default> capture off
>
> Seems like I caught RELENG_7 during a bad time. Will update again.

What cpufreq drivers do you have loaded and attached? This patch might work
around the issue, but I suspect there is a bug in one of the cpufreq drivers.

Index: kern_cpu.c
===================================================================
RCS file: /usr/cvs/src/sys/kern/kern_cpu.c,v
retrieving revision 1.27.2.2
diff -u -r1.27.2.2 kern_cpu.c
--- kern_cpu.c	9 May 2008 19:02:10 -0000	1.27.2.2
+++ kern_cpu.c	4 Aug 2008 20:07:41 -0000
@@ -329,6 +329,8 @@
 	/* Next, set any/all relative frequencies via their drivers. */
 	for (i = 0; i < level->rel_count; i++) {
 		set = &level->rel_set[i];
+		if (set->dev == NULL)
+			continue;
 		if (!device_is_attached(set->dev)) {
 			error = ENXIO;
 			goto out;

-- 
John Baldwin