From owner-freebsd-stable@FreeBSD.ORG Mon Apr 23 17:12:31 2007
X-Original-To: freebsd-stable@freebsd.org
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
    by hub.freebsd.org (Postfix) with ESMTP id 7C82D16A400;
    Mon, 23 Apr 2007 17:12:31 +0000 (UTC)
    (envelope-from mike@jellydonut.org)
Received: from mail2.secureworks.net (mail2.secureworks.net [65.114.32.154])
    by mx1.freebsd.org (Postfix) with ESMTP id 4063F13C46E;
    Mon, 23 Apr 2007 17:12:31 +0000 (UTC)
    (envelope-from mike@jellydonut.org)
Received: from localhost (localhost [127.0.0.1])
    by mail2.secureworks.net (Postfix) with ESMTP id AC0A3173F3;
    Mon, 23 Apr 2007 13:12:30 -0400 (EDT)
X-Virus-Scanned: amavisd-new at secureworks.net
Received: from mail2.secureworks.net ([127.0.0.1])
    by localhost (mail2.secureworks.net [127.0.0.1]) (amavisd-new, port 10024)
    with ESMTP id bX1ru37ZPrZx;
    Mon, 23 Apr 2007 13:12:30 -0400 (EDT)
Received: from [192.168.23.35] (mole1.secureworks.net [63.239.86.3])
    by mail2.secureworks.net (Postfix) with ESMTP id 77F08173ED;
    Mon, 23 Apr 2007 13:12:30 -0400 (EDT)
Message-ID: <462CE8FE.8060807@jellydonut.org>
Date: Mon, 23 Apr 2007 13:12:30 -0400
From: Michael Proto
User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.8.0.10)
    Gecko/20070306 Thunderbird/1.5.0.10 Mnenhy/0.7.5.666
MIME-Version: 1.0
To: freebsd-stable@freebsd.org
References: <462CA594.5000904@tomjudge.com>
    <20070423155728.GC1006@xor.obsecurity.org>
In-Reply-To: <20070423155728.GC1006@xor.obsecurity.org>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Subject: Re: 6.2-STABLE (i386) Repeating crash (supervisor read, page not present)
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code
X-List-Received-Date: Mon, 23 Apr 2007 17:12:31 -0000

Kris Kennaway wrote:
>
> On Mon, Apr 23, 2007 at 01:24:52PM +0100, Tom Judge wrote:
>> Hi,
>>
>> Recently I have noticed that one of our Dell PE1950's has been crashing
>> a lot with the following reason "supervisor read, page not present".
>>
>> The system runs 6.2 Release under i386.
>>
>> I have attached 2 back traces, and I still have both cores if any more
>> information is required. Any light that can be shed on this problem
>> would be greatly appreciated.
>>
>> Tom
>>
>> ===========
>>
>> uname -a
>> FreeBSD narthex.mintel.co.uk 6.2-RELEASE FreeBSD 6.2-RELEASE #0: Mon
>> Apr 2 20:13:11 BST 2007
>> root@bob.mintel.co.uk:/usr/obj/usr/src/sys/PE1950 i386
>>
>>
>> ## Core 1
>>
>> root@narthex '13:14:47' '/home/london/tj'
>>> $ kgdb /usr/obj/usr/src/sys/PE1950/kernel.debug /var/crash/vmcore.1
>> [GDB will not be able to debug user-mode threads:
>> /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
>> GNU gdb 6.1.1 [FreeBSD]
>> Copyright 2004 Free Software Foundation, Inc.
>> GDB is free software, covered by the GNU General Public License, and you are
>> welcome to change it and/or distribute copies of it under certain
>> conditions.
>> Type "show copying" to see the conditions.
>> There is absolutely no warranty for GDB. Type "show warranty" for details.
>> This GDB was configured as "i386-marcel-freebsd".
>>
>> Unread portion of the kernel message buffer:
>>
>>
>> Fatal trap 12: page fault while in kernel mode
>> cpuid = 0; apic id = 00
>> fault virtual address = 0x100005c
>> fault code = supervisor read, page not present
>> instruction pointer = 0x20:0xc05df61f
>> stack pointer = 0x28:0xe4f63c30
>> frame pointer = 0x28:0xe4f63c90
>> code segment = base 0x0, limit 0xfffff, type 0x1b
>>              = DPL 0, pres 1, def32 1, gran 1
>> processor eflags = interrupt enabled, resume, IOPL = 0
>> current process = 12 (swi1: net)
>> trap number = 12
>> panic: page fault
>> cpuid = 0
>> Uptime: 1h25m33s
>> Dumping 2047 MB (2 chunks)
>> chunk 0: 1MB (159 pages) ... ok
>> chunk 1: 2047MB (523944 pages) 2031 2015 1999 1983 1967 1951 1935 1919
>> 1903 1887
>> <7>arp_rtrequest: bad gateway 172.31.1.1 (!AF_LINK)
>> <7>arp_rtrequest: bad gateway 172.31.0.1 (!AF_LINK)
>
> You might be hitting a bug in an obscure code path because of the
> above errors. I'm CC'ing someone who might be able to help.
>
> Kris
>

Bear in mind that Dell released an "urgent" firmware update last week
for 1950, 1955, and 2950 systems that is supposed to fix some
data-corruption issues related to dual-core processors. I don't know if
this problem is a symptom of that, but it is strongly suggested that you
apply the firmware update regardless.

-Proto