From owner-freebsd-stable@FreeBSD.ORG Tue Nov 25 11:35:25 2008
From: Rory Arms
To: kensmith Smith
Cc: FreeBSD-stable
Subject: Re: R: Re: 6.4-RC2 crashes after a few minutes of uptime
Date: Tue, 25 Nov 2008 06:35:10 -0500
In-Reply-To: <5113227.789231227483399867.JavaMail.defaultUser@defaultHost>
X-List-Received-Date: Tue, 25 Nov 2008 11:35:25 -0000

Ken,

I built a GENERIC debug kernel, and now have a backtrace that I can
provide related to this problem on 6.4-RC2:

surfer# kgdb /sys/i386/compile/GENERIC/kernel.debug /var/crash/vmcore.1
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd"...

Unread portion of the kernel message buffer:
acd0: WARNING - READ_TOC read data overrun 18>12
kernel trap 12 with interrupts disabled

Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x78
fault code              = supervisor read, page not present
instruction pointer     = 0x20:0xc06d39b9
stack pointer           = 0x28:0xca865c10
frame pointer           = 0x28:0xca865c14
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = resume, IOPL = 0
current process         = 19 (swi6: task queue)
trap number             = 12
panic: page fault
Uptime: 16m20s
Physical memory: 179 MB
Dumping 53 MB: 38 22 6

Reading symbols from /boot/kernel/snd_maestro.ko...done.
Loaded symbols for /boot/kernel/snd_maestro.ko
Reading symbols from /boot/kernel/sound.ko...done.
Loaded symbols for /boot/kernel/sound.ko
Reading symbols from /boot/kernel/acpi.ko...done.
Loaded symbols for /boot/kernel/acpi.ko
Reading symbols from /boot/kernel/mach64.ko...done.
Loaded symbols for /boot/kernel/mach64.ko
Reading symbols from /boot/kernel/drm.ko...done.
Loaded symbols for /boot/kernel/drm.ko
#0  doadump () at pcpu.h:165
165     pcpu.h: No such file or directory.
        in pcpu.h
(kgdb) bt
#0  doadump () at pcpu.h:165
#1  0xc06b2e3e in boot (howto=260) at ../../../kern/kern_shutdown.c:410
#2  0xc06b30d4 in panic (fmt=0xc098be6b "%s")
    at ../../../kern/kern_shutdown.c:566
#3  0xc092b1f4 in trap_fatal (frame=0xca865bd0, eva=120)
    at ../../../i386/i386/trap.c:838
#4  0xc092a992 in trap (frame=
      {tf_fs = 8, tf_es = -1038352344, tf_ds = -1038352344,
       tf_edi = -1033627044, tf_esi = -1038289792, tf_ebp = -897164268,
       tf_isp = -897164292, tf_ebx = -1039268288, tf_edx = 0, tf_ecx = 4,
       tf_eax = -1038289760, tf_trapno = 12, tf_err = 0,
       tf_eip = -1066583623, tf_cs = 32, tf_eflags = 589826,
       tf_esp = -1038289792, tf_ss = -897164232})
    at ../../../i386/i386/trap.c:270
#5  0xc0917e2a in calltrap () at ../../../i386/i386/exception.s:139
#6  0xc06d39b9 in turnstile_setowner (ts=0xc20e0640, owner=0x4)
    at ../../../kern/subr_turnstile.c:456
#7  0xc06d3d16 in turnstile_wait (lock=0xc2641aa8, owner=0x4, queue=0)
    at ../../../kern/subr_turnstile.c:661
#8  0xc06a9d2a in _mtx_lock_sleep (m=0xc2641aa8, tid=3256677504, opts=0,
    file=0x0, line=0) at ../../../kern/kern_mutex.c:579
#9  0xc06b2492 in _sema_post (sema=0xc2641aa8, file=0x0, line=0)
    at ../../../kern/kern_sema.c:79
#10 0xc04e7c26 in ata_completed (context=0xc2641a5c, dummy=1)
    at ../../../dev/ata/ata-queue.c:481
---Type <return> to continue, or q <return> to quit---
#11 0xc06d29a3 in taskqueue_run (queue=0xc21c4100)
    at ../../../kern/subr_taskqueue.c:257
#12 0xc06d2bb6 in taskqueue_swi_run (dummy=0x0)
    at ../../../kern/subr_taskqueue.c:299
#13 0xc069baad in ithread_execute_handlers (p=0xc21ce860, ie=0xc21c4080)
    at ../../../kern/kern_intr.c:682
#14 0xc069bbc8 in ithread_loop (arg=0xc214cb60)
    at ../../../kern/kern_intr.c:766
#15 0xc069aa34 in fork_exit (callout=0xc069bb74 , arg=0xc214cb60,
    frame=0xca865d38) at ../../../kern/kern_fork.c:788
#16 0xc0917e8c in fork_trampoline () at ../../../i386/i386/exception.s:208
(kgdb) print panicstr
$1 = 0xc0a8d480 "page fault"
(kgdb)

This panic happened just a few minutes after bootup completed,
without logging on.

Also, I've noticed that sometimes when the panic happens, savecore(8)
is unable to recover the core dump from the swap area. On bootup, the
system seems to engage the swap partition well before savecore(8) has
a chance to scan it. So I wondered whether, once swap is engaged,
something gets written to the swap partition that overwrites the
signature savecore(8) checks to detect the existence of a core dump?

To recap (in case this doesn't get attached to the original thread):
as I said in the original message, another odd thing about this is
that usually, after it crashes on the first (and sometimes second)
cold boot, it remains stable until the machine is shut down. And as I
think I also mentioned, once logged into GNOME, I see a "blank disc"
icon flash on and off on the desktop, as if the system detects a CD
in the drive, so that might be related. That said, this also happened
with 6.3 + GNOME, and as I said, the system didn't panic there.

Thanks,
- rory
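
A note for reading the backtrace: kgdb prints the trap frame fields in frame #4 as signed 32-bit decimals and eva in frame #3 as plain decimal, which makes them hard to compare with the hex values in the "Fatal trap 12" block. The following is a minimal Python sketch, not part of the original kgdb session, that converts a few of those values. The helper name u32_hex is hypothetical; the numbers are copied from the backtrace.

```python
def u32_hex(value):
    """Render a (possibly negative) 32-bit integer as unsigned hex."""
    return "0x%08x" % (value & 0xFFFFFFFF)

# Values copied from frame #4 of the backtrace (signed decimals).
frame = {
    "tf_eip": -1066583623,
    "tf_ebp": -897164268,
}

for name, value in frame.items():
    print(name, "=", u32_hex(value))
# tf_eip = 0xc06d39b9
# tf_ebp = 0xca865c14

# eva from frame #3 is the faulting address, printed in decimal.
print("eva =", hex(120))
# eva = 0x78
```

The converted tf_eip matches the reported instruction pointer (0xc06d39b9, which is also the address of frame #6 in turnstile_setowner), tf_ebp matches the reported frame pointer, and eva=120 is the reported fault virtual address 0x78.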