From owner-freebsd-hackers@FreeBSD.ORG Tue Jul 16 15:33:37 2013
From: John Baldwin
To: freebsd-hackers@freebsd.org
Cc: Yuri
Subject: Re: Kernel crashes after sleep: how to debug?
Date: Tue, 16 Jul 2013 11:07:37 -0400
Message-Id: <201307161107.37460.jhb@freebsd.org>
References: <51E3A334.8020203@rawbw.com>
In-Reply-To: <51E3A334.8020203@rawbw.com>

On Monday, July 15, 2013 3:22:28 am Yuri wrote:
>
> After sleep/wakeup cycle my 9.1-STABLE r253105 amd64 system has a
> tendency to sometimes randomly crash after a while. It doesn't happen
> every time.
> See kgdb log below. I am not sure there is enough information to lead to
> the cause of the issue.
>
> It looks like it crashes near the line:
> #7 0xffffffff8091a181 in _mtx_trylock (m=0x100000000, opts=0,
> file=, line=0) at /usr/src/sys/kern/kern_mutex.c:295
> 295 if (SCHEDULER_STOPPED())
> Current language: auto; currently c
> (kgdb) l
> 290 uint64_t waittime = 0;
> 291 int contested = 0;
> 292 #endif
> 293 int rval;
> 294
> 295 if (SCHEDULER_STOPPED())
> 296 return (1);
> 297
> 298 KASSERT(m->mtx_lock != MTX_DESTROYED,
> 299 ("mtx_trylock() of destroyed mutex @ %s:%d", file,
> line));
>
> Current thread was:
> * 67 Thread 100064 (PID=5: pagedaemon) doadump (textdump=<value
> optimized out>) at pcpu.h:234
>
> How to find the cause of the crash?
>
> Yuri
>
>
> --- kgdb log ---
> # kgdb /boot/kernel/kernel vmcore.0
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain
> conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for details.
> This GDB was configured as "amd64-marcel-freebsd"...
>
> Unread portion of the kernel message buffer:
>
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 0; apic id = 00
> fault virtual address = 0x100000018
> fault code = supervisor read data, page not present
> instruction pointer = 0x20:0xffffffff8091a181
> stack pointer = 0x28:0xffffff80d51c6ab0
> frame pointer = 0x28:0xffffff80d51c6ad0
> code segment = base 0x0, limit 0xfffff, type 0x1b
> = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags = interrupt enabled, resume, IOPL = 0
> current process = 5 (pagedaemon)
> trap number = 12
> panic: page fault
> cpuid = 0
> KDB: stack backtrace:
> #0 0xffffffff80968416 at kdb_backtrace+0x66
> #1 0xffffffff8092e43e at panic+0x1ce
> #2 0xffffffff80d12940 at trap_fatal+0x290
> #3 0xffffffff80d12ca1 at trap_pfault+0x211
> #4 0xffffffff80d13254 at trap+0x344
> #5 0xffffffff80cfc583 at calltrap+0x8
> #6 0xffffffff80baea78 at vm_pageout+0x998
> #7 0xffffffff808fc10f at fork_exit+0x11f
> #8 0xffffffff80cfcaae at fork_trampoline+0xe
> Uptime: 2h21m27s
> Dumping 407 out of 2919 MB:..4%..12%..24%..32%..44%..52%..63%..71%..83%..91%
>
> Reading symbols from /boot/modules/cuse4bsd.ko...done.
> Loaded symbols for /boot/modules/cuse4bsd.ko
> Reading symbols from /boot/kernel/linux.ko...Reading symbols from
> /boot/kernel/linux.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/linux.ko
> Reading symbols from /usr/local/libexec/linux_adobe/linux_adobe.ko...done.
> Loaded symbols for /usr/local/libexec/linux_adobe/linux_adobe.ko
> Reading symbols from /boot/kernel/radeon.ko...Reading symbols from
> /boot/kernel/radeon.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/radeon.ko
> Reading symbols from /boot/kernel/drm.ko...Reading symbols from
> /boot/kernel/drm.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/drm.ko
> #0 doadump (textdump=) at pcpu.h:234
> 234 pcpu.h: No such file or directory.
> in pcpu.h
> (kgdb) bt
> #0 doadump (textdump=) at pcpu.h:234
> #1 0xffffffff8092df16 in kern_reboot (howto=260) at
> /usr/src/sys/kern/kern_shutdown.c:449
> #2 0xffffffff8092e417 in panic (fmt=0x1
> ) at
> /usr/src/sys/kern/kern_shutdown.c:637
> #3 0xffffffff80d12940 in trap_fatal (frame=0xc, eva=<value optimized
> out>) at /usr/src/sys/amd64/amd64/trap.c:879
> #4 0xffffffff80d12ca1 in trap_pfault (frame=0xffffff80d51c6a00,
> usermode=0) at /usr/src/sys/amd64/amd64/trap.c:795
> #5 0xffffffff80d13254 in trap (frame=0xffffff80d51c6a00) at
> /usr/src/sys/amd64/amd64/trap.c:463
> #6 0xffffffff80cfc583 in calltrap () at
> /usr/src/sys/amd64/amd64/exception.S:232
> #7 0xffffffff8091a181 in _mtx_trylock (m=0x100000000, opts=0,
> file=, line=0) at /usr/src/sys/kern/kern_mutex.c:295
> #8 0xffffffff80baea78 in vm_pageout () at /usr/src/sys/vm/vm_pageout.c:829
> #9 0xffffffff808fc10f in fork_exit (callout=0xffffffff80bae0e0
> , arg=0x0, frame=0xffffff80d51c6c40)
> at /usr/src/sys/kern/kern_fork.c:988
> #10 0xffffffff80cfcaae in fork_trampoline () at
> /usr/src/sys/amd64/amd64/exception.S:606
> #11 0x0000000000000000 in ?? ()

Can you go to frame 8 and do 'l' in kgdb?

--
John Baldwin