From owner-freebsd-current@FreeBSD.ORG Fri Jun 6 14:23:52 2014
Subject: Re: panic in deadlkres() on r267110
From: Sean Bruno
Reply-To: sbruno@freebsd.org
To: Glen Barber
Cc: freebsd-current@FreeBSD.org
In-Reply-To: <20140606141215.GE33882@hub.FreeBSD.org>
References: <20140606141215.GE33882@hub.FreeBSD.org>
Date: Fri, 06 Jun 2014 07:23:49 -0700
Message-ID: <1402064629.1123.46.camel@bruno>

On Fri, 2014-06-06 at 10:12 -0400, Glen Barber wrote:
> Two machines in the cluster panic last night with the same backtrace.
> It is unclear yet exactly what was happening on the systems, but both
> are port building machines using ports-mgmt/tinderbox.
>
> Any ideas or information on how to further debug this would be
> appreciated.
>

These machines were happily running r266621 prior to yesterday's
update. So, that gives us a bisection range: r266621 (good) to
r267110 (bad).

sean

> Script started on Fri Jun 6 14:01:53 2014
> command: /bin/sh
> # uname -a
> FreeBSD redbuild04.nyi.freebsd.org 11.0-CURRENT FreeBSD 11.0-CURRENT #1 r267110: Thu Jun 5 15:57:43 UTC 2014 sbruno@redbuild04.nyi.freebsd.org:/usr/obj/usr/src/sys/REDBUILD amd64
> # kgdb ./kernel.debug /var/crash/vmcore.0
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for details.
> This GDB was configured as "amd64-marcel-freebsd"...
>
> Unread portion of the kernel message buffer:
> panic: deadlkres: possible deadlock detected on allproc_lock
>
> cpuid = 17
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe1838702a20
> kdb_backtrace() at kdb_backtrace+0x39/frame 0xfffffe1838702ad0
> panic() at panic+0x155/frame 0xfffffe1838702b50
> deadlkres() at deadlkres+0x42a/frame 0xfffffe1838702bb0
> fork_exit() at fork_exit+0x9a/frame 0xfffffe1838702bf0
> fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe1838702bf0
> --- trap 0, rip = 0, rsp = 0xfffffe1838702cb0, rbp = 0 ---
> KDB: enter: panic
>
> Reading symbols from /boot/kernel/zfs.ko.symbols...done.
> Loaded symbols for /boot/kernel/zfs.ko.symbols
> Reading symbols from /boot/kernel/opensolaris.ko.symbols...done.
> Loaded symbols for /boot/kernel/opensolaris.ko.symbols
> Reading symbols from /boot/kernel/ums.ko.symbols...done.
> Loaded symbols for /boot/kernel/ums.ko.symbols
> Reading symbols from /boot/kernel/linprocfs.ko.symbols...done.
> Loaded symbols for /boot/kernel/linprocfs.ko.symbols
> Reading symbols from /boot/kernel/linux.ko.symbols...done.
> Loaded symbols for /boot/kernel/linux.ko.symbols
> #0 doadump (textdump=-946873840) at pcpu.h:219
> 219         __asm("movq %%gs:%1,%0" : "=r" (td)
> (kgdb) bt
> #0 doadump (textdump=-946873840) at pcpu.h:219
> #1 0xffffffff8034e865 in db_fncall (dummy1=,
>     dummy2=, dummy3=,
>     dummy4=) at /usr/src/sys/ddb/db_command.c:578
> #2 0xffffffff8034e54d in db_command (cmd_table=0x0)
>     at /usr/src/sys/ddb/db_command.c:449
> #3 0xffffffff8034e2c4 in db_command_loop ()
>     at /usr/src/sys/ddb/db_command.c:502
> #4 0xffffffff80350d20 in db_trap (type=, code=0)
>     at /usr/src/sys/ddb/db_main.c:231
> #5 0xffffffff809a9bd9 in kdb_trap (type=3, code=0, tf=)
>     at /usr/src/sys/kern/subr_kdb.c:656
> #6 0xffffffff80dc00e3 in trap (frame=0xfffffe1838702a00)
>     at /usr/src/sys/amd64/amd64/trap.c:551
> #7 0xffffffff80da29c2 in calltrap ()
>     at /usr/src/sys/amd64/amd64/exception.S:231
> #8 0xffffffff809a933e in kdb_enter (why=0xffffffff81039a72 "panic",
>     msg=) at cpufunc.h:63
> #9 0xffffffff8096a8b5 in panic (fmt=)
>     at /usr/src/sys/kern/kern_shutdown.c:749
> #10 0xffffffff8090d16a in deadlkres () at /usr/src/sys/kern/kern_clock.c:203
> #11 0xffffffff8093170a in fork_exit (callout=0xffffffff8090cd40 ,
>     arg=0x0, frame=0xfffffe1838702c00) at /usr/src/sys/kern/kern_fork.c:977
> ---Type to continue, or q to quit---
> #12 0xffffffff80da2efe in fork_trampoline ()
>     at /usr/src/sys/amd64/amd64/exception.S:605
> #13 0x0000000000000000 in ?? ()
> Current language: auto; currently minimal
> (kgdb) fr 10
> #10 0xffffffff8090d16a in deadlkres () at /usr/src/sys/kern/kern_clock.c:203
> 203                 panic("%s: possible deadlock detected on allproc_lock\n",
> (kgdb) l
> 198          * priority inversion problem leading to starvation.
> 199          * If the lock can't be held after 100 tries, panic.
> 200          */
> 201         if (!sx_try_slock(&allproc_lock)) {
> 202                 if (tryl > 100)
> 203                 panic("%s: possible deadlock detected on allproc_lock\n",
> 204                     __func__);
> 205                 tryl++;
> 206                 pause("allproc", sleepfreq * hz);
> 207                 continue;
> (kgdb) up
> #11 0xffffffff8093170a in fork_exit (callout=0xffffffff8090cd40 ,
>     arg=0x0, frame=0xfffffe1838702c00) at /usr/src/sys/kern/kern_fork.c:977
> 977         callout(arg, frame);
> (kgdb) l
> 972          * cpu_set_fork_handler intercepts this function call to
> 973          * have this call a non-return function to stay in kernel mode.
> 974          * initproc has its own fork handler, but it does return.
> 975          */
> 976         KASSERT(callout != NULL, ("NULL callout in fork_exit"));
> 977         callout(arg, frame);
> 978
> 979         /*
> 980          * Check if a kernel thread misbehaved and returned from its main
> 981          * function.
> (kgdb) list *0xffffffff8090cd40
> 0xffffffff8090cd40 is in deadlkres (/usr/src/sys/kern/kern_clock.c:185).
> 180 static int blktime_threshold = 900;
> 181 static int sleepfreq = 3;
> 182
> 183 static void
> 184 deadlkres(void)
> 185 {
> 186         struct proc *p;
> 187         struct thread *td;
> 188         void *wchan;
> 189         int blkticks, i, slpticks, slptype, tryl, tticks;
> (kgdb) quit
> # ^D
> Script done on Fri Jun 6 14:03:30 2014
>
> Thanks.
>
> Glen
>
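
For anyone reading the listing at kern_clock.c:201-207 above, here is a
minimal userland sketch of that try-lock-with-retry-budget pattern. It is
not the kernel code: watchdog_acquire(), try_lock_fn, the result struct,
and the two helper callbacks are hypothetical names made up for
illustration. The sketch assumes tryl starts at 0 and that every attempt
fails back to back (the listing shows neither the initialization nor any
reset). The constants 100 and 3 come from the listing (the "tryl > 100"
test and "sleepfreq = 3"); pause() is replaced by a counter so nothing
actually sleeps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Values taken from the kgdb listing; the enum names are hypothetical. */
enum { MAX_TRIES = 100, SLEEPFREQ_SECONDS = 3 };

/* Outcome of one acquisition attempt sequence (hypothetical type). */
struct watchdog_result {
	bool gave_up;        /* true where the kernel code would call panic() */
	int failed_attempts; /* try-lock failures seen before stopping */
	int seconds_waited;  /* total simulated pause time */
};

/* Hypothetical stand-in for sx_try_slock(): true means lock acquired. */
typedef bool (*try_lock_fn)(void *ctx);

/* Retry the try-lock, giving up once the retry budget is exhausted. */
struct watchdog_result
watchdog_acquire(try_lock_fn try_lock, void *ctx)
{
	struct watchdog_result r = { false, 0, 0 };
	int tryl = 0; /* assumption: the counter starts at zero */

	assert(try_lock != NULL);
	while (!try_lock(ctx)) {
		r.failed_attempts++;
		if (tryl > MAX_TRIES) {
			r.gave_up = true; /* kernel: panic("... allproc_lock") */
			return r;
		}
		tryl++;
		/* Stands in for pause("allproc", sleepfreq * hz). */
		r.seconds_waited += SLEEPFREQ_SECONDS;
	}
	return r;
}

/* Helper: the lock becomes available after *ctx failed attempts. */
bool
lock_free_after(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return false;
	}
	return true;
}

/* Helper: the lock is never available, as in a real deadlock. */
bool
lock_never_free(void *ctx)
{
	(void)ctx;
	return false;
}
```

Under those assumptions, watchdog_acquire(lock_never_free, NULL) gives up
on the 102nd failed attempt, after 303 simulated seconds. So the panic
suggests allproc_lock stayed unavailable for roughly five minutes, which is
slightly more than the "100 tries" the source comment mentions.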