From: Glen Barber
Date: Fri, 6 Jun 2014 10:12:15 -0400
To: freebsd-current@FreeBSD.org
Subject: panic in deadlkres() on r267110
Message-ID: <20140606141215.GE33882@hub.FreeBSD.org>

Two machines in the cluster panicked last night with the same backtrace.
It is not yet clear exactly what was happening on the systems, but both
are port building machines using ports-mgmt/tinderbox.

Any ideas or information on how to further debug this would be
appreciated.

Script started on Fri Jun  6 14:01:53 2014
command: /bin/sh
# uname -a
FreeBSD redbuild04.nyi.freebsd.org 11.0-CURRENT FreeBSD 11.0-CURRENT #1 r267110: Thu Jun  5 15:57:43 UTC 2014     sbruno@redbuild04.nyi.freebsd.org:/usr/obj/usr/src/sys/REDBUILD  amd64
# kgdb ./kernel.debug /var/crash/vmcore.0
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:
panic: deadlkres: possible deadlock detected on allproc_lock
cpuid = 17
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe1838702a20
kdb_backtrace() at kdb_backtrace+0x39/frame 0xfffffe1838702ad0
panic() at panic+0x155/frame 0xfffffe1838702b50
deadlkres() at deadlkres+0x42a/frame 0xfffffe1838702bb0
fork_exit() at fork_exit+0x9a/frame 0xfffffe1838702bf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe1838702bf0
--- trap 0, rip = 0, rsp = 0xfffffe1838702cb0, rbp = 0 ---
KDB: enter: panic

Reading symbols from /boot/kernel/zfs.ko.symbols...done.
Loaded symbols for /boot/kernel/zfs.ko.symbols
Reading symbols from /boot/kernel/opensolaris.ko.symbols...done.
Loaded symbols for /boot/kernel/opensolaris.ko.symbols
Reading symbols from /boot/kernel/ums.ko.symbols...done.
Loaded symbols for /boot/kernel/ums.ko.symbols
Reading symbols from /boot/kernel/linprocfs.ko.symbols...done.
Loaded symbols for /boot/kernel/linprocfs.ko.symbols
Reading symbols from /boot/kernel/linux.ko.symbols...done.
Loaded symbols for /boot/kernel/linux.ko.symbols
#0  doadump (textdump=-946873840) at pcpu.h:219
219             __asm("movq %%gs:%1,%0" : "=r" (td)
(kgdb) bt
#0  doadump (textdump=-946873840) at pcpu.h:219
#1  0xffffffff8034e865 in db_fncall (dummy1=,
    dummy2=, dummy3=,
    dummy4=) at /usr/src/sys/ddb/db_command.c:578
#2  0xffffffff8034e54d in db_command (cmd_table=0x0)
    at /usr/src/sys/ddb/db_command.c:449
#3  0xffffffff8034e2c4 in db_command_loop ()
    at /usr/src/sys/ddb/db_command.c:502
#4  0xffffffff80350d20 in db_trap (type=, code=0)
    at /usr/src/sys/ddb/db_main.c:231
#5  0xffffffff809a9bd9 in kdb_trap (type=3, code=0, tf=)
    at /usr/src/sys/kern/subr_kdb.c:656
#6  0xffffffff80dc00e3 in trap (frame=0xfffffe1838702a00)
    at /usr/src/sys/amd64/amd64/trap.c:551
#7  0xffffffff80da29c2 in calltrap ()
    at /usr/src/sys/amd64/amd64/exception.S:231
#8  0xffffffff809a933e in kdb_enter (why=0xffffffff81039a72 "panic",
    msg=) at cpufunc.h:63
#9  0xffffffff8096a8b5 in panic (fmt=)
    at /usr/src/sys/kern/kern_shutdown.c:749
#10 0xffffffff8090d16a in deadlkres () at /usr/src/sys/kern/kern_clock.c:203
#11 0xffffffff8093170a in fork_exit (callout=0xffffffff8090cd40 ,
    arg=0x0, frame=0xfffffe1838702c00) at /usr/src/sys/kern/kern_fork.c:977
---Type to continue, or q to quit---
#12 0xffffffff80da2efe in fork_trampoline ()
    at /usr/src/sys/amd64/amd64/exception.S:605
#13 0x0000000000000000 in ?? ()
Current language:  auto; currently minimal
(kgdb) fr 10
#10 0xffffffff8090d16a in deadlkres () at /usr/src/sys/kern/kern_clock.c:203
203                             panic("%s: possible deadlock detected on allproc_lock\n",
(kgdb) l
198              * priority inversion problem leading to starvation.
199              * If the lock can't be held after 100 tries, panic.
200              */
201             if (!sx_try_slock(&allproc_lock)) {
202                     if (tryl > 100)
203                             panic("%s: possible deadlock detected on allproc_lock\n",
204                                 __func__);
205                     tryl++;
206                     pause("allproc", sleepfreq * hz);
207                     continue;
(kgdb) up
#11 0xffffffff8093170a in fork_exit (callout=0xffffffff8090cd40 ,
    arg=0x0, frame=0xfffffe1838702c00) at /usr/src/sys/kern/kern_fork.c:977
977             callout(arg, frame);
(kgdb) l
972              * cpu_set_fork_handler intercepts this function call to
973              * have this call a non-return function to stay in kernel mode.
974              * initproc has its own fork handler, but it does return.
975              */
976             KASSERT(callout != NULL, ("NULL callout in fork_exit"));
977             callout(arg, frame);
978
979             /*
980              * Check if a kernel thread misbehaved and returned from its main
981              * function.
(kgdb) list *0xffffffff8090cd40
0xffffffff8090cd40 is in deadlkres (/usr/src/sys/kern/kern_clock.c:185).
180     static int blktime_threshold = 900;
181     static int sleepfreq = 3;
182
183     static void
184     deadlkres(void)
185     {
186             struct proc *p;
187             struct thread *td;
188             void *wchan;
189             int blkticks, i, slpticks, slptype, tryl, tticks;
(kgdb) quit
# ^D
Script done on Fri Jun  6 14:03:30 2014

Thanks.

Glen
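
For anyone reading the kern_clock.c excerpt above: the panic comes from a bounded retry loop, where deadlkres() keeps calling sx_try_slock() on allproc_lock and gives up once its counter exceeds 100. A minimal userland sketch of that pattern, useful mainly for reasoning about the counting, might look like the following. Everything in it is a hypothetical illustration and not kernel code: acquire_bounded(), try_lock_fn, struct acquire_result and the two stand-in callbacks are invented names, the sleep between attempts is left out, and giving up returns a result instead of calling panic().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback type: returns true if the lock was acquired. */
typedef bool (*try_lock_fn)(void *ctx);

/* Outcome of one bounded acquisition attempt (hypothetical type). */
struct acquire_result {
	bool acquired;	/* true if the lock was obtained */
	int failures;	/* failed attempts before success or giving up */
};

/*
 * Retry try_lock until it succeeds or the retry counter exceeds
 * max_tries.  The check-then-increment order mirrors lines 201-207
 * of the listing; the kernel panics where this sketch breaks out.
 */
static struct acquire_result
acquire_bounded(try_lock_fn try_lock, void *ctx, int max_tries)
{
	struct acquire_result r = { false, 0 };
	int tryl = 0;

	assert(try_lock != NULL);
	for (;;) {
		if (try_lock(ctx)) {
			r.acquired = true;
			break;
		}
		r.failures++;
		if (tryl > max_tries)
			break;	/* kernel: panic("... possible deadlock ...") */
		tryl++;
		/* kernel: pause("allproc", sleepfreq * hz) before retrying */
	}
	return r;
}

/* Stand-in for a lock that is never released. */
static bool
always_busy(void *ctx)
{
	(void)ctx;
	return false;
}

/* Stand-in for a lock that is busy for *ctx attempts, then free. */
static bool
busy_n_times(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return false;
	}
	return true;
}
```

With a callback that never succeeds and max_tries set to 100, this sketch gives up on the 102nd failed attempt, because the counter is checked before it is incremented. Whether the kernel loop resets tryl elsewhere is not visible in the listing, so that part is not modelled here.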