From: Marius Nünnerich
Date: Wed, 7 Jul 2010 14:14:42 +0200
To: Attilio Rao
Cc: freebsd-current@freebsd.org, Bryan Venteicher
Subject: Re: deadlkres() panic
References: <744734406.21.1277969273426.JavaMail.root@sage.daemoninthecloset.org> <269478215.24.1277969553870.JavaMail.root@sage.daemoninthecloset.org>
List-Id: Discussions about the use of FreeBSD-current

On Wed, Jul 7, 2010 at 14:01, Attilio Rao wrote:
> 2010/7/1 Bryan Venteicher:
>> On a recent -current, I got the following panic from deadlkres:
>>
>> Assertion wchan != NULL failed at /usr/src-nfs/sys/kern/subr_sleepqueue.c:680
>>
>> Tracing pid 0 tid 100058 td 0xffffff00024bf7a0
>> kdb_enter() at kdb_enter+0x3d
>> panic() at panic+0x176
>> sleepq_type() at sleepq_type+0x56
>> deadlkres() at deadlkres+0x224
>> fork_exit() at fork_exit+0x12a
>> fork_trampoline() at fork_trampoline+0xe
>> --- trap 0, rip = 0, rsp = 0xffffff8074976d30, rbp = 0 ---
>> (Hand transcribed, doadump() hung)
>>
>> deadlkres() came across a TD_IS_SLEEPING()'ing thread that was not on a
>> sleepqueue (i.e., td->td_wchan == NULL).
>>
>> I don't think this is an invalid state for a thread to be in: After adding itself
>> to a sleepq and setting a timeout, the thread calls sleepq_timedwait_sig().
>> sleepq_catch_signals() determines there is a signal pending so it removes the
>> thread from the sleepq via sleepq_resume_thread(). Returning to
>> sleepq_timedwait_sig(), in the call to sleepq_check_timeout(), the thread is
>> unable to cancel the timeout because it is already firing (likely waiting on
>> thread_lock()). So the thread calls TD_SET_SLEEPING() followed by mi_switch().
>> deadlkres() then picks up thread_lock(), finding td is TD_IS_SLEEPING() &&
>> !TD_ON_SLEEPQ().
>>
>> The attached patch takes care of the panic for me.
>
> I think that your analysis and patch are both fine and are committed,
> along with a small cleanup, as r209761.

Thank you both, I guess I had that panic a few days ago. Updating right now.
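
For anyone skimming the thread, here is a rough userland sketch of the state Bryan describes: a thread that is marked sleeping but has no wait channel. It is only a model, not the change committed in r209761. The struct, the field names, and the model_* functions are all made up for illustration, and I am assuming the scanner should simply skip such a thread instead of asserting on it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userland stand-in for the kernel's struct thread. */
struct model_thread {
    bool  sleeping;     /* models TD_IS_SLEEPING(td) */
    void *wchan;        /* models td->td_wchan; NULL means not on a sleepqueue */
    int   slept_ticks;  /* how long the thread has been asleep */
};

/* Models TD_ON_SLEEPQ(td): the thread is queued on some wait channel. */
static bool model_on_sleepq(const struct model_thread *td)
{
    return td->wchan != NULL;
}

/*
 * Returns true if a deadlock scanner should flag the thread.
 * A sleeping thread with no wait channel (the racing case from the
 * analysis above) is skipped rather than asserted on.
 */
static bool model_is_suspect(const struct model_thread *td, int threshold)
{
    assert(td != NULL);
    if (!td->sleeping || !model_on_sleepq(td))
        return false;
    return td->slept_ticks > threshold;
}
```

With this check, a thread caught between sleepq_resume_thread() and mi_switch() is ignored on that pass, while a thread that really is parked on a wait channel past the threshold is still reported.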