From: John Baldwin
To: mdf@freebsd.org
Cc: Robert Watson, freebsd-current@freebsd.org, "O. Hartmann"
Date: Tue, 20 Dec 2011 09:32:21 -0500
Subject: Re: Sleeping thread (tid 100033, pid 16): panic in FreeBSD 10.0-CURRENT/amd64 r228662
Message-Id: <201112200932.21223.jhb@freebsd.org>

On Tuesday, December 20, 2011 9:22:48 am mdf@freebsd.org wrote:
> On Tue, Dec 20, 2011 at 5:52 AM, John Baldwin wrote:
> > On Saturday, December 17, 2011 10:41:15 pm mdf@freebsd.org wrote:
> >> On Sat, Dec 17, 2011 at 5:45 PM, Alexander Kabaev wrote:
> >> > On Sun, 18 Dec 2011 01:09:00 +0100
> >> > "O. Hartmann" wrote:
> >> >
> >> >> Sleeping thread (tid 100033, pid 16) owns a non sleepable lock
> >> >> panic: sleeping thread
> >> >> cpuid = 0
> >> >>
> >> >> PID 16 is always USB on my box.
> >> >
> >> > You really need to give us a backtrace when you quote panics. It is
> >> > impossible to make any sense of the above panic message without more
> >> > context.
> >>
> >> In the case of this panic, the stack of the thread which panics is
> >> useless; it's someone trying to propagate priority that discovered it.
> >> A backtrace on tid 100033 would be useful.
> >>
> >> With WITNESS enabled, it's possible to have this panic display the
> >> stack of the incorrectly sleeping thread at the time it acquired the
> >> lock, as well, but this code isn't in CURRENT or any release. I have
> >> a patch at $WORK I can dig up on Monday.
> >
> > Huh? The stock kernel dumps a stack trace of the offending thread if you have
> > DDB enabled:
> >
> >         /*
> >          * If the thread is asleep, then we are probably about
> >          * to deadlock. To make debugging this easier, just
> >          * panic and tell the user which thread misbehaved so
> >          * they can hopefully get a stack trace from the truly
> >          * misbehaving thread.
> >          */
> >         if (TD_IS_SLEEPING(td)) {
> >                 printf(
> >                 "Sleeping thread (tid %d, pid %d) owns a non-sleepable lock\n",
> >                     td->td_tid, td->td_proc->p_pid);
> > #ifdef DDB
> >                 db_trace_thread(td, -1);
> > #endif
> >                 panic("sleeping thread");
> >         }
>
> Hmm, maybe this wasn't in 7, or maybe I'm just remembering that we
> added code to print *which* lock it holds (using WITNESS data). I do
> recall that this panic alone was often not sufficient to debug the
> problem.

I think the db_trace_thread() has been around for a while (since 5 or 6),
but it is true that we don't tell you which lock is held even with this.
That might be a useful thing to output before the panic.

-- 
John Baldwin
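
To illustrate the idea of reporting which lock is held, here is a minimal user-space sketch. It is not kernel code and not a proposed patch: the `sim_thread` and `sim_lock` types and the `sim_check_owner()` function are hypothetical stand-ins invented for this example, and the lock name is simply a string stored with the lock. The sketch only mirrors the shape of the quoted check and appends the lock name to the report.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical user-space stand-ins; not the kernel's thread or lock types. */
enum sim_state { SIM_RUNNING, SIM_SLEEPING };

struct sim_thread {
	int tid;
	int pid;
	enum sim_state state;
};

struct sim_lock {
	const char *name;               /* label used only for diagnostics */
	const struct sim_thread *owner; /* NULL when the lock is unowned */
};

/*
 * Mirrors the shape of the quoted check: if the owner of a
 * non-sleepable lock is asleep, format a report into buf and return 1
 * (the point at which the kernel would panic). Returns 0 otherwise and
 * leaves buf empty. Unlike the quoted code, the report names the lock.
 */
static int
sim_check_owner(const struct sim_lock *lk, char *buf, size_t len)
{
	assert(lk != NULL && buf != NULL && len > 0);
	buf[0] = '\0';
	if (lk->owner == NULL || lk->owner->state != SIM_SLEEPING)
		return (0);
	snprintf(buf, len,
	    "Sleeping thread (tid %d, pid %d) owns a non-sleepable lock: %s",
	    lk->owner->tid, lk->owner->pid, lk->name);
	return (1);
}
```

A caller would fill in a `sim_lock` whose owner is marked `SIM_SLEEPING`, call `sim_check_owner()` with a buffer, and print the buffer when the function returns 1. In a real kernel the lock name would have to come from the lock object or from WITNESS data; how that is obtained is outside what this sketch assumes.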