From owner-freebsd-current@FreeBSD.ORG Tue Dec 20 16:42:26 2011 Return-Path: Delivered-To: freebsd-current@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 075AC106566C; Tue, 20 Dec 2011 16:42:26 +0000 (UTC) (envelope-from mdf356@gmail.com) Received: from mail-pz0-f54.google.com (mail-pz0-f54.google.com [209.85.210.54]) by mx1.freebsd.org (Postfix) with ESMTP id BDA3B8FC0A; Tue, 20 Dec 2011 16:42:25 +0000 (UTC) Received: by dakp5 with SMTP id p5so6259628dak.13 for ; Tue, 20 Dec 2011 08:42:25 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; bh=mNUudw+a+ceD+iQgLNVYTEoAF3bCCXTnO+q8V2xCjAE=; b=Dby2f5r+hLwVE6zGUmHEey0HFwm8peOE1Y33K3YiWkOxC3jBzAttKXWA64UPisj42r 5vjhyeBs3BhkgNgJycvODH3pMs4PwtGaU6564lmDW8l2Z1b+kKP13sbwHS6LyqH2VMWk Q7Hcp9aGd0Hf8ndB7Trltw2P1QxHyfJbFqjBw= MIME-Version: 1.0 Received: by 10.68.75.163 with SMTP id d3mr4335720pbw.88.1324399345167; Tue, 20 Dec 2011 08:42:25 -0800 (PST) Sender: mdf356@gmail.com Received: by 10.68.55.136 with HTTP; Tue, 20 Dec 2011 08:42:25 -0800 (PST) In-Reply-To: <201112200932.21223.jhb@freebsd.org> References: <4EED2F1C.2060409@zedat.fu-berlin.de> <201112200852.23300.jhb@freebsd.org> <201112200932.21223.jhb@freebsd.org> Date: Tue, 20 Dec 2011 08:42:25 -0800 X-Google-Sender-Auth: 5tWe4cu3Mrgv3ioHTEmlyDYjlSI Message-ID: From: mdf@FreeBSD.org To: John Baldwin Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: Robert Watson , freebsd-current@freebsd.org, "O. Hartmann" Subject: Re: Sleeping thread (tid 100033, pid 16): panic in FreeBSD 10.0-CURRENT/amd64 r228662 X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 20 Dec 2011 16:42:26 -0000 On Tue, Dec 20, 2011 at 6:32 AM, John Baldwin wrote: > On Tuesday, December 20, 2011 9:22:48 am mdf@freebsd.org wrote: >> On Tue, Dec 20, 2011 at 5:52 AM, John Baldwin wrote: >> > On Saturday, December 17, 2011 10:41:15 pm mdf@freebsd.org wrote: >> >> On Sat, Dec 17, 2011 at 5:45 PM, Alexander Kabaev = wrote: >> >> > On Sun, 18 Dec 2011 01:09:00 +0100 >> >> > "O. Hartmann" wrote: >> >> > >> >> >> Sleeping thread (tid 100033, pid 16) owns a non sleepable lock >> >> >> panic: sleeping thread >> >> >> cpuid =3D 0 >> >> >> >> >> >> PID 16 is always USB on my box. >> >> > >> >> > You really need to give us a backtrace when you quote panics. It is >> >> > impossible to make any sense of the above panic message without mor= e >> >> > context. >> >> >> >> In the case of this panic, the stack of the thread which panics is >> >> useless; it's someone trying to propagate priority that discovered it= . >> >> =A0A backtrace on tid 100033 would be useful. >> >> >> >> With WITNESS enabled, it's possible to have this panic display the >> >> stack of the incorrectly sleeping thread at the time it acquired the >> >> lock, as well, but this code isn't in CURRENT or any release. =A0I ha= ve >> >> a patch at $WORK I can dig up on Monday. >> > >> > Huh? =A0The stock kernel dumps a stack trace of the offending thread i= f you have >> > DDB enabled: >> > >> > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0/* >> > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 * If the thread is asleep, then we are= probably about >> > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 * to deadlock. =A0To make debugging th= is easier, just >> > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 * panic and tell the user which thread= misbehaved so >> > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 * they can hopefully get a stack trace= from the truly >> > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 * misbehaving thread. >> > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 */ >> > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0if (TD_IS_SLEEPING(td)) { >> > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0printf( >> > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0"Sleeping thread (tid %d, pid %d) owns = a non-sleepable lock\n", >> > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0td->td_tid, td-= >td_proc->p_pid); >> > #ifdef DDB >> > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0db_trace_thread(td, -1)= ; >> > #endif >> > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0panic("sleeping thread"= ); >> > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0} >> >> Hmm, maybe this wasn't in 7, or maybe I'm just remembering that we >> added code to print *which* lock it holds (using WITNESS data). =A0I do >> recall that this panic alone was often not sufficient to debug the >> problem. > > I think the db_trace_thread() has been around for a while (since 5 or 6), > but it is true that we don't tell you which lock is held even with this. > That might be a useful thing to output before the panic. This patch isn't quite right since I had to hand-edit it. There's a small chance I can commit this in the near future, but of someone else wants to take it, feel free. Style isn't yet fixed up to be FreeBSD standard either. --- /data/sb/bsd.git/sys/kern/subr_turnstile.c 2011-12-12 10:23:12.542196632 -0800 +++ kern/subr_turnstile.c 2011-12-09 10:59:29.882643558 -0800 @@ -165,10 +165,43 @@ static void turnstile_dtor(void *mem, int size, void *arg); #endif static int turnstile_init(void *mem, int size, int flags); static void turnstile_fini(void *mem, int size); +#ifdef INVARIANTS +static void +sleeping_thread_owns_a_nonsleepable_lock(struct thread *td) +{ + printf("Sleeping thread (tid %d, pid %d) owns a non-sleepable lock\n", + td->td_tid, td->td_proc->p_pid); +#ifdef DDB + db_trace_thread(td, -1); +#endif +#ifdef WITNESS + struct lock_list_entry *lock_list, *lle; + int i; + + lock_list =3D td->td_sleeplocks; + if (lock_list =3D=3D NULL || lock_list->ll_count =3D=3D 0) { + printf("Thread does not appear to hold any mutexes!\n"); + return; + } + + for (lle =3D lock_list; lle !=3D NULL; lle =3D lle->ll_next) { + for (i =3D lle->ll_count - 1; i >=3D 0; i--) { + struct lock_instance *li =3D &lle->ll_children[i]; + + printf("Lock %s acquired at %s:%d\n", + li->li_lock->lo_name, li->li_file, li->li_line); + } + } +#endif /* WITNESS */ +} +#else +#define sleeping_thread_owns_a_nonsleepable_lock(td) do { } while (0) +#endif /* INVARIANTS */ + /* * Walks the chain of turnstiles and their owners to propagate the priorit= y * of the thread being blocked to all the threads holding locks that have = to * release their locks before this thread can run again. */ @@ -210,19 +243,31 @@ * If the thread is asleep, then we are probably about * to deadlock. To make debugging this easier, just * panic and tell the user which thread misbehaved so * they can hopefully get a stack trace from the truly * misbehaving thread. */ if (TD_IS_SLEEPING(td)) { - printf( - "Sleeping thread (tid %d, pid %d) owns a non-sleepable lock\n", - td->td_tid, td->td_proc->p_pid); -#ifdef DDB - db_trace_thread(td, -1); -#endif - panic("sleeping thread"); + sleeping_thread_owns_a_nonsleepable_lock(td); + panic("sleeping thread %p owns a nonsleepable lock", + td); } /* * If this thread already has higher priority than the * thread that is being blocked, we are finished. Cheers, matthew