Date:      Tue, 20 Dec 2011 06:22:48 -0800
From:      mdf@FreeBSD.org
To:        John Baldwin <jhb@freebsd.org>
Cc:        Robert Watson <rwatson@freebsd.org>, freebsd-current@freebsd.org, "O. Hartmann" <ohartman@zedat.fu-berlin.de>
Subject:   Re: Sleeping thread (tid 100033, pid 16): panic in FreeBSD 10.0-CURRENT/amd64 r228662
Message-ID:  <CAMBSHm_ZcMe2uC6HXL9vazYOxVSVVKJqmfHCHXRta8rgdda65w@mail.gmail.com>
In-Reply-To: <201112200852.23300.jhb@freebsd.org>
References:  <4EED2F1C.2060409@zedat.fu-berlin.de> <20111217204514.2fa77ea2@kan.dyndns.org> <CAMBSHm_MHAhTMafuHkMh_CAdOcU4zgJUgbzTNhLvajDFSp45UA@mail.gmail.com> <201112200852.23300.jhb@freebsd.org>

On Tue, Dec 20, 2011 at 5:52 AM, John Baldwin <jhb@freebsd.org> wrote:
> On Saturday, December 17, 2011 10:41:15 pm mdf@freebsd.org wrote:
>> On Sat, Dec 17, 2011 at 5:45 PM, Alexander Kabaev <kabaev@gmail.com> wrote:
>> > On Sun, 18 Dec 2011 01:09:00 +0100
>> > "O. Hartmann" <ohartman@zedat.fu-berlin.de> wrote:
>> >
>> >> Sleeping thread (tid 100033, pid 16) owns a non sleepable lock
>> >> panic: sleeping thread
>> >> cpuid = 0
>> >>
>> >> PID 16 is always USB on my box.
>> >
>> > You really need to give us a backtrace when you quote panics. It is
>> > impossible to make any sense of the above panic message without more
>> > context.
>>
>> In the case of this panic, the stack of the thread which panics is
>> useless; it's someone trying to propagate priority that discovered it.
>> A backtrace on tid 100033 would be useful.
>>
>> With WITNESS enabled, it's possible to have this panic display the
>> stack of the incorrectly sleeping thread at the time it acquired the
>> lock, as well, but this code isn't in CURRENT or any release.  I have
>> a patch at $WORK I can dig up on Monday.
>
> Huh?  The stock kernel dumps a stack trace of the offending thread if you have
> DDB enabled:
>
>                /*
>                 * If the thread is asleep, then we are probably about
>                 * to deadlock.  To make debugging this easier, just
>                 * panic and tell the user which thread misbehaved so
>                 * they can hopefully get a stack trace from the truly
>                 * misbehaving thread.
>                 */
>                if (TD_IS_SLEEPING(td)) {
>                        printf(
>                "Sleeping thread (tid %d, pid %d) owns a non-sleepable lock\n",
>                            td->td_tid, td->td_proc->p_pid);
> #ifdef DDB
>                        db_trace_thread(td, -1);
> #endif
>                        panic("sleeping thread");
>                }

Hmm, maybe this wasn't in 7, or maybe I'm just remembering that we
added code to print *which* lock it holds (using WITNESS data).  I do
recall that this panic alone was often not sufficient to debug the
problem.

Thanks,
matthew


> It may be that we can make use of the STACK API here instead to output this
> trace even when DDB isn't enabled.  The patch below tries to do that
> (untested).  It does some odd things though since it is effectively running
> from a panic context already, so it uses a statically allocated 'struct stack'
> rather than using stack_create() and uses stack_print_ddb() since it is
> holding spin locks and can't possibly grab an sx lock:
>
> Index: subr_turnstile.c
> ===================================================================
> --- subr_turnstile.c    (revision 228534)
> +++ subr_turnstile.c    (working copy)
> @@ -72,6 +72,7 @@ __FBSDID("$FreeBSD$");
>  #include <sys/proc.h>
>  #include <sys/queue.h>
>  #include <sys/sched.h>
> +#include <sys/stack.h>
>  #include <sys/sysctl.h>
>  #include <sys/turnstile.h>
>
> @@ -175,6 +176,7 @@ static void turnstile_fini(void *mem, int size);
>  static void
>  propagate_priority(struct thread *td)
>  {
> +       static struct stack st;
>         struct turnstile *ts;
>         int pri;
>
> @@ -217,8 +219,10 @@ propagate_priority(struct thread *td)
>                         printf(
>                 "Sleeping thread (tid %d, pid %d) owns a non-sleepable lock\n",
>                             td->td_tid, td->td_proc->p_pid);
> -#ifdef DDB
> -                       db_trace_thread(td, -1);
> +#ifdef STACK
> +                       stack_zero(&st);
> +                       stack_save_td(&st, td);
> +                       stack_print_ddb(&st);
>  #endif
>                         panic("sleeping thread");
>                 }
>
> --
> John Baldwin
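
For readers who want to try the pattern outside the kernel, here is a minimal userland sketch of the same idea as the quoted patch: keep a statically allocated trace buffer, then zero it, fill it, and print it without allocating a new buffer at failure time. It is only an analogy. The names struct trace_buf, trace_zero(), trace_save(), trace_print() and TRACE_DEPTH are hypothetical. It uses backtrace() and backtrace_symbols_fd() from <execinfo.h> (glibc, or libexecinfo on FreeBSD) rather than the kernel's stack(9) API, and it records the calling thread's own stack, whereas stack_save_td() in the patch records another thread's.

```c
#include <execinfo.h>
#include <unistd.h>

/* Hypothetical depth limit, chosen only for this sketch. */
#define TRACE_DEPTH 18

/* Userland stand-in for the kernel's 'struct stack' (assumed layout). */
struct trace_buf {
	int	 depth;			/* number of valid entries in pcs[] */
	void	*pcs[TRACE_DEPTH];	/* saved return addresses */
};

/*
 * Statically allocated, mirroring 'static struct stack st' in the patch:
 * nothing has to be allocated once the failure has been detected.
 */
static struct trace_buf panic_trace;

/* Analogue of stack_zero(): mark the buffer as empty. */
static void
trace_zero(struct trace_buf *tb)
{
	tb->depth = 0;
}

/*
 * Loose analogue of stack_save_td(). This records the *calling* thread's
 * stack; capturing another thread's stack has no portable userland
 * equivalent.
 */
static void
trace_save(struct trace_buf *tb)
{
	tb->depth = backtrace(tb->pcs, TRACE_DEPTH);
}

/*
 * Loose analogue of stack_print_ddb(): backtrace_symbols_fd() writes
 * straight to a file descriptor instead of returning a malloc()ed array.
 */
static void
trace_print(const struct trace_buf *tb)
{
	backtrace_symbols_fd(tb->pcs, tb->depth, STDERR_FILENO);
}
```

A failure path would call trace_zero(&panic_trace), trace_save(&panic_trace) and trace_print(&panic_trace) in that order before aborting, which is the same three-step sequence the patch adds ahead of panic().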


