Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 20 Dec 2011 15:20:09 +0100
From:      Attilio Rao <attilio@freebsd.org>
To:        John Baldwin <jhb@freebsd.org>
Cc:        mdf@freebsd.org, "O. Hartmann" <ohartman@zedat.fu-berlin.de>, freebsd-current@freebsd.org, Robert Watson <rwatson@freebsd.org>
Subject:   Re: Sleeping thread (tid 100033, pid 16): panic in FreeBSD 10.0-CURRENT/amd64 r228662
Message-ID:  <CAJ-FndB7o_vjbJefz1Bxa%2B=DEVZDxBoGPdKcVr5vNHdu-pFEFA@mail.gmail.com>
In-Reply-To: <201112200852.23300.jhb@freebsd.org>
References:  <4EED2F1C.2060409@zedat.fu-berlin.de> <20111217204514.2fa77ea2@kan.dyndns.org> <CAMBSHm_MHAhTMafuHkMh_CAdOcU4zgJUgbzTNhLvajDFSp45UA@mail.gmail.com> <201112200852.23300.jhb@freebsd.org>

next in thread | previous in thread | raw e-mail | index | archive | help
2011/12/20 John Baldwin <jhb@freebsd.org>:
> On Saturday, December 17, 2011 10:41:15 pm mdf@freebsd.org wrote:
>> On Sat, Dec 17, 2011 at 5:45 PM, Alexander Kabaev <kabaev@gmail.com> wro=
te:
>> > On Sun, 18 Dec 2011 01:09:00 +0100
>> > "O. Hartmann" <ohartman@zedat.fu-berlin.de> wrote:
>> >
>> >> Sleeping thread (tid 100033, pid 16) owns a non sleepable lock
>> >> panic: sleeping thread
>> >> cpuid =3D 0
>> >>
>> >> PID 16 is always USB on my box.
>> >
>> > You really need to give us a backtrace when you quote panics. It is
>> > impossible to make any sense of the above panic message without more
>> > context.
>>
>> In the case of this panic, the stack of the thread which panics is
>> useless; it's someone trying to propagate priority that discovered it.
>> =C2=A0A backtrace on tid 100033 would be useful.
>>
>> With WITNESS enabled, it's possible to have this panic display the
>> stack of the incorrectly sleeping thread at the time it acquired the
>> lock, as well, but this code isn't in CURRENT or any release. =C2=A0I ha=
ve
>> a patch at $WORK I can dig up on Monday.
>
> Huh? =C2=A0The stock kernel dumps a stack trace of the offending thread i=
f you have
> DDB enabled:
>
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0/*
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 * If the thread i=
s asleep, then we are probably about
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 * to deadlock. =
=C2=A0To make debugging this easier, just
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 * panic and tell =
the user which thread misbehaved so
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 * they can hopefu=
lly get a stack trace from the truly
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 * misbehaving thr=
ead.
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 */
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0if (TD_IS_SLEEPING=
(td)) {
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0printf(
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0"Sleeping thread (=
tid %d, pid %d) owns a non-sleepable lock\n",
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0td->td_tid, td->td_proc->p_pid);
> #ifdef DDB
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0db_trace_thread(td, -1);
> #endif
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0panic("sleeping thread");
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0}
>
> It may be that we can make use of the STACK API here instead to output th=
is
> trace even when DDB isn't enabled. =C2=A0The patch below tries to do that
> (untested). =C2=A0It does some odd thigns though since it is effectively =
running
> from a panic context already, so it uses a statically allocated 'struct s=
tack'
> rather than using stack_create() and uses stack_print_ddb() since it is
> holding spin locks and can't possibly grab an sx lock:
>
> Index: subr_turnstile.c
> =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
> --- subr_turnstile.c =C2=A0 =C2=A0(revision 228534)
> +++ subr_turnstile.c =C2=A0 =C2=A0(working copy)
> @@ -72,6 +72,7 @@ __FBSDID("$FreeBSD$");
> =C2=A0#include <sys/proc.h>
> =C2=A0#include <sys/queue.h>
> =C2=A0#include <sys/sched.h>
> +#include <sys/stack.h>
> =C2=A0#include <sys/sysctl.h>
> =C2=A0#include <sys/turnstile.h>
>
> @@ -175,6 +176,7 @@ static void turnstile_fini(void *mem, int size);
> =C2=A0static void
> =C2=A0propagate_priority(struct thread *td)
> =C2=A0{
> + =C2=A0 =C2=A0 =C2=A0 static struct stack st;
> =C2=A0 =C2=A0 =C2=A0 =C2=A0struct turnstile *ts;
> =C2=A0 =C2=A0 =C2=A0 =C2=A0int pri;
>
> @@ -217,8 +219,10 @@ propagate_priority(struct thread *td)
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0printf(
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0"Sleeping thread (=
tid %d, pid %d) owns a non-sleepable lock\n",
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0td->td_tid, td->td_proc->p_pid);
> -#ifdef DDB
> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 db_trace_thread(td, -1);
> +#ifdef STACK
> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 stack_zero(&st);
> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 stack_save_td(&st, td);
> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 stack_print_ddb(&st);
> =C2=A0#endif
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0panic("sleeping thread");
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0}
>
> --

I'm not sure it is a wise idea to trimm the DDB part, because it is
much more common than STACK enabled. Note that stack(9) is working if
you define DDB too, so I'd say to do that for both.
Also, I don't think you need the stack_zero() prior to set it.

As we are here, however, I have a question for Robert here: do you
think we should support the _ddb() variant of options even in the case
DDB is not enabled in the kernel?
Probabilly the way it is nowadays is easier to have stack(9) both
defined for DDB and STACK, but in general I wouldn't advertise that.

Attilio


--=20
Peace can only be achieved by understanding - A. Einstein



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAJ-FndB7o_vjbJefz1Bxa%2B=DEVZDxBoGPdKcVr5vNHdu-pFEFA>