Date:      Fri, 2 Dec 2011 21:20:47 +0100
From:      Attilio Rao <attilio@freebsd.org>
To:        John Baldwin <jhb@freebsd.org>
Cc:        freebsd-current@freebsd.org, Konstantin Belousov <kib@freebsd.org>, Andriy Gapon <avg@freebsd.org>
Subject:   Re: Stop scheduler on panic
Message-ID:  <CAJ-FndCsY9VCLaBuG_Ng6M3yHMyKz-3rgd%2ByexdpGL8JBOHhpQ@mail.gmail.com>
In-Reply-To: <4ED91B8D.2080808@FreeBSD.org>
References:  <20111113083215.GV50300@deviant.kiev.zoral.com.ua> <201112011349.50502.jhb@freebsd.org> <4ED7E6B0.30400@FreeBSD.org> <201112011553.34432.jhb@freebsd.org> <4ED7F4BC.3080206@FreeBSD.org> <4ED855E6.20207@FreeBSD.org> <4ED8A306.9020801@FreeBSD.org> <4ED8F1C1.7010206@FreeBSD.org> <CAJ-FndCBXXGG%2BihS_rVfM5TqcopHABg80U0my9PxguYY8QBD=Q@mail.gmail.com> <4ED91B8D.2080808@FreeBSD.org>

2011/12/2 John Baldwin <jhb@freebsd.org>:
> On 12/2/11 12:18 PM, Attilio Rao wrote:
>>
>> 2011/12/2 John Baldwin<jhb@freebsd.org>:
>>>
>>> On 12/2/11 5:05 AM, Andriy Gapon wrote:
>>>>
>>>>
>>>> on 02/12/2011 06:36 John Baldwin said the following:
>>>>>
>>>>>
>>>>> Ah, ok (I had thought SCHEDULER_STOPPED was going to always be true
>>>>> when kdb was active).  But I think these two changes should cover
>>>>> critical_exit() ok.
>>>>>
>>>>
>>>> I attempted to start a discussion about this a few times already :-)
>>>> Should we treat kdb context the same as SCHEDULER_STOPPED context (in
>>>> the current definition)?  That is, skip all locks in the same fashion?
>>>> There are pros and contras.
>>>
>>>
>>>
>>> kdb should not block on locks, no.  Most debugger commands should not go
>>> near locks anyway unless they are intended to carefully modify the
>>> existing system in a safe manner (such as the 'kill' command which should
>>> only be using try locks and fail if it cannot safely post the signal).
>>
>>
>> The biggest problem with treating KDB the same as panic is that doing
>> a proper 'continue' becomes impossible.
>> One of the properties of the 'skip-locking' path is that it doesn't
>> take into account fast locking paths, where the lock sometimes succeeds
>> and sometimes fails without you knowing about it. Also, the restarted
>> CPUs can find corrupted data (as it can be arbitrarily updated); I'm
>> sure it is too panic-prone.
>
>
> Yes, my thought is that kdb commands, etc. should be using dedicated
> routines that do not use locks whenever possible.  The problem of a user
> calling an arbitrary routine is not solvable (so I don't think we should try
> to solve that, you use 'call' at your own risk), but built-in commands
> should explicitly either 1) not use locking, or 2) only use try locks and
> fail out cleanly (including dropping any try locks acquired) if a try fails.
> Now, that's an ideal view, I don't know how close we are to that in
> practice or if it is a realistically attainable goal.

So you are not in favor of giving KDB its own context?
There are some drawbacks (for example, bugs involving the scheduler or
the switching mechanism, though for those we could provide a facility
like KDB_LITE for debugging scheduler problems), but in general that
would avoid duplicating code just to avoid the locking.

If you don't want to give KDB its own context, we should work on a KPI
(or similar) that defines which routines may serve as KDB commands,
tries to keep things under control, etc.
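
To make the "try locks and fail out cleanly" rule concrete, here is a
rough userland sketch of what a helper layer in such a KPI might look
like. Everything in it is an assumption made for illustration:
kdb_cmd_ctx, kdb_try_lock(), kdb_unlock_all() and kdb_cmd_kill_sketch()
are hypothetical names, not existing kernel interfaces, and POSIX
mutexes stand in for kernel locks so that the example can be compiled
and run outside the kernel.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; none of this is an existing KPI. */
#define	KDB_MAX_HELD	4

struct kdb_cmd_ctx {
	pthread_mutex_t	*held[KDB_MAX_HELD];	/* try locks acquired so far */
	size_t		 nheld;
};

/* Take a lock without ever blocking; remember it so it can be dropped. */
static bool
kdb_try_lock(struct kdb_cmd_ctx *ctx, pthread_mutex_t *m)
{

	if (ctx->nheld == KDB_MAX_HELD)
		return (false);
	if (pthread_mutex_trylock(m) != 0)
		return (false);		/* contended: the caller must fail out */
	ctx->held[ctx->nheld++] = m;
	return (true);
}

/* Drop every lock the command acquired, most recent first. */
static void
kdb_unlock_all(struct kdb_cmd_ctx *ctx)
{
	int error;

	while (ctx->nheld > 0) {
		error = pthread_mutex_unlock(ctx->held[--ctx->nheld]);
		assert(error == 0);
		(void)error;
	}
}

/*
 * A 'kill'-like built-in command that needs two locks.  It posts the
 * signal only if both try locks succeed; otherwise it changes nothing.
 * Returns 0 on success and -1 if it had to bail out.
 */
int
kdb_cmd_kill_sketch(pthread_mutex_t *proc_lock, pthread_mutex_t *sig_lock,
    int *pending_signal, int sig)
{
	struct kdb_cmd_ctx ctx = { .nheld = 0 };
	int ret = -1;

	if (kdb_try_lock(&ctx, proc_lock) && kdb_try_lock(&ctx, sig_lock)) {
		*pending_signal = sig;
		ret = 0;
	}
	kdb_unlock_all(&ctx);	/* runs on both the success and failure path */
	return (ret);
}
```

The point of routing every acquisition through one small context is
that the cleanup on failure cannot be forgotten: a command either gets
all of its locks or leaves the system exactly as it found it.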

Attilio


-- 
Peace can only be achieved by understanding - A. Einstein


