Date:      Fri, 03 Jun 2011 10:28:03 -0500
From:      Nathan Whitehorn <nwhitehorn@freebsd.org>
To:        Andriy Gapon <avg@FreeBSD.org>
Cc:        freebsd-current@FreeBSD.org, freebsd-stable@FreeBSD.org
Subject:   Re: [poll / rfc] kdb_stop_cpus
Message-ID:  <4DE8FD83.6030503@freebsd.org>
In-Reply-To: <4DE8FA2E.4030202@FreeBSD.org>
References:  <4DE8FA2E.4030202@FreeBSD.org>

On 06/03/11 10:13, Andriy Gapon wrote:
>
> I wonder if anybody uses kdb_stop_cpus with a non-default value.
> If yes, I am very interested to learn about your use case for it.
>
> I think that the default kdb behavior is the correct one, so it doesn't make sense
> to have a knob to turn on incorrect behavior.
> But I may be missing something obvious.
>
> The comment in the code doesn't really satisfy me:
> /*
>   * Flag indicating whether or not to IPI the other CPUs to stop them on
>   * entering the debugger.  Sometimes, this will result in a deadlock as
>   * stop_cpus() waits for the other cpus to stop, so we allow it to be
>   * disabled.  In order to maximize the chances of success, use a hard
>   * stop for that.
>   */
>
> The hard stop should be sufficiently mighty.
> Yes, I am aware of supposedly extremely rare situations where a deadlock could
> happen even when using hard stop.  But I'd rather fix that than have this switch.
>
> Oh, the commit message (from 2004) explains it:
>> Add a new sysctl, debug.kdb.stop_cpus, which controls whether or not we
>> attempt to IPI other cpus when entering the debugger in order to stop
>> them while in the debugger.  The default remains to issue the stop;
>> however, that can result in a hang if another cpu has interrupts disabled
>> and is spinning, since the IPI won't be received and the KDB will wait
>> indefinitely.  We probably need to add a timeout, but this is a useful
>> stopgap in the mean time.
>
> But that was before we started using hard stop in this context (in 2009).

Some non-x86 platforms (e.g. PPC) don't support real NMIs, so the hang
described in that commit message can still happen there.
-Nathan
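
The quoted 2004 commit message suggests that a timeout is probably needed so the debugger does not wait indefinitely for a CPU that never receives the stop IPI. As a rough illustration of that idea only, here is a minimal sketch in C. It is not the kdb or stop_cpus() code: the function name wait_for_stop, the acknowledgement bitmask, and the spin budget are all hypothetical names introduced for this example.

```c
#include <stdint.h>

/*
 * Hypothetical model, not FreeBSD kernel code.
 * Bit i of *stopped_mask is assumed to be set by CPU i once it has
 * acknowledged the stop request.  Instead of spinning forever, the
 * caller gives up after max_spins polls.
 *
 * Returns 0 if every CPU in 'wanted' acknowledged in time, otherwise
 * the mask of CPUs that did not (e.g. ones spinning with interrupts
 * disabled that never saw the IPI).
 */
static uint32_t
wait_for_stop(const volatile uint32_t *stopped_mask, uint32_t wanted,
    unsigned long max_spins)
{
	unsigned long spins;

	for (spins = 0; spins < max_spins; spins++) {
		if ((*stopped_mask & wanted) == wanted)
			return (0);
	}
	/* Timed out: report which CPUs are still running. */
	return (wanted & ~*stopped_mask);
}
```

A caller in this model could enter the debugger anyway when the return value is non-zero, and report the CPUs that were not stopped, rather than hanging.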
