From: Attilio Rao
To: Nathan Whitehorn
Cc: freebsd-current@freebsd.org, freebsd-stable@freebsd.org, Andriy Gapon
Date: Sat, 4 Jun 2011 18:33:59 -0400
Subject: Re: [poll / rfc] kdb_stop_cpus
In-Reply-To: <4DE8FD83.6030503@freebsd.org>
References: <4DE8FA2E.4030202@FreeBSD.org> <4DE8FD83.6030503@freebsd.org>
List-Id: Discussions about the use of FreeBSD-current

2011/6/3 Nathan Whitehorn:
> On 06/03/11 10:13, Andriy Gapon wrote:
>>
>> I wonder if anybody uses kdb_stop_cpus with non-default value.
>> If, yes, I am very interested to learn about your usecase for it.
>>
>> I think that the default kdb behavior is the correct one, so it doesn't
>> make sense to have a knob to turn on incorrect behavior.
>> But I may be missing something obvious.
>>
>> The comment in the code doesn't really satisfy me:
>> /*
>>  * Flag indicating whether or not to IPI the other CPUs to stop them on
>>  * entering the debugger.  Sometimes, this will result in a deadlock as
>>  * stop_cpus() waits for the other cpus to stop, so we allow it to be
>>  * disabled.  In order to maximize the chances of success, use a hard
>>  * stop for that.
>>  */
>>
>> The hard stop should be sufficiently mighty.
>> Yes, I am aware of supposedly extremely rare situations where a deadlock
>> could happen even when using hard stop.  But I'd rather fix that than
>> have this switch.
>>
>> Oh, the commit message (from 2004) explains it:
>>>
>>> Add a new sysctl, debug.kdb.stop_cpus, which controls whether or not we
>>> attempt to IPI other cpus when entering the debugger in order to stop
>>> them while in the debugger.  The default remains to issue the stop;
>>> however, that can result in a hang if another cpu has interrupts disabled
>>> and is spinning, since the IPI won't be received and the KDB will wait
>>> indefinitely.  We probably need to add a timeout, but this is a useful
>>> stopgap in the mean time.
>>
>> But that was before we started using hard stop in this context (in 2009).
>
> Some non-x86 platforms (e.g. PPC) don't support real NMIs, and so this still
> applies.

Well, if I get Andriy's proposal right, he just wants to remove the
option of not stopping the CPUs on entering KDB.
I'm not entirely sure why there is a sysctl for disabling that, and I
really don't want it.

Note that the lack of an NMI/privileged interrupt is not going to be a
factor in this request, unless you are very worried about the easy
deadlock that a normal stop operation may lead to.
If that is the case, I think that the upcoming work on skipping
locking during KDB/panic entry is going to help a lot.
At that point, removing the possibility of turning off CPU stopping
will be a good idea, IMHO.

Attilio

-- 
Peace can only be achieved by understanding - A. Einstein