From: Nathan Whitehorn
Date: Fri, 03 Jun 2011 10:28:03 -0500
To: Andriy Gapon
Cc: freebsd-current@FreeBSD.org, freebsd-stable@FreeBSD.org
Subject: Re: [poll / rfc] kdb_stop_cpus
Message-ID: <4DE8FD83.6030503@freebsd.org>
In-Reply-To: <4DE8FA2E.4030202@FreeBSD.org>

On 06/03/11 10:13, Andriy Gapon wrote:
>
> I wonder if anybody uses kdb_stop_cpus with non-default value.
> If, yes, I am very interested to learn about your usecase for it.
>
> I think that the default kdb behavior is the correct one, so it doesn't make sense
> to have a knob to turn on incorrect behavior.
> But I may be missing something obvious.
>
> The comment in the code doesn't really satisfy me:
> /*
>  * Flag indicating whether or not to IPI the other CPUs to stop them on
>  * entering the debugger. Sometimes, this will result in a deadlock as
>  * stop_cpus() waits for the other cpus to stop, so we allow it to be
>  * disabled. In order to maximize the chances of success, use a hard
>  * stop for that.
>  */
>
> The hard stop should be sufficiently mighty.
> Yes, I am aware of supposedly extremely rare situations where a deadlock could
> happen even when using hard stop. But I'd rather fix that than have this switch.
>
> Oh, the commit message (from 2004) explains it:
>> Add a new sysctl, debug.kdb.stop_cpus, which controls whether or not we
>> attempt to IPI other cpus when entering the debugger in order to stop
>> them while in the debugger. The default remains to issue the stop;
>> however, that can result in a hang if another cpu has interrupts disabled
>> and is spinning, since the IPI won't be received and the KDB will wait
>> indefinitely. We probably need to add a timeout, but this is a useful
>> stopgap in the mean time.
>
> But that was before we started using hard stop in this context (in 2009).

Some non-x86 platforms (e.g. PPC) don't support real NMIs, and so this
still applies.
-Nathan
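
For illustration only: the 2004 commit message quoted above mentions that
a timeout is probably needed. Below is a minimal userland sketch of that
idea in C. It is a single-threaded simulation, not kernel code. The names
struct sim_cpu, sim_deliver_stop_ipi() and sim_stop_cpus_timeout() are
hypothetical and do not exist in FreeBSD, and "a CPU spinning with
interrupts disabled" is modeled by a plain flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SIM_MAXCPU 8

/* Hypothetical per-CPU state for the simulation; not a FreeBSD structure. */
struct sim_cpu {
	bool intr_enabled;	/* can this CPU take a maskable stop IPI? */
	bool stopped;		/* has it acknowledged the stop request? */
};

/* Model IPI delivery: only a CPU with interrupts enabled ever reacts. */
static void
sim_deliver_stop_ipi(struct sim_cpu *cpu)
{
	if (cpu->intr_enabled)
		cpu->stopped = true;
}

/*
 * Ask every CPU except 'self' to stop, then poll for at most 'max_spins'
 * rounds instead of waiting forever.  Returns a bitmask of the CPUs that
 * never acknowledged; 0 means all of them stopped.
 */
uint32_t
sim_stop_cpus_timeout(struct sim_cpu *cpus, int ncpus, int self, int max_spins)
{
	uint32_t pending = 0;
	int i, spin;

	assert(ncpus > 0 && ncpus <= SIM_MAXCPU);
	assert(self >= 0 && self < ncpus);

	/* Send the stop request once to every other CPU. */
	for (i = 0; i < ncpus; i++) {
		if (i == self)
			continue;
		pending |= (uint32_t)1 << i;
		sim_deliver_stop_ipi(&cpus[i]);
	}

	/* Bounded wait: give up after max_spins polling rounds. */
	for (spin = 0; spin < max_spins && pending != 0; spin++) {
		for (i = 0; i < ncpus; i++) {
			if ((pending & ((uint32_t)1 << i)) != 0 &&
			    cpus[i].stopped)
				pending &= ~((uint32_t)1 << i);
		}
	}
	return (pending);
}
```

With four simulated CPUs where CPU 2 has interrupts disabled, a call from
CPU 0 returns with only bit 2 set rather than hanging, so the caller can
decide whether to enter the debugger anyway. On platforms with a true NMI
the unresponsive case should be much rarer; this sketch only covers the
maskable-IPI case discussed in this thread.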