From: "Robert N. M. Watson"
To: Andriy Gapon
Cc: freebsd-current@FreeBSD.org, freebsd-stable@FreeBSD.org
Subject: Re: [poll / rfc] kdb_stop_cpus
Date: Fri, 3 Jun 2011 18:57:23 +0100
Message-Id: <5E4D0F56-4338-4157-8BC6-17EE2831725F@FreeBSD.org>
In-Reply-To: <4DE8FA2E.4030202@FreeBSD.org>

On 3 Jun 2011, at 16:13, Andriy Gapon wrote:

> I wonder if anybody uses kdb_stop_cpus with non-default value.
> If, yes, I am very interested to learn about your usecase for it.

The issue that prompted the sysctl was non-NMI IPIs being used to enter
the debugger or reboot following a core hanging with interrupts
disabled. With the switch to NMI IPIs in some of those circumstances,
life is better -- at least, on hardware that supports non-maskable IPIs.
I seem to recall sparc64 doesn't, however? Not sure about MIPS, etc.
Attilio has since significantly improved our shutdown behaviour --
initially, the switch to NMI IPIs broke other things (because certain
IPIs then improperly preempted threads holding spinlocks), but that
pretty much all seems worked out now.

Robert

>
> I think that the default kdb behavior is the correct one, so it doesn't
> make sense to have a knob to turn on incorrect behavior.
> But I may be missing something obvious.
>
> The comment in the code doesn't really satisfy me:
> /*
>  * Flag indicating whether or not to IPI the other CPUs to stop them on
>  * entering the debugger. Sometimes, this will result in a deadlock as
>  * stop_cpus() waits for the other cpus to stop, so we allow it to be
>  * disabled. In order to maximize the chances of success, use a hard
>  * stop for that.
>  */
>
> The hard stop should be sufficiently mighty.
> Yes, I am aware of supposedly extremely rare situations where a deadlock
> could happen even when using hard stop. But I'd rather fix that than
> have this switch.
>
> Oh, the commit message (from 2004) explains it:
>> Add a new sysctl, debug.kdb.stop_cpus, which controls whether or not we
>> attempt to IPI other cpus when entering the debugger in order to stop
>> them while in the debugger. The default remains to issue the stop;
>> however, that can result in a hang if another cpu has interrupts disabled
>> and is spinning, since the IPI won't be received and the KDB will wait
>> indefinitely. We probably need to add a timeout, but this is a useful
>> stopgap in the mean time.
>
> But that was before we started using hard stop in this context (in 2009).
>
> --
> Andriy Gapon
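
The 2004 commit message quoted above mentions that a timeout is probably
needed when waiting for other CPUs to stop. As a rough illustration only,
here is a user-space toy model of that idea. It is not FreeBSD kernel
code: cpu_model, deliver_stop(), stop_cpus_bounded() and max_polls are
all hypothetical names made up for this sketch, and the "IPI delivery" is
just a flag update. It assumes a maskable stop request is lost on a CPU
that has interrupts disabled, while an NMI-style request is not.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one remote CPU; not a kernel structure. */
struct cpu_model {
	bool masks_interrupts;	/* spinning with interrupts disabled */
	bool stopped;		/* has acknowledged the stop request */
};

/* Count the CPUs that have not yet acknowledged the stop request. */
static int
count_running(const struct cpu_model *cpus, int ncpus)
{
	int running = 0;

	for (int i = 0; i < ncpus; i++)
		if (!cpus[i].stopped)
			running++;
	return (running);
}

/*
 * Model delivery of a stop request. A maskable request never reaches a
 * CPU that has interrupts disabled; an NMI-style request always does.
 */
static void
deliver_stop(struct cpu_model *cpus, int ncpus, bool use_nmi)
{
	for (int i = 0; i < ncpus; i++)
		if (use_nmi || !cpus[i].masks_interrupts)
			cpus[i].stopped = true;
}

/*
 * Send the stop request, then poll at most max_polls times instead of
 * waiting indefinitely. Returns the number of CPUs still running when
 * the wait ended (0 means every CPU stopped).
 */
static int
stop_cpus_bounded(struct cpu_model *cpus, int ncpus, bool use_nmi,
    unsigned max_polls)
{
	deliver_stop(cpus, ncpus, use_nmi);
	for (unsigned poll = 0; poll < max_polls; poll++) {
		if (count_running(cpus, ncpus) == 0)
			return (0);
		/* A real implementation would pause or spin-wait here. */
	}
	return (count_running(cpus, ncpus));
}
```

In this model, a maskable request with one CPU spinning with interrupts
disabled returns 1 after the poll budget is used up rather than hanging,
and the caller could then decide whether to enter the debugger anyway.
With use_nmi set, the same set of CPUs all stop and the call returns 0.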