From owner-freebsd-hackers@FreeBSD.ORG Wed Oct 17 14:27:59 2012 Return-Path: Delivered-To: freebsd-hackers@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 46515B04; Wed, 17 Oct 2012 14:27:59 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 5AC578FC08; Wed, 17 Oct 2012 14:27:57 +0000 (UTC) Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id RAA06640; Wed, 17 Oct 2012 17:25:58 +0300 (EEST) (envelope-from avg@FreeBSD.org) Message-ID: <507EBFF6.5080904@FreeBSD.org> Date: Wed, 17 Oct 2012 17:25:58 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:16.0) Gecko/20121012 Thunderbird/16.0.1 MIME-Version: 1.0 To: mdf@FreeBSD.org Subject: Re: _mtx_lock_spin: obsolete historic handling of kdb_active and panicstr? References: <507E9498.10905@FreeBSD.org> In-Reply-To: X-Enigmail-Version: 1.4.5 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Cc: freebsd-hackers , Bruce Evans X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 17 Oct 2012 14:27:59 -0000 on 17/10/2012 15:07 mdf@FreeBSD.org said the following: > On Wed, Oct 17, 2012 at 4:20 AM, Andriy Gapon wrote: >> >> _mtx_lock_spin has the following check in its retry loop: >> if (i < 60000000 || kdb_active || panicstr != NULL) >> DELAY(1); >> else >> _mtx_lock_spin_failed(m); >> > [snip analysis] >> >> So I'd like to propose to remove those checks altogether. Or perhaps to >> "reverse" them and immediately induce a (possibly secondary) panic if we ever >> get to that wait loop and kdb_active || panicstr != NULL. > > The panicstr can clearly be removed. I think there can be race > conditions with entering kdb and taking a spinlock, because the > spinlock acquire will block interrupts. I don't remember if we always > NMI for kdb enter or if that's configurable. The old code was clearer > (or maybe I'm just remembering an Isilon hack); looking at > stop_cpus_hard() I don't see that it uses an NMI. kdb always uses stop_cpus_hard and stop_cpus_hard always uses NMI on x86. >From sys/x86/x86/local_apic.c: if (vector == IPI_STOP_HARD) icrlo |= APIC_DELMODE_NMI | APIC_LEVEL_ASSERT; > So a CPU can block > interrupts, then if it sees kdb_active it will spin until we leave > kdb, rather than panic. Of course this would only be relevant if the > CPU it's trying to acquire is already held; otherwise it should find > the lock unowned and this isn't relevant. And if the lock is owned by > the thread entering kdb, that would be a real panic, not a recoverable > kdb entry. > > So I think maybe the kdb_active check is also not helpful after all. > > Cheers, > matthew > -- Andriy Gapon