Date: Wed, 17 Oct 2012 05:07:01 -0700 From: mdf@FreeBSD.org To: Andriy Gapon <avg@freebsd.org> Cc: freebsd-hackers <freebsd-hackers@freebsd.org>, Bruce Evans <brde@optusnet.com.au> Subject: Re: _mtx_lock_spin: obsolete historic handling of kdb_active and panicstr? Message-ID: <CAMBSHm_P0Af3CemFo0X-_HgJNdndKDRD9Fav1yQwh=8T35rWdg@mail.gmail.com> In-Reply-To: <507E9498.10905@FreeBSD.org> References: <507E9498.10905@FreeBSD.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On Wed, Oct 17, 2012 at 4:20 AM, Andriy Gapon <avg@freebsd.org> wrote: > > _mtx_lock_spin has the following check in its retry loop: > if (i < 60000000 || kdb_active || panicstr != NULL) > DELAY(1); > else > _mtx_lock_spin_failed(m); > [snip analysis] > > So I'd like to propose to remove those checks altogether. Or perhaps to > "reverse" them and immediately induce a (possibly secondary) panic if we ever > get to that wait loop and kdb_active || panicstr != NULL. The panicstr can clearly be removed. I think there can be race conditions with entering kdb and taking a spinlock, because the spinlock acquire will block interrupts. I don't remember if we always NMI for kdb enter or if that's configurable. The old code was clearer (or maybe I'm just remembering an Isilon hack); looking at stop_cpus_hard() I don't see that it uses an NMI. So a CPU can block interrupts, then if it sees kdb_active it will spin until we leave kdb, rather than panic. Of course this would only be relevant if the CPU it's trying to acquire is already held; otherwise it should find the lock unowned and this isn't relevant. And if the lock is owned by the thread entering kdb, that would be a real panic, not a recoverable kdb entry. So I think maybe the kdb_active check is also not helpful after all. Cheers, matthew
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAMBSHm_P0Af3CemFo0X-_HgJNdndKDRD9Fav1yQwh=8T35rWdg>