From owner-svn-src-head@freebsd.org Wed Oct 5 00:49:46 2016 Return-Path: Delivered-To: svn-src-head@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 6E3BCAF6828; Wed, 5 Oct 2016 00:49:46 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail106.syd.optusnet.com.au (mail106.syd.optusnet.com.au [211.29.132.42]) by mx1.freebsd.org (Postfix) with ESMTP id CEE0E855; Wed, 5 Oct 2016 00:49:45 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from c122-106-163-102.carlnfd1.nsw.optusnet.com.au (c122-106-163-102.carlnfd1.nsw.optusnet.com.au [122.106.163.102]) by mail106.syd.optusnet.com.au (Postfix) with ESMTPS id 9871F3C254B; Wed, 5 Oct 2016 11:19:17 +1100 (AEDT) Date: Wed, 5 Oct 2016 11:19:10 +1100 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Gleb Smirnoff cc: Eric van Gyzen , src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: Re: svn commit: r306346 - head/sys/kern In-Reply-To: <20161004205600.GN23123@FreeBSD.org> Message-ID: <20161005101932.U984@besplex.bde.org> References: <201609261530.u8QFUUZd020174@repo.freebsd.org> <20161004205600.GN23123@FreeBSD.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.1 cv=CoZCCSMD c=1 sm=1 tr=0 a=IXAyHK3mFcy+1kvmsno0Fw==:117 a=L9H7d07YOLsA:10 a=9cW_t1CCXrUA:10 a=s5jvgZ67dGcA:10 a=kj9zAlcOel0A:10 a=U5UPkUwmBUP_buS78hMA:9 a=CjuIK1q_8ugA:10 a=chvjmp5bT-K0Np4W8Gpx:22 X-BeenThere: svn-src-head@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for the src tree for head/-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 05 Oct 2016 00:49:46 -0000 On Tue, 4 Oct 2016, Gleb Smirnoff wrote: > On Mon, Sep 26, 2016 at 03:30:30PM +0000, Eric van Gyzen wrote: > E> ... > E> Modified: head/sys/kern/kern_mutex.c > E> ============================================================================== > E> --- head/sys/kern/kern_mutex.c Mon Sep 26 15:03:31 2016 (r306345) > E> +++ head/sys/kern/kern_mutex.c Mon Sep 26 15:30:30 2016 (r306346) > E> @@ -924,7 +924,7 @@ __mtx_assert(const volatile uintptr_t *c > E> { > E> const struct mtx *m; > E> > E> - if (panicstr != NULL || dumping) > E> + if (panicstr != NULL || dumping || SCHEDULER_STOPPED()) > E> return; > > I wonder if all this disjunct can be reduced just to SCHEDULER_STOPPED()? > Positive panicstr and dumping imply scheduler stopped. 'dumping' doesn't imply SCHEDULER_STOPPED(). Checking 'dumping' here seems to be just an old bug. It just breaks __mtx_assert(), while all other mutex operations work normally for dumping without panicing. kern doesn't have this bug anywhere else. It just has style bugs for most references to 'dumping': X kern_mutex.c: * re-enable interrupts while dumping core. This is under another recent fix involving SCHEDULER_STOPPED(). X kern_mutex.c: if (panicstr != NULL || dumping || SCHEDULER_STOPPED()) Broken. X kern_shutdown.c:int dumping; /* system is dumping */ Banal comment. X kern_shutdown.c: if (dumping) X kern_shutdown.c: dumping++; X kern_shutdown.c: dumping--; Obfuscation of a boolean by manually optimizing its setting for PDP-11. X kern_shutdown.c: if ((howto & (RB_HALT|RB_DUMP)) == RB_DUMP && !cold && !dumping) Missing spaced around binary operator. X sched_4bsd.c: if (panicstr != NULL || pri >= cpri || cold /* || dumping */ || Here the bogus test for dumping is commented out. This is in maybe_preempt(). There is a TD_IS_INHIBITED() check. sched_ule.c has similar code without the commented-out 'dumping'. Neither checks SCHEDULER_STOPPED(). It is certainly useless to preempt if SCHEDULER_STOPPED(), but perhaps checking it is unnecessary. I don't like the design or implementation of SCHEDULER_STOPPED(). It is a hack to specially break mutexes while panicing. Panicing stops the scheduler and tries to stop all other CPUs (fixed in my version to either actually stop them all, with special stopping for NMI handlers, or hang waiting) and we set the flag curthread->td_stopsched to indicate that the scheduler is specially stopped for panic. Bugs in the implementation of this include: - 2 bytes are wasted in struct thread to hold the flag. This is a dubious obfuscation of an old version that used a global flag - despite this optimization, all mutex operations should be slowed down by testing this flag - however, the inlined mutex operations don't test this flag. This gives inconsistencies. __mtx_assert() needed the fix in this commit to not detect these inconsistencies Bugs in the design of this include: - SCHEDULER_STOPPED() doesn't really mean that the scheduler has stopped. It means that we are panicing and have (tried to) stop other CPUs and want to forcer all mutex operations to silently succeed without keeping the mutex state consistent. This is fragile. Nothing can depend on mutexes working or mutex assertions finding inconsistencies or on mtx_owned() working, so the SCHEDULER_STOPPED() check must be done in more than central mutex code. It is confusing that this condition doesn't mean that the scheduler is stopped. It means that a certain state in panicing has been reached. Bruce