Date: Sun, 12 Nov 2017 18:16:11 +1100 (EST)
From: Bruce Evans
To: Bruce Evans
cc: Mateusz Guzik, src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: Re: svn commit: r325734 - head/sys/amd64/amd64
In-Reply-To: <20171112151145.Q1144@besplex.bde.org>
Message-ID: <20171112175214.X1521@besplex.bde.org>

On Sun, 12 Nov 2017, Bruce Evans wrote:

> On Sun, 12 Nov 2017, Mateusz Guzik wrote:
>
>> Log:
>>   amd64: stop nesting preemption counter in spinlock_enter
>>
>>   Discussed with: jhb
>
> This seems to break it.  i386 still has the old code.
> ...
> The main broken case is:
> - both levels initially 0
> - disable interrupts
> - raise spinlock count to 1
> - bad race window until critical_enter() is called.  Disabling hardware
>   interrupts doesn't prevent exceptions like debugger traps or NMIs.
>   Debugger trap handlers shouldn't use critical sections or (spin)
>   mutexes, but NMI handlers might.  When an exception handler calls
>   spinlock_enter(), this no longer gives a critical section and bad
>   things like context switches occur if the handler calls
>   critical_enter()/critical_exit().
> ...
> I don't like this change.  The nested counting is easier to understand,
> and the nested case is very rare and the critical section part of it is
> very efficient (then critical_exit() is just lowering the level).  Old
> ...
> I think the nested case is only for recursive spinlocks.  So NMI handlers
> should not use any spinlocks and the new bug is small (NMI handlers should
> not use non-recursive spinlocks since they would deadlock, and should not
> use recursive spinlocks since they don't work).  NMI handlers aren't that
> careful.  They call printf(), and even the message buffer has been broken
> to use non-recursive spinlocks.

Actually, it is valid for NMI handlers to use spinlocks via
mtx_trylock_spin() in the non-kdb non-panic case, and that is what my
fixes for printf() do.

I had confused "nesting preemption counter" (td_critnest) with interrupt
nesting (the bogus td_intr_nesting_level).  td_critnest was incremented
for every concurrently held spinlock, so it could grow quite large without
any interrupt/exception recursion.  So the micro-optimization of
td_critnest is useful if it is faster and fixed to work.

Bruce
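
The race window described above can be modelled in user space. The following is only a rough sketch, not the kernel code: struct model_thread, model_enter(), model_exit(), model_nmi() and critnest_seen_by_nmi() are hypothetical names invented for the illustration, the two integer fields merely stand in for td_critnest and the per-thread spinlock count, and the "NMI" is just a function call placed between raising the spinlock count and raising the critical-section level.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two per-thread counters discussed above. */
struct model_thread {
	int critnest;		/* models td_critnest */
	int spinlock_count;	/* models the per-thread spinlock count */
};

/* Critical-section level the simulated handler saw while holding its lock. */
static int seen_in_handler;

static void model_enter(struct model_thread *td, bool nested, bool fire_nmi);
static void model_exit(struct model_thread *td, bool nested);

/* Simulated NMI handler: takes and drops one spinlock, recording critnest. */
static void
model_nmi(struct model_thread *td, bool nested)
{
	model_enter(td, nested, false);
	seen_in_handler = td->critnest;
	model_exit(td, nested);
}

/*
 * nested == true:  every enter also raises critnest (the older scheme).
 * nested == false: only the outermost enter raises critnest.
 */
static void
model_enter(struct model_thread *td, bool nested, bool fire_nmi)
{
	bool outermost = (td->spinlock_count == 0);

	/* Interrupts would be disabled here; NMIs and traps still arrive. */
	td->spinlock_count++;
	if (fire_nmi)
		model_nmi(td, nested);	/* the race window */
	if (nested || outermost)
		td->critnest++;		/* models critical_enter() */
}

static void
model_exit(struct model_thread *td, bool nested)
{
	td->spinlock_count--;
	if (nested || td->spinlock_count == 0)
		td->critnest--;		/* models critical_exit() */
}

/* Returns the critnest value the handler observed inside the window. */
int
critnest_seen_by_nmi(bool nested)
{
	struct model_thread td = { 0, 0 };

	model_enter(&td, nested, true);
	model_exit(&td, nested);
	/* Both counters must be balanced again in either scheme. */
	assert(td.critnest == 0 && td.spinlock_count == 0);
	return (seen_in_handler);
}
```

In this model critnest_seen_by_nmi(true) returns 1 and critnest_seen_by_nmi(false) returns 0. In the second case the handler holds its spinlock with the level still at 0, so a critical_enter()/critical_exit() pair inside the handler would drop the level back to 0 and could permit a context switch.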