Date: Mon, 9 Nov 2015 17:04:29 +1100 (EST)
From: Bruce Evans
cc: freebsd-bugs@freebsd.org
Subject: Re: [Bug 204376] Cavium ThunderX system heavily loaded while at db> prompt
Message-ID: <20151109162405.M969@besplex.bde.org>

On Mon, 9 Nov 2015 bugzilla-noreply@freebsd.org wrote:

> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=204376
>
> --- Comment #2 from Conrad E. Meyer ---
> If ARM is anything like amd64, it just spinwaits in IPI_STOP (waiting for
> the CPU to be re-enabled).
> On amd64 you could reduce it to 2 CPUs spinning pretty easily (hlt any
> non-panic and non-BSP core -- they'll never be needed until reboot).
> But that still leaves 2 CPUs spinning.
>
> The patch attempted to hlt all non-panic CPUs in IPI_STOP, but leave
> interrupts enabled so they could be woken again.  This does Not Work Well
> in panic context (I forget the details, but if you've paniced you really
> don't want normal interrupt code running on the non-ddb CPU(s)).

Enabling normal interrupts breaks ddb context too.

ddb is already broken in that it restarts the other CPUs when it single
steps.  This usually enables interrupts on the other CPUs (if not on the
current one), so the state might be completely different after you step a
single instruction.  The same can happen in normal operation for unlocked
state, but it is worse here: in normal operation the single instruction
runs in a few cycles, but for single stepping it takes thousands or
millions of cycles in real time (and the other CPUs run for many of those
cycles after they are restarted, before they are stopped again).  It is
inconvenient for the state that you are trying to debug to change much.

Bruce