From: Robert Watson
Date: Fri, 13 Aug 2004 18:29:42 -0400 (EDT)
To: Lukas Ertl
cc: freebsd-current@FreeBSD.org
cc: Martin Blapp
Subject: Re: Deadlocks with recent SMP current
In-Reply-To: <20040813215227.F730@korben.in.tern>
List-Id: Discussions about the use of FreeBSD-current

On Fri, 13 Aug 2004, Lukas Ertl wrote:

> On Fri, 13 Aug 2004, Robert Watson wrote:
>
> > will eventually get a solid hang.  I tried it on a new SMP box with an NMI
> > button I received yesterday but was unable to get into the debugger.  I'm
> > in the process of de-obfuscating the NMI path to increase the chances of
> > successfully getting into the debugger and then I'll try again to see what
> > I can figure out.
>
> I have an NMI-enabled SMP box too, and the only message I got when
> sending an NMI to the deadlocked system was 'kernel trap 12 with
> interrupts disabled', but nothing more.
>
> If you want me to test something, feel free to shout out.

I'm having trouble making headway with this box, as the NMI doesn't seem
to be delivering in this hang (er, ouch).  You might want to try putting a
kdb_enter() just after the T_NMI in both switch statements in trap() in
i386/i386/trap.c.  This will cause the kernel to enter the debugger before
digging into the more general NMI code, which generates log messages, etc.,
that may increase the chances of a problem.

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert@fledge.watson.org      Principal Research Scientist, McAfee Research
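
For anyone wanting to picture the kdb_enter() suggestion above, here is a rough user-space mock of the idea, not the actual trap.c code. Everything except the names T_NMI, T_PAGEFLT, trap() and kdb_enter() is hypothetical: mock_trap(), generic_nmi_path(), record() and event_log are stand-ins invented for illustration, the trap numbers are what I believe the i386 headers use, and the single-argument kdb_enter() is an assumption about the interface at the time. The point is only the ordering: the debugger entry comes first in the T_NMI case, before the general NMI handling runs.

```c
/* User-space mock of the suggested change; this is NOT kernel code. */
#include <assert.h>
#include <string.h>

#define T_PAGEFLT 12   /* page fault; assumed to match the i386 headers */
#define T_NMI     19   /* non-maskable interrupt; assumed to match the i386 headers */

/* Hypothetical log so the order of events can be inspected. */
static char event_log[128];

static void record(const char *what)
{
    assert(what != NULL);
    strncat(event_log, what, sizeof(event_log) - strlen(event_log) - 1);
    strncat(event_log, ";", sizeof(event_log) - strlen(event_log) - 1);
}

/* Stand-in for the kernel's kdb_enter(); the real one stops in the debugger. */
static void kdb_enter(const char *msg)
{
    (void)msg;
    record("kdb");
}

/* Stand-in for the more general NMI code (log messages and so on). */
static void generic_nmi_path(void)
{
    record("nmi");
}

/* Hypothetical simplification of one of the switch statements in trap(). */
static void mock_trap(int type)
{
    switch (type) {
    case T_NMI:
        kdb_enter("NMI");     /* added line: enter the debugger first */
        generic_nmi_path();   /* existing handling follows afterwards */
        break;
    default:
        record("other");      /* all other trap types are left unchanged */
        break;
    }
}
```

Calling mock_trap(T_NMI) records the debugger entry ahead of the general NMI path. In the real kernel the same one-line addition would go into the T_NMI case of both switch statements in trap().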