Date: Wed, 11 Feb 2004 16:05:30 +1100
From: Peter Jeremy
To: freebsd-stable@freebsd.org
Subject: Locating a possible interrupt storm
Message-ID: <20040211050530.GU20549@gsmx07.alcatel.com.au>

I have a machine that appears to be suffering from a severe interrupt problem: it becomes non-responsive for all practical purposes. Entering 'Ctrl-Alt-Esc' eventually results in a very slow dribble of characters, with the 'db> ' prompt being reached after about half an hour. Characters typed at the prompt had still not been echoed after about 2 hours (at which point the machine was reset). This is a 2.4GHz Xeon CPU so I would expect a slightly zippier response :-).

My working hypothesis is that an interrupt storm is preventing the system from doing anything other than acknowledging interrupts. Does this sound reasonable? If not, can anyone suggest any alternate hypotheses?

In either case, does anyone have any ideas on how to get a crashdump when there's (effectively) no response to DDB?

Is enabling the logical CPUs (machdep.cpu_idle_hlt=0) likely to have any effect?

The machine is running 4.9p1. It was previously running 4.8p7 and this problem was not noticed - though it's not clear that it wasn't there.

-- 
Peter Jeremy