Date: Sun, 10 Apr 2005 08:03:32 +1000
From: Peter Jeremy
To: Ash
Cc: freebsd-stable@freebsd.org
Subject: Re: 5.4-RC1 Freezing, but pingable (may be related to gvinum)
Message-ID: <20050409220331.GF89047@cirb503493.alcatel.com.au>
In-Reply-To: <4258324D.8070405@speakeasy.net>

On Sat, 2005-Apr-09 14:51:41 -0500, Ash wrote:
>By "hang" I mean that the machine stops responding to console keystrokes
>(serial or otherwise) while existing ssh and nfs connections stop
>responding, but do not immediately close. The machine continues to
>respond to ICMP pings.
>I can not make new SSH connections and my
>attempts eventually timeout rather than giving me a connection refused
>error.

This is consistent with the kernel continuing to run normally but
being unable to schedule userland processes - usually due to a deadlock.
Do the caps-lock, num-lock and scroll-lock keys on a local keyboard
still toggle the relevant LEDs?

Assuming the LEDs toggle: Do you have "options DDB" and "options KDB"
in your kernel? If so, can you break into DDB? (If not, I think you'll
need to build a kernel with DDB.)

Once the system has hung, you need to enter DDB and run 'ps'. The
output from that will (hopefully) give an indication as to what is
going wrong (and where to look next). If you've built the kernel with
debugging symbols and have a dump device enabled, "call doadump()"
should also generate a crashdump, which will be much easier to examine.

Peter
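
PS: In case it helps, here is a rough sketch of the kernel config
additions I have in mind. It assumes you are working from a copy of
GENERIC on 5.x; the BREAK_TO_DEBUGGER line is my assumption and only
matters if you want to get into DDB from the serial console. Check
each option against the NOTES files for your source tree before
relying on it.

```
# Sketch only: additions to a copy of GENERIC (verify against NOTES)
makeoptions	DEBUG=-g		# build the kernel with debugging symbols
options 	KDB			# kernel debugger framework
options 	DDB			# interactive in-kernel debugger
options 	BREAK_TO_DEBUGGER	# assumption: serial console BREAK enters the debugger
```

With that kernel running, Ctrl-Alt-Esc on a syscons console should
drop you to the "db>" prompt, where you can type 'ps' and then
"call doadump()". The crashdump only works if a dump device was
configured before the hang (normally via dumpdev in /etc/rc.conf).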