From owner-freebsd-hackers@FreeBSD.ORG Wed Sep 2 12:44:44 2009
From: John Baldwin
To: freebsd-hackers@freebsd.org
Cc: Peter Much
Date: Wed, 2 Sep 2009 08:35:47 -0400
Message-Id: <200909020835.47358.jhb@freebsd.org>
Subject: Re: crashdump "watchdog timeout" - Howto get useful information?

On Tuesday 01 September 2009 1:45:44 pm Peter Much wrote:
>
> Dear all,
>
> could anybody share some insight (or pointers to docs) on how to
> approach an analysis of a "watchdog timeout" crashdump?
>
> I hopefully have the necessities in place (that is, I can load
> the dump into ddd and actually see things).
>
> But I have no real idea about where to start looking for interesting
> things - some structure from where to unroll what the system was
> doing (or not doing).
> The "developers handbook" mainly explains about figuring the cause
> of the crash - but in my case this is obvious, it is the watchdog I
> have configured.
>
> Since this is a reproducible issue, ideas on things that could be
> configured beforehand could also be useful.

I would examine the state of the processes in the system first.  If all
the CPUs are idle but some threads are blocked on locks, you might have
a deadlock, etc.  You can use the gdb scripts at
http://www.FreeBSD.org/~jhb/gdb/ in kgdb to figure some of that stuff
out (source gdb6 from within gdb; I usually start with 'ps').

-- 
John Baldwin