Date: Mon, 23 Aug 2010 03:34:12 -0700
From: Jeremy Chadwick
To: freebsd-stable@freebsd.org
Cc: phk@freebsd.org
Subject: Watchdog not being disabled while dumping core
Message-ID: <20100823103412.GA21044@icarus.home.lan>

It was brought to my attention that on FreeBSD with a hardware watchdog in use (e.g. ichwd(4) + watchdogd(8)), it's quite possible for the watchdog to fire (rebooting the system) after the kernel has panicked. This prevents a system with a hardware watchdog from reliably completing doadump().

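To make the race concrete, here is a toy timing model in Python. It is only a sketch, not kernel code: the function name, its parameters, and the numbers in the usage note are all hypothetical, and `disarm_on_panic` merely models the idea of the kernel turning the watchdog off on panic entry.

```python
def dump_completes(timeout_s, dump_s, since_last_pat_s=0.0, disarm_on_panic=False):
    """Return True if a crash dump finishes before the watchdog fires.

    timeout_s        -- hardware watchdog timeout, in seconds
    dump_s           -- time the dump needs, in seconds
    since_last_pat_s -- time already elapsed since the last pat at panic time
    disarm_on_panic  -- hypothetical: the kernel disables the watchdog on panic
    """
    if disarm_on_panic:
        # With the watchdog disarmed, nothing can interrupt the dump.
        return True
    # After a panic nobody pats the watchdog any more, so the dump only
    # has whatever is left of the current timeout window.
    remaining_s = timeout_s - since_last_pat_s
    return dump_s < remaining_s
```

With illustrative (made-up) values, `dump_completes(16, 120)` is False, since a 120 second dump cannot fit in a 16 second window, while `dump_completes(16, 120, disarm_on_panic=True)` is True.
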
There are confirmations of this problem dating all the way back to 2005:

PR kern/82219, opened in 2005:
http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/82219

PR bin/145183, opened in 2010 (not sure if this is the same):
http://www.freebsd.org/cgi/query-pr.cgi?pr=bin/145183

Confirmation that the problem still exists today (first paragraph):
http://lists.freebsd.org/pipermail/freebsd-stable/2010-August/058350.html

On Linux, it appears that they've worked around this problem by using what's called a "pretimeout" (basically a way to delay the watchdog so that it does not fire during important tasks):

http://www.mjmwired.net/kernel/Documentation/watchdog/watchdog-api.txt

According to watchdog(4), it looks like having the kernel set WD_PASSIVE immediately upon entering panic would solve the problem, but the BUGS section indicates WD_PASSIVE hasn't been implemented (returns ENOSYS).

Thoughts on solving this dilemma?

-- 
| Jeremy Chadwick                                   jdc@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |