From: Peter Steele
To: freebsd-questions@freebsd.org
Date: Mon, 23 Feb 2009 13:11:38 -0800 (PST)
Subject: What is correct way to enable watchdog?

We have our systems configured with the watchdog enabled, with /etc/rc.d/watchdogd defined as

    . /etc/rc.subr

    name="watchdogd"
    rcvar="`set_rcvar`"
    command="/usr/sbin/${name}"
    command_args="-s 10 -t 300"
    pidfile="/var/run/${name}.pid"

    load_rc_config $name
    run_rc_command "$1"

We assumed this would give us a watchdog timeout of 300 seconds (5 minutes), meaning a system would not reboot unless it had been unresponsive for five minutes. However, in a recent stress test we had unexplained spontaneous reboots on two systems, with no logs of any kind to indicate why the systems rebooted. Is there something wrong with how we have configured the watchdog?
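
For reference, here is a minimal sketch of how the two flags are assumed to interact. It assumes that -s is the interval in seconds between pats and -t is the timeout in seconds, as described in watchdogd(8). The function names watchdog_cmdline and pats_per_timeout are hypothetical helpers made up for illustration and are not part of FreeBSD.

```shell
#!/bin/sh
# Hypothetical illustration only: reconstructs the command line the rc
# script above is expected to run and the ratio between its two intervals.

name="watchdogd"
command="/usr/sbin/${name}"
sleep_secs=10     # value passed to -s in the script above
timeout_secs=300  # value passed to -t in the script above

watchdog_cmdline() {
    # Print the daemon invocation built from the variables above.
    printf '%s -s %s -t %s\n' "$command" "$sleep_secs" "$timeout_secs"
}

pats_per_timeout() {
    # Integer division: how many pat intervals fit in one timeout window.
    echo $(( timeout_secs / sleep_secs ))
}

watchdog_cmdline
pats_per_timeout
```

Under these assumptions the daemon gets about 30 opportunities to pat the watchdog before the requested timeout expires. The sketch says nothing about whether the kernel or the watchdog hardware actually honors a 300 second timeout.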