To: freebsd-stable@freebsd.org
From: Mark Atkinson
Subject: Re: watchdogs
Date: Fri, 22 Feb 2013 08:47:20 -0800
In-Reply-To: <512525C1.1070502@norma.perm.ru>
List-Id: Production branch of FreeBSD source code

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 02/20/2013 11:36, Eugene M. Zheganin wrote:
> Hi.
>
> I have a bunch of FreeBSDs that hang (and I really want to do
> something to fight this).
> Maybe it's zfs or maybe it's pf
> (I also have a bunch of really stable ones, so it's hard to isolate
> and tell). Since 9.x hangs more often, I suppose it's pf. I use
> ichwd.ko and watchdogd to reboot a machine when it hangs. It works
> pretty well; I'm also working on various WITNESS/INVARIANTS stuff
> and I'm trying to report it to gnats, but obviously it would be
> much nicer if the system would panic and leave some debuggable core
> after a hang (so far I don't have any, so I can only guess). I've
> read about the software watchdog in the kernel and I don't quite
> understand: it's said that the kernel software watchdog is able to
> panic when a deadlock occurs. Can this be achieved with ichwd?
> Another one: as far as I understand, ichwd reboots my machine on a
> hardware level, right? So am I right in saying that the software
> watchdog can, in theory, also be deadlocked, thus being a somewhat
> less reliable solution?

I just want to /metoo: I have a 32-bit/i386 box running zfs, pf and -current
that is hardlocking randomly (it usually has an uptime of a few days to a
couple of weeks). SW_WATCHDOG won't fire when it locks, so it must be
locking pretty fast.

I just noticed that ichwd will load on this box, so I'll try that
instead, but now I'm wondering whether the SW_WATCHDOG kernel will
interfere, or rather whether watchdogd is smart enough to handle both.

This box used to panic occasionally with the ZFS stack panic, so I made
the KSTACK_PAGES=4 change to the kernel, and now it just hardlocks. I'm
not saying they are related.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (FreeBSD)
Comment: Using GnuPG with undefined - http://www.enigmail.net/

iEYEARECAAYFAlEnoRgACgkQrDN5kXnx8ybJeACbBjpHrQxeZhkjavnoeBgjEJ9W
dDUAnipfLgIuUCbM6mk6/bcrl7AphHxC
=84T/
-----END PGP SIGNATURE-----