Date:      Fri, 2 Feb 1996 12:49:46 -0600 (CST)
From:      Joe Greco <jgreco@brasil.moneng.mei.com>
To:        curt@emergent.com (Curt Mayer)
Cc:        freebsd-hackers@freebsd.org
Subject:   Re: Watchdog timers
Message-ID:  <199602021849.MAA11845@brasil.moneng.mei.com>
In-Reply-To: <199602011921.LAA23294@bluewhale.emergent.com> from "Curt Mayer" at Feb 1, 96 11:21:04 am

> hey, guys. here's a solution that smells much more like unix.
> have a daemon running on each node that is prone to hangup.
> this process wakes up every once in a while and does a system checkup.
> (stats things, pings places, looks at kernel statistics). when it sees
> that things are ok, it sends a datagram to a particular machine.
> 
> this node, the monitor, has a table in memory of all recent datagrams
> from each node. when a node hasn't been heard from for a while, it
> tells a BSR x10 controller to cycle power on the hung node. DUH.
> 
> our ISP, tlg.net used to do routing and slip with sx-16's running NOS.
> whenever a hang happened, tlg used to do a power cycle with X10's.
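
[Illustration: a minimal sketch of the monitor-side table described in the
quoted text above. It is not code from either poster. The names
HeartbeatTable, sweep, and the power_cycle callback are hypothetical, and the
X10 control and datagram receive loop are left out; power_cycle stands in for
whatever actually resets the node.]

```python
import time


class HeartbeatTable:
    """Monitor-side table: when each node was last heard from."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s   # silence longer than this counts as hung
        self.clock = clock           # injectable so the logic can be tested
        self.last_seen = {}          # node name -> time of last datagram

    def record(self, node):
        """Call whenever an "I'm OK" datagram arrives from a node."""
        self.last_seen[node] = self.clock()

    def overdue(self):
        """Return the nodes that have been silent longer than timeout_s."""
        now = self.clock()
        return sorted(n for n, t in self.last_seen.items()
                      if now - t > self.timeout_s)


def sweep(table, power_cycle):
    """Power-cycle every overdue node via the (hypothetical) callback."""
    for node in table.overdue():
        power_cycle(node)
        # Restart the node's timer so it has a full timeout to reboot
        # before it can be cycled again.
        table.record(node)
```

[In this sketch the monitor would call record() for each datagram it receives
and run sweep() periodically. Restarting the timer after a power cycle is a
design assumption here, meant to avoid cycling a node that is still booting.]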

I already have alpha-level code that does this (and more), and hits my
alphanumeric pager when a system dies.  However, I never cared for the BSR X10 idea
(it just sits badly with me) and I'd prefer a more controlled and elegant
solution.  :-)

... Joe

-------------------------------------------------------------------------------
Joe Greco - Systems Administrator			      jgreco@ns.sol.net
Solaria Public Access UNIX - Milwaukee, WI			   414/546-7968


