Date: Fri, 2 Feb 1996 12:49:46 -0600 (CST)
From: Joe Greco <jgreco@brasil.moneng.mei.com>
To: curt@emergent.com (Curt Mayer)
Cc: freebsd-hackers@freebsd.org
Subject: Re: Watchdog timers
Message-ID: <199602021849.MAA11845@brasil.moneng.mei.com>
In-Reply-To: <199602011921.LAA23294@bluewhale.emergent.com> from "Curt Mayer" at Feb 1, 96 11:21:04 am

> hey, guys. here's a solution that smells much more like unix.
> have a daemon running on each node that is prone to hangup.
> this process wakes up every once in a while and does a system checkup.
> (stats things, pings places, looks at kernel statistics). when it see
> that things are ok, it sends a datagram to a particular machine,
>
> this node, the monitor, has a table in memory of all recent datagrams
> from each node. when a node hasn't been heard from for a while, it
> tells a BSR x10 controller to cycle power on the hung node. DUH.
>
> our ISP, tlg.net used to do routing and slip with sx-16's running NOS.
> whenever a hang happened, tlg used to do a power cycle with X10's.

I already have alpha-level code that does this (and more), and hits my
alpha pager when a system dies.

However, I never cared for the BSR X10 idea (it just sits badly with me)
and I'd prefer a more controlled and elegant solution. :-)

... Joe

-------------------------------------------------------------------------------
Joe Greco - Systems Administrator                          jgreco@ns.sol.net
Solaria Public Access UNIX - Milwaukee, WI                      414/546-7968
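
For readers who want something concrete, the heartbeat scheme quoted above
(each node sends a datagram when healthy; the monitor keeps a table and acts
on nodes that go quiet) might be sketched roughly as follows. This is not the
alpha-level code mentioned in the reply, which is not shown in this message.
The names HeartbeatTable, send_heartbeat, serve, and the on_stale callback
are all hypothetical, as are the timeout and the choice of putting the node
name in the datagram payload. The power-cycle or pager action is left to the
caller.

```python
import socket
import time


class HeartbeatTable:
    """In-memory table of when each node was last heard from (hypothetical)."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s   # silence longer than this counts as hung
        self.last_seen = {}          # node name -> timestamp of last datagram

    def record(self, node, now=None):
        # Note that a heartbeat arrived from `node`.
        self.last_seen[node] = time.monotonic() if now is None else now

    def stale_nodes(self, now=None):
        # Return, in sorted order, the nodes silent for more than timeout_s.
        now = time.monotonic() if now is None else now
        return sorted(n for n, t in self.last_seen.items()
                      if now - t > self.timeout_s)


def send_heartbeat(monitor_addr, node_name):
    # Node side: call this only after the local checkup has passed.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.sendto(node_name.encode("ascii"), monitor_addr)


def serve(table, bind_addr, on_stale, poll_s=1.0):
    # Monitor side: receive heartbeats and report nodes that have gone quiet.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.bind(bind_addr)
        s.settimeout(poll_s)
        while True:
            try:
                data, _ = s.recvfrom(256)
                table.record(data.decode("ascii", "replace"))
            except socket.timeout:
                pass
            for node in table.stale_nodes():
                on_stale(node)       # e.g. page someone or cycle power
                table.record(node)   # restart the clock before acting again
```

One limitation of this sketch: a node that has never sent a datagram is not
in the table at all, so the monitor would need to be seeded with the expected
node names to catch a machine that never comes up.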
