Date: Sat, 30 Apr 2011 00:58:48 +0300
From: Mikolaj Golub <trociny@freebsd.org>
To: Denny Schierz <linuxmail@4lin.net>
Cc: freebsd-stable <freebsd-stable@freebsd.org>
Subject: Re: way for failover zpool (no HAST needed): hastmon
Message-ID: <86mxj8lnxj.fsf@kopusha.home.net>
In-Reply-To: <1303996942.4232.160.camel@pcdenny> (Denny Schierz's message of "Thu, 28 Apr 2011 15:22:22 +0200")
References: <1301397421.11113.250.camel@pcdenny> <86ipv1ll4f.fsf@kopusha.home.net> <1303905911.4232.86.camel@pcdenny> <861v0nrdkc.fsf@in138.ua3> <1303996942.4232.160.camel@pcdenny>

Oops, just noticed this mail :-) Denny sent me another message privately, and
I hope I answered his questions there, but I will answer this message too, in
case someone is interested.

On Thu, 28 Apr 2011 15:22:22 +0200 Denny Schierz wrote:

DS> hi,

DS> ok, here we go: I've installed hastmon and both FreeBSD nodes and one on
DS> Linux Debian as watchdog:

DS> Simple setup:
DS>
DS> # cat /etc.local/hastmon.conf
DS> resource sanip {
DS>         exec /usr/local/_rbg/bin/san-ip
DS>         friends iscsihead-m iscsihead-s nos
DS>         on iscsihead-m {
DS>                 remote tcp4://iscsihead-s
DS>                 priority 0
DS>         }
DS>         on iscsihead-s {
DS>                 remote tcp4://iscsihead-m
DS>                 priority 1
DS>         }
DS>         on linux {
DS>                 remote tcp4://iscsihead-m tcp4://iscsihead-s
DS>         }
DS> }

DS> It works only half.

DS> The simple script adds/remove an alias for the em0 and for status it
DS> does a ping -c 1 to the global ip. After tell every host, what is role
DS> is, I get on the primary "state unknown", in the secondary "state run"
DS> and watchdog for the Linux host.

It is difficult to tell what happened without additional information. It might
be that your '/usr/local/_rbg/bin/san-ip status' was returning an unknown
status. In that case, running it manually

  /usr/local/_rbg/bin/san-ip status; echo $?

might be helpful. And logs too :-).

DS> Than I rebooted the primary, the secondary take over and executed the
DS> script. After the primary was reachable again, he doesn't get the
DS> secondary role, but init/unknown.
DS> The same happens, in the opposite:

DS> from Linux:

DS> hastmonctl status
DS> sanip:
DS>   role: watchdog
DS>   exec: /usr/local/_rbg/bin/san-ip
DS>   remote:
DS>     tcp4://iscsihead-m (primary/run)
DS>     tcp4://iscsihead-s (init/unknown)
DS>   state: run
DS>   attempts: 0 from 5
DS>   complaints: 0 for last 60 sec (threshold 3)
DS>   heartbeat: 10 sec

DS> from iscsihead-s:

DS> hastmonctl status
DS> sanip:
DS>   role: init
DS>   exec: /usr/local/_rbg/bin/san-ip
DS>   remote:
DS>     tcp4://iscsihead-m
DS>   state: unknown
DS>   attempts: 0 from 5
DS>   complaints: 0 for last 60 sec (threshold 3)
DS>   heartbeat: 10 sec

DS> and last from iscsihead-m

DS> hastmonctl status
DS> sanip:
DS>   role: primary
DS>   exec: /usr/local/_rbg/bin/san-ip
DS>   remote:
DS>     tcp4://iscsihead-s (disconnected)
DS>   state: run
DS>   attempts: 0 from 5
DS>   complaints: 0 for last 60 sec (threshold 3)
DS>   heartbeat: 10 sec

DS> If I take a look into the logfile from the iscsihead-m:

DS> [sanip] (primary) Remote node acts as init for the resource and not as
DS> secondary.
DS> [sanip] (primary) Handshake header from tcp4://iscsihead-s has no
DS> 'token' field.

DS> Do I have missed something?

DS> cu denny

This is expected behavior. After startup, hastmon is in the init role. You need
to set the role you want manually or via a startup script. This is because you
might want different configurations depending on your requirements:

1) After startup, the role is set manually by the administrator (useful, e.g.,
if you prefer to investigate a crashed host before returning it to the
cluster).

2) After startup, the node is switched to secondary automatically (by an rc
script). If all cluster nodes are configured to be secondary on startup and
all are started simultaneously, the watchdog will figure out that there is no
primary and will send complaints to all secondary nodes. The nodes will then
try to switch to primary simultaneously, and the node with the highest
priority will win.

3) The node configured with the highest priority is always set to primary on
startup.
All others are set to secondary. With this configuration, if the primary
fails, a secondary switches to primary; then, when the original primary comes
back, it becomes primary again automatically.

-- 
Mikolaj Golub
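
The three startup options described above could be sketched as a small shell
helper that an rc script might call to decide which role to request. This is
only an illustration: the policy names (manual, secondary, preferred) and the
function name startup_role are hypothetical and not part of hastmon.

```sh
#!/bin/sh
# Sketch only: pick the role a node should take at boot, following the
# three startup options from the message. The policy names and the
# function name are hypothetical, not part of hastmon.

# startup_role POLICY THIS_HOST PREFERRED_HOST
# Prints the role to request; fails on an unknown policy.
startup_role() {
    case "$1" in
        manual)
            # Option 1: stay in init until an administrator sets the role.
            echo init
            ;;
        secondary)
            # Option 2: every node starts as secondary.
            echo secondary
            ;;
        preferred)
            # Option 3: the preferred node starts as primary,
            # all other nodes start as secondary.
            if [ "$2" = "$3" ]; then
                echo primary
            else
                echo secondary
            fi
            ;;
        *)
            echo "unknown policy: $1" >&2
            return 1
            ;;
    esac
}
```

An rc script could then pass the result to hastmonctl, for example
hastmonctl role "$(startup_role preferred "$(hostname -s)" iscsihead-m)" sanip,
skipping the call when the result is init. The "role" subcommand syntax is
assumed here by analogy with hastctl; please check the hastmonctl manual page
before relying on it.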