Date: Wed, 3 Mar 2010 09:13:32 -0800
From: Freddie Cash <fjwcash@gmail.com>
To: fs@freebsd.org
Subject: PoC: ZFS fail-over with HAST + carp(4) + devd
Message-ID: <b269bc571003030913u3987c7dfre51fd7860b89f7e4@mail.gmail.com>

[Not sure if this should go to just fs@ or possibly current@ as well.
I'll start with just fs@.]

Thought I'd pass this along. It's a proof-of-concept setup I've been
using to test HAST fail-over of a ZFS pool, using devd and carp(4).

The original impetus for doing this was that ucarp doesn't work (for me)
within a VirtualBox VM; it just hangs the VM. I also prefer to use
FreeBSD base tools whenever possible, so I thought I'd try to get it to
work with carp(4).

I know this isn't perfect, as it (currently) relies on a "magic
constant" and doesn't cover all the possible failure modes, but I
thought I'd pass it along to get your input, comments, criticisms,
suggestions, etc.

With a bit more work, it could be generalised to, for example, pull the
resources list from /etc/hast.conf, and to work with non-ZFS setups.
Perhaps someday it could be useful as an example in the HAST samples/
directory?

With this setup, I can pull the plug on carp0 on the master node, and
the hast devices and ZFS pool fail over to the slave. And if I pull the
plug on carp0 on the slave, everything fails over to the master again.
It also works nicely with carp preempt enabled on the master node.

Add the following stanzas to /etc/devd.conf:

notify 10 {
    match "system" "IFNET";
    match "subsystem" "carp0";
    match "type" "LINK_UP";
    action "/usr/local/bin/carp-hast-switch master";
};

notify 10 {
    match "system" "IFNET";
    match "subsystem" "carp0";
    match "type" "LINK_DOWN";
    action "/usr/local/bin/carp-hast-switch slave";
};

Contents of /usr/local/bin/carp-hast-switch:

#!/bin/sh

# The names of the HAST resources, as listed in hast.conf
resources="disk01 disk02 disk03 disk04"

# The name of the ZFS pool built on top of HAST resources
pool="hapool"

case "$1" in
master)
    logger -p local0.debug -t hast "Switching to primary provider for ${resources}."
    sleep 30

    # Wait for any "hastd secondary" processes to stop
    for disk in ${resources}; do
        while pgrep -lf "hastd: ${disk} \(secondary\)" > /dev/null 2>&1; do
            sleep 1
        done

        # Switch role for each disk
        hastctl role primary ${disk}
        if [ $? -ne 0 ]; then
            logger -p local0.debug -t hast "Unable to change role to primary for resource ${disk}."
            exit 1
        fi
    done

    # Wait for the /dev/hast/* devices to appear
    for disk in ${resources}; do
        for I in $( jot 60 ); do
            [ -c "/dev/hast/${disk}" ] && break
            sleep 0.5
        done

        if [ ! -c "/dev/hast/${disk}" ]; then
            logger -p local0.debug -t hast "GEOM provider /dev/hast/${disk} did not appear."
            exit 1
        fi
    done

    logger -p local0.debug -t hast "Role for HAST resources ${resources} switched to primary."

    # Import the ZFS pool; has to be done forcibly due to hostid issues
    zpool import -f -d /dev/hast ${pool} 2>&1
    if [ $? -ne 0 ]; then
        logger -p local0.debug -t hast "ZFS pool import for ${pool} failed."
        exit 1
    fi
    logger -p local0.debug -t hast "ZFS pool ${pool} imported."
    ;;

slave)
    logger -p local0.debug -t hast "Switching to secondary provider for ${resources}."

    # Export the ZFS pool; has to be done forcibly in case the hast
    # resources have already switched
    zpool export -f ${pool} 2>&1
    if [ $? -ne 0 ]; then
        logger -p local0.debug -t hast "Unable to export the pool ${pool}."
        exit 1
    fi

    # Switch roles for the HAST resources
    for disk in ${resources}; do
        hastctl role secondary ${disk} 2>&1
        if [ $? -ne 0 ]; then
            logger -p local0.debug -t hast "Unable to switch role to secondary for resource ${disk}."
            exit 1
        fi
        logger -p local0.debug -t hast "Role switched to secondary for resource ${disk}."
    done
    ;;
esac

-- 
Freddie Cash
fjwcash@gmail.com
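
As a rough illustration of the generalisation mentioned above (pulling the resources list from /etc/hast.conf rather than hard-coding it), something along these lines might work. This is only a sketch: it assumes each resource is declared at the start of a line as "resource <name> {", the function name list_hast_resources is made up for the example, and it has not been tried against every layout that hast.conf allows.

```shell
#!/bin/sh

# Hypothetical helper (not part of HAST): print the resource names found
# in a hast.conf-style file, space-separated on a single line.
# Assumption: every resource is declared as "resource <name> {" with the
# keyword as the first word on its line.
list_hast_resources() {
    conf="${1:-/etc/hast.conf}"
    awk '
        $1 == "resource" && NF >= 2 {
            name = $2
            sub(/[{].*$/, "", name)    # tolerate "resource disk01{"
            if (name != "")
                names = names (names == "" ? "" : " ") name
        }
        END { print names }
    ' "${conf}"
}
```

In carp-hast-switch, the hard-coded list could then be replaced with resources="$( list_hast_resources )", keeping the rest of the script unchanged.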