From: Freddie Cash
To: fs@freebsd.org
Date: Wed, 3 Mar 2010 09:13:32 -0800
Subject: PoC: ZFS fail-over with HAST + carp(4) + devd

[Not sure if this should go to just fs@ or possibly current@ as well. I'll start with just fs@.]

Thought I'd pass this along.
It's a proof-of-concept setup I've been using to test HAST fail-over of a ZFS pool, using devd and carp(4).

The original impetus for doing this was that ucarp doesn't work (for me) within a VirtualBox VM; it just hangs the VM. And I prefer to use FreeBSD base tools whenever possible, so I thought I'd try to get it to work with carp(4).

I know this isn't perfect, as it (currently) relies on a "magic constant" and doesn't cover all the possible failure modes, but I thought I'd pass it along to get your input, comments, criticisms, suggestions, etc.

With a bit more work, it could be generalised to, for example, pull the resources list from /etc/hast.conf, and to work with non-ZFS setups. Perhaps someday it could be useful as an example in the HAST samples/ directory.

With this setup, I can pull the plug on carp0 on the master node, and the hast devices and ZFS pool fail over to the slave. And if I pull the plug on carp0 on the slave, everything fails over to the master again. And it works nicely with carp preempt enabled on the master node.

Add the following stanzas to /etc/devd.conf:

notify 10 {
	match "system"		"IFNET";
	match "subsystem"	"carp0";
	match "type"		"LINK_UP";
	action "/usr/local/bin/carp-hast-switch master";
};

notify 10 {
	match "system"		"IFNET";
	match "subsystem"	"carp0";
	match "type"		"LINK_DOWN";
	action "/usr/local/bin/carp-hast-switch slave";
};

Contents of /usr/local/bin/carp-hast-switch:

#!/bin/sh

# The names of the HAST resources, as listed in hast.conf
resources="disk01 disk02 disk03 disk04"

# The name of the ZFS pool built on top of HAST resources
pool="hapool"

case "$1" in
master)
	logger -p local0.debug -t hast "Switching to primary provider for ${resources}."
	sleep 30

	# Wait for any "hastd secondary" processes to stop
	for disk in ${resources}; do
		while pgrep -lf "hastd: ${disk} \(secondary\)" > /dev/null 2>&1; do
			sleep 1
		done

		# Switch role for each disk
		hastctl role primary ${disk}
		if [ $? -ne 0 ]; then
			logger -p local0.debug -t hast "Unable to change role to primary for resource ${disk}."
			exit 1
		fi
	done

	# Wait for the /dev/hast/* devices to appear
	for disk in ${resources}; do
		for I in $( jot 60 ); do
			[ -c "/dev/hast/${disk}" ] && break
			sleep 0.5
		done

		if [ ! -c "/dev/hast/${disk}" ]; then
			logger -p local0.debug -t hast "GEOM provider /dev/hast/${disk} did not appear."
			exit 1
		fi
	done

	logger -p local0.debug -t hast "Role for HAST resources ${resources} switched to primary."

	# Import the ZFS pool; has to be done forcibly due to hostid issues
	zpool import -f -d /dev/hast ${pool} 2>&1
	if [ $? -ne 0 ]; then
		logger -p local0.debug -t hast "ZFS pool import for ${pool} failed."
		exit 1
	fi
	logger -p local0.debug -t hast "ZFS pool ${pool} imported."
	;;

slave)
	logger -p local0.debug -t hast "Switching to secondary provider for ${resources}."

	# Export the ZFS pool; has to be done forcibly in case the
	# hast resources have already switched
	zpool export -f ${pool} 2>&1
	if [ $? -ne 0 ]; then
		logger -p local0.debug -t hast "Unable to export the pool ${pool}."
		exit 1
	fi

	# Switch roles for the HAST resources
	for disk in ${resources}; do
		hastctl role secondary ${disk} 2>&1
		if [ $? -ne 0 ]; then
			logger -p local0.debug -t hast "Unable to switch role to secondary for resource ${disk}."
			exit 1
		fi
		logger -p local0.debug -t hast "Role switched to secondary for resource ${disk}."
	done
	;;
esac

-- 
Freddie Cash
fjwcash@gmail.com
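
On the idea of pulling the resources list from /etc/hast.conf instead of hard-coding it: the following is only a rough sketch, not something tested against a live HAST setup. The function name hast_resources is hypothetical, and it assumes every resource section opens with "resource <name> {" on a single line; it does not handle anything more exotic than that.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original script): print the names of
# all resources declared in a hast.conf-style file, separated by spaces.
# Assumption: each resource section starts with "resource <name> {" on one line.
hast_resources() {
	conf="${1:-/etc/hast.conf}"
	awk '
		{ sub(/#.*/, "") }              # strip comments before matching
		$1 == "resource" && NF >= 2 {
			name = $2
			sub(/[{].*/, "", name)      # tolerate "resource name{"
			if (name != "")
				names = names (names == "" ? "" : " ") name
		}
		END { print names }
	' "$conf"
}
```

In carp-hast-switch, the hard-coded list could then become resources="$( hast_resources )", with a check that the result is non-empty before going any further.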