From: Ferdinand Goldmann
Date: Mon, 8 Aug 2011 16:54:10 +0200
To: freebsd-net@freebsd.org
Subject: Problem using CARP + HAST ...

Hi!

I am trying to create a common resource pool for a certain application
using CARP/HAST as described in [1]. However, while testing my setup I ran
into a problem which I don't know how to fix or work around:

If I shut down only the carp interface on the master (ifconfig carp0 down),
the slave will take note of this, make its carp interface the master and
mount the HAST storage using a script called by devd. Everything fine so far.

BUT: If I completely shut down the master's network connection (using "shut"
on the switchport), the carp interface on the slave will still switch to
master, but the script for making the HAST storage primary will just hang
forever:

root 46841  0.0  0.6  3628  1524  ??  S     4:21PM   0:00.08 /bin/sh /opt/bin/carp-hast-switch master
root 47043  0.0  2.6 42228  6580  ??  S     4:22PM   0:00.03 hastd: hast0 (secondary) (hastd)

Seemingly, this is because the hastd daemons on master and slave are unable
to communicate. So the script waits forever for the secondary device to go
away:

    # Wait for any "hastd secondary" processes to stop
    for disk in ${resources}; do
        while $( pgrep -lf "hastd: ${disk} \(secondary\)" > /dev/null 2>&1 ); do
            sleep 1
        done

I'm a bit puzzled. Is there a way for hastd to make itself the master in
case of a timeout or such? Because in normal operation, whenever the carp
interface fails, the underlying infrastructure will most likely be down as
well.

Even if I connected the two machines over an extra port for hastd, this
problem would still occur if I pull the plug on the master. I suppose the
slave making itself the master will lead to a split-brain condition ... but
is there any way for hastd to handle this automagically? Because otherwise,
it won't be much good for a scenario like the above. :-/

Can anybody shed light on this, please?

TIA & best regards,
Ferdinand

[1] http://www.freebsd.org/doc/en/books/handbook/disks-hast.html
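
P.S. To make the hang concrete: the wait loop quoted above has no upper
bound, so it spins for as long as the "hastd: ... (secondary)" process
exists. Below is a minimal sketch of a bounded variant. The helper name
wait_while and the 30-second limit are assumptions made up for this
illustration; they are not part of the handbook script. It does not answer
the split-brain question, it only keeps the script from blocking forever.

```sh
#!/bin/sh
# Sketch only: a bounded version of the wait loop quoted above.
# wait_while and the 30-second limit are hypothetical, not from the handbook.

# wait_while LIMIT CMD [ARGS...]
# Re-runs CMD once per second for as long as it succeeds.
# Returns 0 as soon as CMD fails (e.g. pgrep no longer finds the process),
# or 1 if CMD still succeeds after LIMIT seconds.
wait_while() {
    limit=$1
    shift
    waited=0
    while "$@" > /dev/null 2>&1; do
        if [ "${waited}" -ge "${limit}" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Intended use inside the devd-triggered script (left commented out here):
#   for disk in ${resources}; do
#       if ! wait_while 30 pgrep -lf "hastd: ${disk} \(secondary\)"; then
#           logger -p local0.error -t hast \
#               "secondary for ${disk} still running after 30s"
#       fi
#   done
```

What the script should do once the limit is reached (give up, or force the
role change and risk split-brain) is exactly the open question of this mail.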