From owner-freebsd-cluster@FreeBSD.ORG  Sun Feb 20 14:06:24 2011
Return-Path: <owner-freebsd-cluster@FreeBSD.ORG>
Delivered-To: freebsd-cluster@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id B5F401065672
	for <freebsd-cluster@freebsd.org>; Sun, 20 Feb 2011 14:06:24 +0000 (UTC)
	(envelope-from josh@tcbug.org)
Received: from out1.smtp.messagingengine.com (out1.smtp.messagingengine.com
	[66.111.4.25]) by mx1.freebsd.org (Postfix) with ESMTP id 83BA08FC23
	for <freebsd-cluster@freebsd.org>; Sun, 20 Feb 2011 14:06:24 +0000 (UTC)
Received: from compute3.internal (compute3.nyi.mail.srv.osa [10.202.2.43])
	by gateway1.messagingengine.com (Postfix) with ESMTP id 571D6208C2;
	Sun, 20 Feb 2011 08:49:39 -0500 (EST)
Received: from frontend1.messagingengine.com ([10.202.2.160])
	by compute3.internal (MEProxy); Sun, 20 Feb 2011 08:49:39 -0500
DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d=messagingengine.com;
	h=references:in-reply-to:mime-version:content-transfer-encoding:content-type:message-id:cc:from:subject:date:to;
	s=smtpout; bh=Mm93t/9rSp1E1CCgFn18lfgzCa8=;
	b=F/b1dRRO/o1kzNh8V71T+YAHNuyN01ax1U5wzm6Yj+XCDFJTUA19Oa0d36lgbb69qX3ADrLIzrvv75Fp+SAtpZQ+hswzijHKN/T4C01frnA7WaVUkV4AKqXxUJ2qjHrUK7TM+HlqknA5Fc5lX7Tqmujw2cOnn30WyCKvAlemypk=
X-Sasl-enc: PiXrHOKrLDhPB7/o4K+EklPSZp0NvvELU00vz2tmeb7x 1298209778
Received: from [10.10.1.146] (74-34-19-98.dr01.rsmt.mn.frontiernet.net
	[74.34.19.98])
	by mail.messagingengine.com (Postfix) with ESMTPSA id CAAEC40BFD5;
	Sun, 20 Feb 2011 08:49:38 -0500 (EST)
References: <1298020090.18890.1684.camel@pcdenny>
	<AANLkTi=LNUWCpQ4XsLxYPomRsb3GC0oUrZuvKTyGxqTQ@mail.gmail.com>
	<BEB41E6D-D44E-4E9B-A176-EE2EBF63B099@4lin.net>
	<AANLkTimi=mJby_g3_xFn-C1XeUdzq31Mt5-oT6ic+vgL@mail.gmail.com>
	<AC77D3BF-7F15-4DA6-83D9-9AE47AB65BFE@4lin.net>
In-Reply-To: <AC77D3BF-7F15-4DA6-83D9-9AE47AB65BFE@4lin.net>
Mime-Version: 1.0 (iPhone Mail 8C148a)
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain;
	charset=us-ascii
Message-Id: <22218C35-7CDE-4E6C-9C4B-F0F10A8B15AC@tcbug.org>
X-Mailer: iPhone Mail (8C148a)
From: Josh Paetzel <josh@tcbug.org>
Date: Sun, 20 Feb 2011 07:49:34 -0600
To: Denny Schierz <linuxmail@4lin.net>
Cc: "freebsd-cluster@freebsd.org" <freebsd-cluster@freebsd.org>
Subject: Re: Build failover ZFS, like HA-Storage from Solaris
X-BeenThere: freebsd-cluster@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Clustering FreeBSD <freebsd-cluster.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-cluster>, 
	<mailto:freebsd-cluster-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-cluster>
List-Post: <mailto:freebsd-cluster@freebsd.org>
List-Help: <mailto:freebsd-cluster-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-cluster>,
	<mailto:freebsd-cluster-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 20 Feb 2011 14:06:24 -0000

On Feb 20, 2011, at 4:59 AM, Denny Schierz <linuxmail@4lin.net> wrote:

> hi,
>
> Am 19.02.2011 um 02:39 schrieb Freddie Cash:
>
>> And devd provides
>> the hooks into your custom scripts so that when CARP switches from
>> node 1 to node 2, you export the pool on node 1, and import the pool
>> on node 2.
>
> but how will I take care, that I don't get a split brain? Or do I think
> the right way, if I say "Only where the carp IP is active, that node has
> the force to import ZFS?" But what happens, if through a power cut both
> nodes are power on the same time? I miss something like a quorum device
> or

At boot, carp devices have a delay that you set manually. If both
machines are powered on at the same time, that mechanism prevents both
heads from asserting carp MASTER. Of course it's imperfect, and a
staggered power-on can defeat the delay. In practice, that's pretty
rare. Now, what can make carp lose its mind is that it uses the
interface config for a checksum. If the interface config differs, both
sides go MASTER. At that point you start getting 50% of your IP traffic
to each host, as the MAC address in the switch flaps, and so forth.
Your scripts probably need to down the CARP device if the ZFS import
fails.
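
A rough sketch of what such a script could look like, in Python. This is
only an illustration under assumptions: the pool name "tank", the
interface name "carp0", and the on_carp_state() function are all made
up here, and how the state string reaches the script (for example from
a devd hook) is left out.

```python
import subprocess

POOL = "tank"       # hypothetical pool name
CARP_IF = "carp0"   # hypothetical CARP interface name


def run(cmd):
    """Run a command and return True if it exited with status 0."""
    return subprocess.run(cmd).returncode == 0


def on_carp_state(state, run=run):
    """Handle a CARP state change; return the commands that were attempted."""
    attempted = []

    def attempt(cmd):
        attempted.append(cmd)
        return run(cmd)

    if state == "MASTER":
        # Take over the storage. If the import fails, give up the
        # shared IP so this head does not answer without the pool.
        if not attempt(["zpool", "import", POOL]):
            attempt(["ifconfig", CARP_IF, "down"])
    elif state == "BACKUP":
        # Release the pool so the other head can import it.
        attempt(["zpool", "export", POOL])
    return attempted
```

The run parameter is only there so the logic can be exercised without
touching a real pool or interface.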

The reality of two node HA is that split brain is an unavoidable issue.
Ancient sailors knew this when they needed precise timekeeping for
navigation.  Take one clock to sea or three. If you have two clocks and
they disagree...
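
The clock point can be shown with a toy majority vote. This is just a
sketch of the reasoning, not code from any cluster stack; the
majority() helper is a made-up name.

```python
from collections import Counter


def majority(readings):
    """Return the value held by a strict majority of readings, else None."""
    if not readings:
        return None
    value, count = Counter(readings).most_common(1)[0]
    return value if count > len(readings) / 2 else None


one = majority(["12:00"])                      # a single clock is trusted as-is
two = majority(["12:00", "12:05"])             # two that disagree: no majority
three = majority(["12:00", "12:05", "12:00"])  # the third clock breaks the tie
```

With two voters that disagree there is never a strict majority, which
is the same position two heads are in when each believes the other is
gone.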

In practice most of the things that cause split brain to happen would
cause issues even if the rig didn't split brain.

Failover while there are active writes is far more of an issue than
split brain...

Thanks,

Josh (been there, done that) Paetzel