From: Josh Paetzel
Date: Sun, 20 Feb 2011 07:49:34 -0600
To: Denny Schierz
Cc: "freebsd-cluster@freebsd.org"
Subject: Re: Build failover ZFS, like HA-Storage from Solaris
Message-Id: <22218C35-7CDE-4E6C-9C4B-F0F10A8B15AC@tcbug.org>
References: <1298020090.18890.1684.camel@pcdenny>
List-Id: Clustering FreeBSD

On Feb 20, 2011, at 4:59 AM, Denny Schierz wrote:

> hi,
>
> On 19.02.2011 at 02:39, Freddie Cash wrote:
>
>> And devd provides
>> the hooks into your custom scripts so that when CARP switches from
>> node 1 to node 2, you export the pool on node 1, and import the pool
>> on node 2.
>
> but how will I take care, that I don't get a split brain? Or do I think the right way, if I say "Only where the carp IP is active, that node has the force to import ZFS?" But what happens, if through a power cut both nodes are power on the same time? I miss something like a quorum device or

At boot, CARP devices have a delay that you set manually. If both machines are powered on at the same time, that mechanism prevents both heads from asserting CARP MASTER. Of course it's imperfect, and a staggered power-on can defeat the delay. In practice, it's pretty rare.

Now, what can make CARP lose its mind is that it uses the interface config for a checksum. If the interface config differs, both sides go MASTER. At that point you start getting 50% of your IP traffic to each host, as the MAC address in the switch flaps, and so forth. Your scripts probably need to down the CARP device if the ZFS import fails.

The reality of two-node HA is that split brain is an unavoidable issue. Ancient sailors knew this when they needed precise timekeeping for navigation. Take one clock to sea or three. If you have two clocks and they disagree...

In practice, most of the things that cause split brain to happen would cause issues even if the rig didn't split brain.

Failover while there are active writes is far more of an issue than split brain...

Thanks,

Josh (been there, done that) Paetzel
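
P.S. The "down the CARP device if the ZFS import fails" rule, together with the export/import handoff from the quoted text, can be sketched roughly as below. This is only an illustration of the idea, not a tested failover script: the pool name "tank", the interface name "carp0", the function name on_carp_event and the way the CARP state reaches the script (for example as an argument passed from a devd hook) are all assumptions, not anything from the thread.

```python
import subprocess
from typing import Callable, Sequence

# A runner takes a command (list of strings) and returns its exit status.
Runner = Callable[[Sequence[str]], int]


def run(cmd: Sequence[str]) -> int:
    """Run a command and return its exit status."""
    return subprocess.call(list(cmd))


def on_carp_event(state: str,
                  pool: str = "tank",        # hypothetical pool name
                  carp_if: str = "carp0",    # hypothetical CARP interface
                  runner: Runner = run) -> str:
    """React to a CARP state change reported by a hook script.

    Returns a short label describing what was done.
    """
    if state == "MASTER":
        # We just became MASTER: try to take over the pool.
        if runner(["zpool", "import", pool]) == 0:
            return "imported"
        # Import failed: take the CARP interface down so this node
        # stops claiming the shared address without the storage.
        runner(["ifconfig", carp_if, "down"])
        return "carp-down"
    if state == "BACKUP":
        # We lost MASTER: release the pool so the peer can import it.
        runner(["zpool", "export", pool])
        return "exported"
    # Any other state (e.g. INIT) is left alone in this sketch.
    return "ignored"
```

The import is deliberately run without -f here, on the assumption that you would rather have the import fail (and CARP go down) than force in a pool the other head may still hold. It does nothing about the harder case of failover during active writes.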