From: Mikolaj Golub
To: hiroshi@soupacific.com
Cc: freebsd-fs@freebsd.org, Pawel Jakub Dawidek
Organization: TOA Ukraine
Subject: Re: HAST and CARP
Date: Fri, 02 Jul 2010 12:25:16 +0300
Message-ID: <86wrtez14z.fsf@zhuzha.ua1>
In-Reply-To: <4C2D9C62.4050105@soupacific.com>

On Fri, 02 Jul 2010 16:59:30 +0900 hiroshi@soupacific.com wrote:

h> On 7/2/2010 4:11 PM, Mikolaj Golub wrote:
>>
>> So you have:
>>
>> secondary localcnt: 1
>> secondary remotecnt: 0
>> primary localcnt: 1
>> primary remotecnt: 0
>>
>> This is a split-brain condition as described on wiki: primary's localcnt is
>> greater than secondary's remotecnt (primary [fw01A] was modified while fw01B
>> wasn't watching) and secondary's localcnt is greater than primary's remotecnt
>> (fw01B was modified while fw01A wasn't watching).

h> So hastctl role secondary xxx does not change cnt values ?

Every node actually has only two values: localcnt and remotecnt. These values
are kept in the disk metadata. When a node acts as secondary, it reads these
values from the disk and stores them in secondary_localcnt and
secondary_remotecnt. The other two values (primary_localcnt,
primary_remotecnt) are received from the primary host.
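
The split-brain condition quoted above can be written as a predicate over the four counters. This is only a minimal sketch of the rule as stated in this thread, not code taken from hastd; the names `Counters` and `is_split_brain` are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counters:
    """The localcnt/remotecnt pair a node keeps in its metadata (illustrative type)."""
    localcnt: int
    remotecnt: int


def is_split_brain(primary: Counters, secondary: Counters) -> bool:
    """Return True when each side was modified while the other was not watching.

    Mirrors the rule quoted in the message: primary's localcnt is greater than
    secondary's remotecnt AND secondary's localcnt is greater than primary's
    remotecnt.
    """
    return (primary.localcnt > secondary.remotecnt
            and secondary.localcnt > primary.remotecnt)


if __name__ == "__main__":
    # Values reported in this thread: both nodes have localcnt=1, remotecnt=0.
    print(is_split_brain(Counters(1, 0), Counters(1, 0)))  # True
```

With the values reported above, both comparisons are 1 > 0, so the check reports split-brain.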

If I haven't overlooked something in the code, the secondary does not modify
localcnt and remotecnt in the metadata. These counters can be modified only
when the node acts as primary: on initialization the primary sets them to
localcnt=1 and remotecnt=0; then, if data is synchronized between the nodes,
it sets the counters to the same values as on the secondary. If the primary
can't send data to the secondary, it increases localcnt.

So only the primary can modify the counters, and if split-brain is detected it
means that the secondary was primary for some time in the past and the other
node was not aware of this (or the data was not synchronized).

h> Scenario is this
h> ServerA failed, then ServerB became MASTER.
h> Only ServerA is started (say after fixed something), both servers are
h> connected, then ServerB starts, BUT during failure of ServerA, ServerB
h> was MASTER.
h> ServerA was started before ServerB is started, thus ServerA should be
h> MASTER!
h> On this situation, CARP will set ServerA is MASTER and late comer
h> ServerB is set as BACKUP by CARP.

h> hastctl role secondary xxx set

>> secondary localcnt: 1
>> secondary remotecnt: 0
>> primary localcnt: 1
>> primary remotecnt: 0

h> above values to NOT split-brain. It sounds more favorable way ????

I am not sure I understand what you mean here :-)

First of all, you can't modify the counters manually. They are maintained by
hast internally.

Once ServerB has become master and modified some data, ServerA should first be
set to secondary so that it synchronizes all the changes, and only after that
be switched to primary. Otherwise you will have split-brain and will have to
synchronize the full storage by recreating the provider on the secondary.

-- 
Mikolaj Golub
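
The counter rules described in this message can be played through in a toy model, to see one way both nodes could end up with localcnt=1 and remotecnt=0. This is a hypothetical sketch, not hastd code: the class `NodeModel`, its methods, and the starting values of 0 are assumptions. The synchronization step is deliberately left out, because the message does not spell out the exact values written after a sync.

```python
class NodeModel:
    """Toy model of one node's on-disk counters (illustrative only)."""

    def __init__(self, name):
        self.name = name
        self.role = "init"
        # Assumption: counters start at 0 before the node is ever primary.
        self.localcnt = 0
        self.remotecnt = 0

    def set_role(self, role):
        if role not in ("primary", "secondary"):
            raise ValueError(f"unknown role: {role}")
        self.role = role
        # Only a primary touches the counters; on its first initialization
        # it sets localcnt=1 and remotecnt=0.
        if role == "primary" and self.localcnt == 0:
            self.localcnt, self.remotecnt = 1, 0

    def send_failed(self):
        """The primary could not send data to the secondary."""
        if self.role == "primary":
            self.localcnt += 1
        # A secondary never modifies the counters, so nothing happens here.

    def counters(self):
        return (self.localcnt, self.remotecnt)


server_a = NodeModel("ServerA")
server_b = NodeModel("ServerB")

server_a.set_role("primary")
server_b.set_role("secondary")

# ServerA fails and ServerB is promoted while ServerA is not watching.
server_b.set_role("primary")

# ServerA comes back first and is made primary again; ServerB is demoted.
server_a.set_role("primary")
server_b.set_role("secondary")

print(server_a.counters(), server_b.counters())  # (1, 0) (1, 0)
```

In this model, demoting ServerB to secondary leaves its counters untouched, so the record of its time as primary stays on disk and the two nodes still disagree.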