Message-ID: <4C2D9C62.4050105@soupacific.com>
Date: Fri, 02 Jul 2010 16:59:30 +0900
From: "hiroshi@soupacific.com"
To: Mikolaj Golub
Cc: freebsd-fs@freebsd.org, Pawel Jakub Dawidek
In-Reply-To: <861vbm1hpr.fsf@zhuzha.ua1>
Subject: Re: HAST and CARP

On 7/2/2010 4:11 PM, Mikolaj Golub wrote:
>
> So you have:
>
> secondary localcnt: 1
> secondary remotecnt: 0
> primary localcnt: 1
> primary remotecnt: 0
>
> This is a split-brain condition as described on wiki: primary's localcnt is
> greater than secondary's remotecnt (primary [fw01A] was modified while fw01B
> wasn't watching) and secondary's localcnt is greater than primary's remotecnt
> (fw01B was modified while fw01A wasn't watching).

So "hastctl role secondary xxx" does not change the cnt values?

The scenario is this: ServerA failed, so ServerB became MASTER. Then only
ServerA is started (say, after something was fixed), both servers are
connected, and then ServerB starts. BUT while ServerA was down, ServerB was
MASTER. ServerA was started before ServerB, so ServerA should be MASTER!
In this situation CARP makes ServerA MASTER and the latecomer ServerB
BACKUP.

Could "hastctl role secondary xxx" set the values

> secondary localcnt: 1
> secondary remotecnt: 0
> primary localcnt: 1
> primary remotecnt: 0

above to something that is NOT split-brain? That sounds like a more
favorable way to me. The hastctl role is managed by ifstated, which watches
the CARP status.

Is this a strange idea?

Thanks
Hiroshi

>
> h> Hope this logs can help you ! If you need to make me debug bit more,
> h> give me some idea to check!
>
> Actually the logs you have provided are not very interesting as they shows the
> state after bad things happened. It is more interesting to look at the logs
> (both hosts) before split brain.
>
> I would recommend:
>
> 1) Configure hast manually and ensure that both primary and secondary function
> properly and data are synchronized between the nodes. Also make sure the clock
> on both hosts is in sync (needed when comparing logs).
>
> 2) Reboot both servers so your carp/hast setup auto starts and see what
> happens.
>
> 3) If it sets primary and secondary automatically and status is ok on both
> nodes initiate switching to failover.
>
> 4) If after switching (or earlier) split brain is detected, provide logs from
> both nodes since hosts reboot.
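
For reference, the split-brain rule quoted from Mikolaj in this message can be written down as a small check. This is only a sketch of the rule as it is worded in this thread, not hastd's actual code; the Counters type and the is_split_brain function are made-up names used for illustration.

```python
from dataclasses import dataclass


@dataclass
class Counters:
    """Modification counters of one node (illustrative type, not a HAST API)."""
    localcnt: int
    remotecnt: int


def is_split_brain(primary: Counters, secondary: Counters) -> bool:
    """Apply the rule as quoted in this thread.

    Split-brain means each side was modified while the other was not
    watching: primary's localcnt exceeds secondary's remotecnt AND
    secondary's localcnt exceeds primary's remotecnt.
    """
    primary_changed_unseen = primary.localcnt > secondary.remotecnt
    secondary_changed_unseen = secondary.localcnt > primary.remotecnt
    return primary_changed_unseen and secondary_changed_unseen


if __name__ == "__main__":
    # Counter values reported in this thread.
    primary = Counters(localcnt=1, remotecnt=0)
    secondary = Counters(localcnt=1, remotecnt=0)
    print(is_split_brain(primary, secondary))
```

With the values from this thread both comparisons hold (1 > 0 on each side), so the check reports split-brain. Changing only the role of a node leaves both comparisons as they are unless the counters themselves change.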