From: Freddie Cash
To: fs@freebsd.org
Date: Wed, 3 Mar 2010 16:31:40 -0800
Subject: HAST: split-brain -- how to force one side to become primary?

According to the wiki, when a split-brain situation arises, I should be able to stop hastd on one side, write changes to the /dev/hast/* providers on the primary to increment the localcnt value, and then bring up the secondary hastd. The localcnt/remotecnt values will then differ between the nodes, and everything will start to re-sync.

However, this doesn't seem to work. Or maybe I'm not doing things right to make it work. Or maybe I've completely misunderstood how it all works. (Nah, that can never happen. roll-eyes) :)

The /dev/hast/* providers are used to form a raidz1 vdev as part of the pool "hapool". A single 1 GB zvol is created in that pool and exported via iSCSI (net/istgt). I can mount the iSCSI disk on a Linux client, partition it, format it with XFS, and write data to it.

Using only the hast2 node as primary, I've written out 10 MB of new data and verified that the data is there via "zfs list" on hast2 and through multiple mount/unmount cycles on the client. Yet localcnt never increments beyond 1 (remotecnt stays at 0).

Is there a way to forcibly increment localcnt on one node, so that the other node correctly comes up as secondary when its hastd is started, and a sync begins? Or do I have to manually redo the HAST setup on one side? Or zero out the base/physical disk underneath HAST?

/etc/hast.conf (only the listen line differs between nodes):

    # Global section
    control /var/run/hastctl
    listen 172.20.0.1
    replication memsync

    # Resource section
    resource disk01 {
        on hast1 {
            local /dev/label/disk01
            remote 172.20.0.2
        }
        on hast2 {
            local /dev/label/disk01
            remote 172.20.0.1
        }
    }
    resource disk02 {
        on hast1 {
            local /dev/label/disk02
            remote 172.20.0.2
        }
        on hast2 {
            local /dev/label/disk02
            remote 172.20.0.1
        }
    }
    resource disk03 {
        on hast1 {
            local /dev/label/disk03
            remote 172.20.0.2
        }
        on hast2 {
            local /dev/label/disk03
            remote 172.20.0.1
        }
    }
    resource disk04 {
        on hast1 {
            local /dev/label/disk04
            remote 172.20.0.2
        }
        on hast2 {
            local /dev/label/disk04
            remote 172.20.0.1
        }
    }

hastctl dump on hast1:

    resource: disk01
        datasize: 2147478528
        extentsize: 2097152
        keepdirty: 64
        localoff: 4608
        resuid: 1224151284752404553
        localcnt: 1
        remotecnt: 0
        prevrole: primary
    resource: disk02
        datasize: 2147478528
        extentsize: 2097152
        keepdirty: 64
        localoff: 4608
        resuid: 10884849062207686761
        localcnt: 1
        remotecnt: 0
        prevrole: primary
    resource: disk03
        datasize: 2147478528
        extentsize: 2097152
        keepdirty: 64
        localoff: 4608
        resuid: 14443609578994823508
        localcnt: 1
        remotecnt: 0
        prevrole: primary
    resource: disk04
        datasize: 2147478528
        extentsize: 2097152
        keepdirty: 64
        localoff: 4608
        resuid: 1365498106518463540
        localcnt: 1
        remotecnt: 0
        prevrole: primary

hastctl dump on hast2:

    resource: disk01
        datasize: 2147478528
        extentsize: 2097152
        keepdirty: 64
        localoff: 4608
        resuid: 1224151284752404553
        localcnt: 1
        remotecnt: 0
        prevrole: primary
    resource: disk02
        datasize: 2147478528
        extentsize: 2097152
        keepdirty: 64
        localoff: 4608
        resuid: 10884849062207686761
        localcnt: 1
        remotecnt: 0
        prevrole: primary
    resource: disk03
        datasize: 2147478528
        extentsize: 2097152
        keepdirty: 64
        localoff: 4608
        resuid: 14443609578994823508
        localcnt: 1
        remotecnt: 0
        prevrole: primary
    resource: disk04
        datasize: 2147478528
        extentsize: 2097152
        keepdirty: 64
        localoff: 4608
        resuid: 1365498106518463540
        localcnt: 1
        remotecnt: 0
        prevrole: primary

--
Freddie Cash fjwcash@gmail.com
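
P.S. To compare the two dumps without eyeballing them, a small script along these lines could help. It is only a sketch: parse_hastctl_dump and compare_nodes are hypothetical helper names, the parser assumes the plain "key: value" layout of the dump output pasted above, and the sample text is trimmed to disk01. It only reports the counters from each node side by side; it does not attempt to reproduce hastd's own split-brain check.

```python
def parse_hastctl_dump(text):
    """Parse 'key: value' lines into {resource_name: {field: value}}."""
    resources = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or ":" not in line:
            continue
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "resource":
            # A 'resource:' line starts a new block of fields.
            current = resources.setdefault(value, {})
        elif current is not None:
            current[key] = value
    return resources


def compare_nodes(dump_a, dump_b, fields=("localcnt", "remotecnt", "prevrole")):
    """Return {resource: {field: (value_on_a, value_on_b)}} for shared resources."""
    a = parse_hastctl_dump(dump_a)
    b = parse_hastctl_dump(dump_b)
    report = {}
    for name in sorted(a.keys() & b.keys()):
        report[name] = {f: (a[name].get(f), b[name].get(f)) for f in fields}
    return report


# Sample input trimmed from the dumps above (disk01 only).
HAST1_DUMP = """\
resource: disk01
    resuid: 1224151284752404553
    localcnt: 1
    remotecnt: 0
    prevrole: primary
"""
HAST2_DUMP = HAST1_DUMP  # both nodes reported identical values

if __name__ == "__main__":
    for name, values in compare_nodes(HAST1_DUMP, HAST2_DUMP).items():
        same = all(x == y for x, y in values.values())
        print(name, "identical" if same else "differs", values)
```

With the values above, both nodes report the same localcnt, remotecnt and prevrole for every resource, which is exactly the situation I'm stuck in.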