From owner-freebsd-fs@freebsd.org  Thu Jun 30 15:24:11 2016
Subject: Re: HAST + ZFS + NFS + CARP
References: <20160630144546.GB99997@mordor.lan>
To: Julien Cigar <julien@perdition.city>, freebsd-fs@freebsd.org
Reply-To: jg@internetx.com
From: InterNetX - Juergen Gotteswinter <jg@internetx.com>
Message-ID: <71b8da1e-acb2-9d4e-5d11-20695aa5274a@internetx.com>
Date: Thu, 30 Jun 2016 17:14:08 +0200
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:45.0) Gecko/20100101
 Thunderbird/45.1.1
MIME-Version: 1.0
In-Reply-To: <20160630144546.GB99997@mordor.lan>
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


On 30.06.2016 at 16:45, Julien Cigar wrote:
> Hello,
> 
> I'm still in the process of setting up redundant low-cost storage for
> our (small, ~30 people) team here.
> 
> I read quite a lot of articles/documentation/etc and I plan to use HAST
> with ZFS for the storage, CARP for the failover and the "good old NFS"
> to mount the shares on the clients.
> 
> The hardware is 2xHP Proliant DL20 boxes with 2 dedicated disks for the
> shared storage.
> 
> Assuming the following configuration:
> - MASTER is the active node and BACKUP is the standby node.
> - two disks in each machine: ada0 and ada1.
> - two interfaces in each machine: em0 and em1
> - em0 is the primary interface (with CARP setup)
> - em1 is dedicated to the HAST traffic (crossover cable)
> - FreeBSD is properly installed in each machine.
> - a HAST resource "disk0" for ada0p2.
> - a HAST resource "disk1" for ada1p2.
> - a zpool create zhast mirror /dev/hast/disk0 /dev/hast/disk1 is created
>   on MASTER
> 
> A couple of questions I am still wondering about:
> - If a disk dies on the MASTER I guess that zpool will not see it and
>   will transparently use the one on BACKUP through the HAST resource.

that's right, as long as writes on $anything have been successful, hast
is happy and won't start whining

>   is it a problem? 

imho yes, at least from a management point of view

> could this lead to some corruption?

probably, i have never heard of anyone running that in production for a
long time

>   At this stage common sense would be to replace the disk quickly, but
>   imagine the worst-case scenario where ada1 on MASTER dies: zpool will
>   not see it and will transparently use the one from the BACKUP node
>   (through the "disk1" HAST resource). Later ada0 on MASTER dies: zpool
>   will not see it and will transparently use the one from the BACKUP
>   node (through the "disk0" HAST resource). At this point on MASTER the
>   two disks are broken but the pool is still considered healthy ... What
>   if after that we unplug the em0 network cable on BACKUP? Storage is
>   down.
> - Under heavy I/O the MASTER box suddenly dies (for some reason),
>   thanks to CARP the BACKUP node will switch from standby -> active and
>   execute the failover script which does some "hastctl role primary" for
>   the resources and a zpool import. I wondered if there are any
>   situations where the pool couldn't be imported (= data corruption)?
>   For example what if the pool hasn't been exported on the MASTER before
>   it dies?
> - Is it a problem if the NFS daemons are started at boot on the standby
>   node, or should they only be started in the failover script? What
>   about stale files and active connections on the clients?

sometimes stale mounts recover, sometimes not, sometimes clients even
need reboots

> - A catastrophic power failure occurs and MASTER and BACKUP are suddenly
>   powered down. Later the power returns, is it possible that some
>   problem occurs (split-brain scenario?) regarding the order in which the

sure, you need an exact procedure to recover

>   two machines boot up?

best practice should be to keep everything down after boot

> - Other things I have not thought of?
> 
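for reference, the failover order described in the questions above (HAST
role switch, then pool import, then NFS last) could be sketched roughly
like this. it is only a dry-run sketch, not a tested failover script: the
pool and resource names come from the mail, the RUN variable and the
become_primary function are made up for illustration, and a real script
would also have to wait for the /dev/hast/* devices to show up and handle
the split-brain cases mentioned above.

```shell
#!/bin/sh
# Failover sketch for the node that becomes active.
# Dry run by default: commands are printed, not executed.
# Assumption: setting RUN to an empty string runs them for real.
RUN="${RUN-echo}"

POOL="zhast"               # pool name from the mail
RESOURCES="disk0 disk1"    # HAST resources from the mail

become_primary() {
    # 1. Promote every HAST resource on this node.
    for res in $RESOURCES; do
        $RUN hastctl role primary "$res" || return 1
    done
    # 2. Import the pool. -f is used because a crashed MASTER never
    #    exported it, so the pool still looks in use by another host.
    $RUN zpool import -f "$POOL" || return 1
    # 3. Start NFS only once the pool is actually there.
    $RUN service nfsd onestart
}
```

calling become_primary from the CARP state-change hook prints the four
commands in order; starting nfsd last means clients never see an export
with no pool behind it.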


> Thanks!
> Julien
> 


imho:

leave hast where it is and go for zfs replication. avoiding this fragile
combination will save your butt sooner or later
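
a minimal sketch of what snapshot-based replication could look like,
assuming ssh access from the active box to the standby. the dataset name,
the host name "backup", the snapshot names and the DRY_RUN switch are all
placeholders for illustration, not something from your setup. by default
it only prints what it would run.

```shell
#!/bin/sh
# Sketch of incremental ZFS replication to a standby box.
SRC="tank/share"     # hypothetical local dataset
DST="tank/share"     # hypothetical dataset on the standby
REMOTE="backup"      # hypothetical ssh host name of the standby

# Print the pipeline that ships the changes between two snapshots.
replication_cmd() {
    prev="$1"; cur="$2"
    printf 'zfs send -i %s@%s %s@%s | ssh %s zfs receive -F %s\n' \
        "$SRC" "$prev" "$SRC" "$cur" "$REMOTE" "$DST"
}

# One replication round: take a new snapshot, then send the delta
# since the previous one. With DRY_RUN unset or 1, only print.
replicate() {
    prev="$1"; cur="$2"
    cmd="$(replication_cmd "$prev" "$cur")"
    if [ "${DRY_RUN-1}" = "1" ]; then
        echo "zfs snapshot $SRC@$cur"
        echo "$cmd"
    else
        zfs snapshot "$SRC@$cur" && eval "$cmd"
    fi
}
```

e.g. "replicate monday tuesday" from cron; the standby then always holds
a consistent pool of its own as of the last snapshot, instead of
depending on a live block-level mirror.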