From owner-freebsd-fs@freebsd.org Thu Jun 30 15:28:49 2016
Subject: Re: HAST + ZFS + NFS + CARP
From: Ben RUBSON <ben.rubson@gmail.com>
Date: Thu, 30 Jun 2016 17:28:41 +0200
To: freebsd-fs@freebsd.org
In-Reply-To: <71b8da1e-acb2-9d4e-5d11-20695aa5274a@internetx.com>
References: <20160630144546.GB99997@mordor.lan> <71b8da1e-acb2-9d4e-5d11-20695aa5274a@internetx.com>

> On 30 Jun 2016, at 17:14, InterNetX - Juergen Gotteswinter wrote:
>
> On 30.06.2016 at 16:45, Julien Cigar wrote:
>> Hello,
>>
>> I'm still in the process of setting up redundant low-cost storage for
>> our (small, ~30-person) team here.
>>
>> I've read quite a lot of articles/documentation/etc. and I plan to use HAST
>> with ZFS for the storage, CARP for the failover, and the good old NFS
>> to mount the shares on the clients.
>>
>> The hardware is 2x HP ProLiant DL20 boxes with 2 dedicated disks for the
>> shared storage.
>>
>> Assuming the following configuration:
>> - MASTER is the active node and BACKUP is the standby node.
>> - two disks in each machine: ada0 and ada1.
>> - two interfaces in each machine: em0 and em1
>> - em0 is the primary interface (with CARP set up)
>> - em1 is dedicated to the HAST traffic (crossover cable)
>> - FreeBSD is properly installed on each machine.
>> - a HAST resource "disk0" for ada0p2.
>> - a HAST resource "disk1" for ada1p2.
>> - a "zpool create zhast mirror /dev/hast/disk0 /dev/hast/disk1" is run
>>   on MASTER.
>>
>> A couple of questions I am still wondering about:
>> - If a disk dies on the MASTER, I guess that zpool will not see it and
>>   will transparently use the one on BACKUP through the HAST resource..
>
> That's right; as long as writes on $anything have been successful, HAST is
> happy and won't start whining.
>
>> Is that a problem?
>
> IMHO yes, at least from a management point of view.
>
>> Could this lead to some corruption?
>
> Probably. I've never heard of anyone who used that in production for a
> long time.
>
>> At this stage common sense would be to replace the disk quickly, but
>> imagine the worst-case scenario where ada1 on MASTER dies: zpool will
>> not see it and will transparently use the one from the BACKUP node
>> (through the "disk1" HAST resource); later ada0 on MASTER dies, zpool
>> will not see it and will transparently use the one from the BACKUP node
>> (through the "disk0" HAST resource). At this point the two disks on
>> MASTER are broken but the pool is still considered healthy... What if
>> after that we unplug the em0 network cable on BACKUP? Storage is
>> down..
>> - Under heavy I/O the MASTER box suddenly dies (for some reason);
>>   thanks to CARP the BACKUP node will switch from standby -> active and
>>   execute the failover script, which does some "hastctl role primary" for
>>   the resources and a zpool import. I wondered if there are any
>>   situations where the pool couldn't be imported (= data corruption)?
>>   For example, what if the pool hasn't been exported on the MASTER before
>>   it dies?
>> - Is it a problem if the NFS daemons are started at boot on the standby
>>   node, or should they only be started in the failover script? What
>>   about stale files and active connections on the clients?
>
> Sometimes stale mounts recover, sometimes not; sometimes clients even
> need reboots.
>
>> - A catastrophic power failure occurs and MASTER and BACKUP are suddenly
>>   powered down. Later the power returns; is it possible that some
>>   problem occurs (split-brain scenario?) depending on the order in which
>>   the two machines boot up?
>
> Sure, you need an exact procedure to recover.
>
> Best practice would be to keep everything down after boot.
>
>> - Other things I have not thought of?
>>
>> Thanks!
>> Julien
>
> IMHO: leave HAST where it is, go for ZFS replication. It will save your
> butt sooner or later if you avoid this fragile combination.

I was also replying, and was finishing with this:

Why don't you set your slave up as an iSCSI target and simply do ZFS
mirroring? ZFS would then know as soon as a disk is failing.
And if the master fails, you only have to import (-f certainly, in case
of a master power failure) on the slave.

Ben
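For what it's worth, a minimal sketch of that iSCSI-backed mirror idea on FreeBSD (all hostnames, IP addresses, IQNs, and device nodes below are made up for illustration; adapt to your own setup):

```shell
# --- On the slave: export the spare disk via ctld(8) ---
# Hypothetical /etc/ctl.conf fragment; da0 and the IQN are examples.
#
# target iqn.2016-06.lan.example:disk0 {
#     portal-group default
#     lun 0 { path /dev/da0 }
# }
service ctld start

# --- On the master: attach the slave's target with iscsictl(8) ---
# 192.168.1.2 is the (hypothetical) slave address on the dedicated link.
# The new LUN shows up as a local da(4) device, e.g. /dev/da2.
iscsictl -A -p 192.168.1.2 -t iqn.2016-06.lan.example:disk0

# --- Build a plain ZFS mirror of local disk + remote iSCSI disk ---
# ZFS itself now sees both sides, so a failing disk degrades the pool
# visibly instead of being hidden behind a HAST resource.
zpool create zmirror mirror /dev/ada0p2 /dev/da2

# --- If the master dies: import on the slave ---
# -f is needed when the pool was never exported (e.g. power failure).
zpool import -f zmirror
```

The point of the design is that the mirror is managed by ZFS end to end: `zpool status` reports the remote disk's state directly, which addresses the "pool looks healthy while disks are dead" scenario above.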