From owner-freebsd-fs@freebsd.org Thu Jun 30 16:35:47 2016
Date: Thu, 30 Jun 2016 18:35:41 +0200
From: Julien Cigar <julien@perdition.city>
To: Ben RUBSON
Cc: freebsd-fs@freebsd.org
Subject: Re: HAST + ZFS + NFS + CARP
Message-ID: <20160630163541.GC5695@mordor.lan>
References: <20160630144546.GB99997@mordor.lan> <71b8da1e-acb2-9d4e-5d11-20695aa5274a@internetx.com> <20160630153747.GB5695@mordor.lan> <63C07474-BDD5-42AA-BF4A-85A0E04D3CC2@gmail.com>
In-Reply-To: <63C07474-BDD5-42AA-BF4A-85A0E04D3CC2@gmail.com>
User-Agent: Mutt/1.6.1 (2016-04-27)
On Thu, Jun 30, 2016 at 05:42:04PM +0200, Ben RUBSON wrote:
> 
> > On 30 Jun 2016, at 17:37, Julien Cigar wrote:
> > 
> >> On Thu, Jun 30, 2016 at 05:28:41PM +0200, Ben RUBSON wrote:
> >> 
> >>> On 30 Jun 2016, at 17:14, InterNetX - Juergen Gotteswinter wrote:
> >>> 
> >>>> On 30.06.2016 at 16:45, Julien Cigar wrote:
> >>>> Hello,
> >>>> 
> >>>> I'm still in the process of setting up redundant low-cost storage for
> >>>> our (small, ~30 people) team here.
> >>>> 
> >>>> I have read quite a lot of articles/documentation/etc. and I plan to use
> >>>> HAST with ZFS for the storage, CARP for the failover and the "good old
> >>>> NFS" to mount the shares on the clients.
> >>>> 
> >>>> The hardware is 2x HP ProLiant DL20 boxes with 2 dedicated disks for
> >>>> the shared storage.
> >>>> 
> >>>> Assuming the following configuration:
> >>>> - MASTER is the active node and BACKUP is the standby node.
> >>>> - two disks in each machine: ada0 and ada1.
> >>>> - two interfaces in each machine: em0 and em1
> >>>> - em0 is the primary interface (with CARP set up)
> >>>> - em1 is dedicated to the HAST traffic (crossover cable)
> >>>> - FreeBSD is properly installed on each machine.
> >>>> - a HAST resource "disk0" for ada0p2.
> >>>> - a HAST resource "disk1" for ada1p2.
> >>>> - a zpool create zhast mirror /dev/hast/disk0 /dev/hast/disk1 is
> >>>>   created on MASTER
> >>>> 
> >>>> A couple of questions I am still wondering about:
> >>>> - If a disk dies on the MASTER I guess that zpool will not see it and
> >>>>   will transparently use the one on BACKUP through the HAST resource..
> >>> 
> >>> that's right, as long as writes on $anything have been successful HAST
> >>> is happy and won't start whining
> >>> 
> >>>> is it a problem?
> >>> 
> >>> imho yes, at least from a management point of view
> >>> 
> >>>> could this lead to some corruption?
> >>> 
> >>> probably, I never heard of anyone who has used that for a long time in
> >>> production
> >>> 
> >>>> At this stage the common sense would be to replace the disk quickly,
> >>>> but imagine the worst case scenario where ada1 on MASTER dies: zpool
> >>>> will not see it and will transparently use the one from the BACKUP
> >>>> node (through the "disk1" HAST resource); later ada0 on MASTER dies,
> >>>> zpool will not see it and will transparently use the one from the
> >>>> BACKUP node (through the "disk0" HAST resource). At this point on
> >>>> MASTER the two disks are broken but the pool is still considered
> >>>> healthy ... What if after that we unplug the em0 network cable on
> >>>> BACKUP? Storage is down..
> >>>> - Under heavy I/O the MASTER box suddenly dies (for some reason);
> >>>>   thanks to CARP the BACKUP node will switch from standby -> active
> >>>>   and execute the failover script which does some "hastctl role
> >>>>   primary" for the resources and a zpool import.
I wondered if there are any
> >>>> situations where the pool couldn't be imported (= data corruption)?
> >>>> For example, what if the pool hasn't been exported on the MASTER
> >>>> before it dies?
> >>>> - Is it a problem if the NFS daemons are started at boot on the
> >>>>   standby node, or should they only be started in the failover script?
> >>>>   What about stale files and active connections on the clients?
> >>> 
> >>> sometimes stale mounts recover, sometimes not, sometimes clients even
> >>> need reboots
> >>> 
> >>>> - A catastrophic power failure occurs and MASTER and BACKUP are
> >>>>   suddenly powered down. Later the power returns; is it possible that
> >>>>   some problem occurs (split-brain scenario?) regarding the order in
> >>>>   which the
> >>> 
> >>> sure, you need an exact procedure to recover
> >>> 
> >>>> two machines boot up?
> >>> 
> >>> best practice would be to keep everything down after boot
> >>> 
> >>>> - Other things I have not thought of?
> >>>> 
> >>>> Thanks!
> >>>> Julien
> >>> 
> >>> imho:
> >>> 
> >>> leave HAST where it is, go for ZFS replication. it will save your butt
> >>> sooner or later if you avoid this fragile combination
> >> 
> >> I was also replying, and finishing with this:
> >> Why don't you set up your slave as an iSCSI target and simply do ZFS
> >> mirroring?
> > 
> > Yes, that's another option, so a zpool with two mirrors (local +
> > exported iSCSI)?
> 
> Yes, you would then have a real-time replication solution (as with HAST),
> compared to ZFS send/receive, which is not.
> Depends on what you need :)

More of a real-time replication solution, in fact ... :)

Do you have any resource which summarizes all the pros and cons of HAST
vs iSCSI? I have found a lot of articles on ZFS + HAST but not that many
on ZFS + iSCSI ..

> 
> > 
> >> ZFS would then know as soon as a disk is failing.
> >> And if the master fails, you only have to import (-f certainly, in
> >> case of a master power failure) on the slave.
> >> 
> >> Ben
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"

-- 
Julien Cigar
Belgian Biodiversity Platform (http://www.biodiversity.be)
PGP fingerprint: EEF9 F697 4B68 D275 7B11 6A25 B2BB 3710 A204 23C0

No trees were killed in the creation of this message.
However, many electrons were terribly inconvenienced.
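
[Editor's note: for readers following the thread, a minimal sketch of the
iSCSI + ZFS mirror alternative Ben suggests, using the stock FreeBSD target
(ctld) and initiator (iscsid/iscsictl). The target name, the 192.168.1.x
crossover addresses, the resulting /dev/da0 device node and the pool name
"zmirror" are all illustrative assumptions, not taken from the thread.]

```shell
## On BACKUP: export the local data partition as an iSCSI LUN via ctld.
## (hypothetical target name and portal address on the em1 crossover link)
cat >> /etc/ctl.conf <<'EOF'
portal-group pg0 {
    discovery-auth-group no-authentication
    listen 192.168.1.2
}
target iqn.2016-06.city.perdition:disk0 {
    auth-group no-authentication
    portal-group pg0
    lun 0 { path /dev/ada0p2 }
}
EOF
service ctld start

## On MASTER: attach the remote LUN (it appears as a new da device,
## e.g. /dev/da0) and mirror it with the local partition, so ZFS itself
## sees both sides and notices a failing disk immediately.
service iscsid start
iscsictl -A -p 192.168.1.2 -t iqn.2016-06.city.perdition:disk0
zpool create zmirror mirror /dev/ada0p2 /dev/da0

## Failover on BACKUP after MASTER dies: the pool was never exported,
## hence the forced import Ben mentions.
zpool import -f zmirror
```

(Admin runbook/config sketch, not meant to run unattended; in practice the
failover step would live in the CARP state-change script.)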