Date:      Thu, 30 Jun 2016 15:31:29 +0000
From:      Matt Churchyard <matt.churchyard@userve.net>
To:        "jg@internetx.com" <jg@internetx.com>, Julien Cigar <julien@perdition.city>, "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>
Subject:   RE: HAST + ZFS + NFS + CARP
Message-ID:  <9b946ad7e3484099a0067b74ab75faa4@SERVER.ad.usd-group.com>
In-Reply-To: <71b8da1e-acb2-9d4e-5d11-20695aa5274a@internetx.com>
References:  <20160630144546.GB99997@mordor.lan> <71b8da1e-acb2-9d4e-5d11-20695aa5274a@internetx.com>

-----Original Message-----
From: owner-freebsd-fs@freebsd.org [mailto:owner-freebsd-fs@freebsd.org] On Behalf Of InterNetX - Juergen Gotteswinter
Sent: 30 June 2016 16:14
To: Julien Cigar; freebsd-fs@freebsd.org
Subject: Re: HAST + ZFS + NFS + CARP



On 30.06.2016 at 16:45, Julien Cigar wrote:
> Hello,
>
> I'm still in the process of setting up redundant low-cost storage for
> our (small, ~30 people) team here.
>
> I read quite a lot of articles/documentation/etc. and I plan to use
> HAST with ZFS for the storage, CARP for the failover and the "good old NFS"
> to mount the shares on the clients.
>
> The hardware is 2x HP ProLiant DL20 boxes with 2 dedicated disks for
> the shared storage.
>
> Assuming the following configuration:
> - MASTER is the active node and BACKUP is the standby node.
> - two disks in each machine: ada0 and ada1.
> - two interfaces in each machine: em0 and em1
> - em0 is the primary interface (with CARP setup)
> - em1 is dedicated to the HAST traffic (crossover cable)
> - FreeBSD is properly installed in each machine.
> - a HAST resource "disk0" for ada0p2.
> - a HAST resource "disk1" for ada1p2.
> - the pool is created on MASTER with:
>   zpool create zhast mirror /dev/hast/disk0 /dev/hast/disk1
>
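For concreteness, the matching /etc/hast.conf could look something like
this (a sketch based on the names above; the "master"/"backup" hostnames
and the 172.16.0.x addresses on em1 are hypothetical):

    # /etc/hast.conf, identical on both nodes ("master"/"backup" must
    # match each machine's hostname)
    resource disk0 {
            local /dev/ada0p2
            on master {
                    remote tcp://172.16.0.2
            }
            on backup {
                    remote tcp://172.16.0.1
            }
    }
    resource disk1 {
            local /dev/ada1p2
            on master {
                    remote tcp://172.16.0.2
            }
            on backup {
                    remote tcp://172.16.0.1
            }
    }

After initializing the metadata and making MASTER primary, the pool is
created there:

    hastctl create disk0 disk1           # once, on both nodes
    hastctl role primary disk0 disk1     # on MASTER only
    zpool create zhast mirror /dev/hast/disk0 /dev/hast/disk1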
> A couple of questions I am still wondering about:
> - If a disk dies on the MASTER I guess that zpool will not see it and
>   will transparently use the one on BACKUP through the HAST resource..

that's right, as long as writes on $anything have been successful HAST is happy and won't start whining
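
One way to surface such a masked failure is to ask the layers below ZFS
directly; a quick check on MASTER might be (a sketch; the exact output
format varies by FreeBSD version):

    # ZFS still sees /dev/hast/diskN as healthy, so query HAST and the
    # pool separately
    hastctl status disk0 disk1    # HAST's view of each resource
    zpool status -x zhast         # ZFS's (possibly too optimistic) view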

>   is it a problem?

imho yes, at least from a management point of view

> could this lead to some corruption?

probably, i never heard about anyone who has used that in production for a long time

>   At this stage the
>   common sense would be to replace the disk quickly, but imagine the
>   worst case scenario where ada1 on MASTER dies: zpool will not see it
>   and will transparently use the one from the BACKUP node (through the
>   "disk1" HAST resource). Later ada0 on MASTER dies; zpool will not
>   see it and will transparently use the one from the BACKUP node
>   (through the "disk0" HAST resource). At this point on MASTER the two
>   disks are broken but the pool is still considered healthy ... What if
>   after that we unplug the em0 network cable on BACKUP? Storage is
>   down..
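
Since zpool status stays green through that whole cascade, a periodic
check of the raw disks on each node seems prudent. A minimal cron
sketch (assumes smartmontools from ports; the admin address is made
up):

    #!/bin/sh
    # check-disks.sh -- run from cron on BOTH nodes; zpool status alone
    # stays healthy while HAST silently redirects I/O to the peer
    for d in ada0 ada1; do
        if ! smartctl -H /dev/$d | grep -q PASSED; then
            echo "disk $d failing on $(hostname)" | \
                mail -s "HAST node disk alert" admin@example.com
        fi
    done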
> - Under heavy I/O the MASTER box suddenly dies (for some reason);
>   thanks to CARP the BACKUP node will switch from standby -> active and
>   execute the failover script which does some "hastctl role primary" for
>   the resources and a zpool import. I wondered if there are any
>   situations where the pool couldn't be imported (= data corruption)?
>   For example what if the pool hasn't been exported on the MASTER before
>   it dies?
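
For what it's worth, the failover script in question would be along
these lines (a sketch, assuming the resource/pool names above; the
CARP/devd hook that triggers it is not shown):

    #!/bin/sh
    # failover.sh -- run on the node that just became CARP master
    hastctl role primary disk0 disk1
    # -f overrides the "pool was in use by another system" check, which
    # is exactly the case when the old MASTER died without exporting;
    # ZFS transaction groups keep the on-disk state consistent, though
    # the last seconds of un-synced writes are lost
    zpool import -f zhast
    service nfsd onerestart

Note that a plain zpool import would refuse the pool in precisely the
crash case described above; -f is the usual answer, at the cost of
having to be very sure the other node is really dead (fencing).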
> - Is it a problem if the NFS daemons are started at boot on the standby
>   node, or should they only be started in the failover script? What
>   about stale files and active connections on the clients?

>sometimes stale mounts recover, sometimes not, sometimes clients even need reboots

Happy to be corrected, but last time I looked at this, the NFS filesystem ID was likely to be different on both systems (and cannot be set like it can on Linux), so the mounts would be useless on the clients after failover. You'd need to remount the NFS filesystem on the clients.
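
For contrast, on Linux the ID can be pinned in /etc/exports (a
hypothetical path and network; FreeBSD's exports(5) has no equivalent
option as far as I know):

    # Linux /etc/exports: fsid= pins the NFS filesystem ID
    /export  192.168.0.0/24(rw,fsid=7)

Because FreeBSD derives the ID from the underlying filesystem, the two
nodes present different IDs and clients see a new filesystem after
failover.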

> - A catastrophic power failure occurs and MASTER and BACKUP are suddenly
>   powered down. Later the power returns; is it possible that some
>   problem occurs (split-brain scenario?) regarding the order in which
>   the two machines boot up?

>sure, you need an exact procedure to recover


>best practice should be to keep everything down after boot
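
In rc.conf terms, "keep everything down" could look like this (a
sketch; whether this fits depends on what else the boxes do):

    # /etc/rc.conf fragment, both nodes: after a simultaneous outage an
    # operator (or a vetted script) decides who becomes primary
    hastd_enable="YES"         # safe: resources come up in the "init"
                               # role and touch no data by themselves
    zfs_enable="NO"            # do not run the ZFS rc scripts at boot
    nfs_server_enable="NO"     # nfsd is started by the failover script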

> - Other things I haven't thought of?
>



> Thanks!
> Julien
>


>imho:

>leave hast where it is, go for zfs replication. will save your butt, sooner or later, if you avoid this fragile combination
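
For comparison, the replication approach boils down to periodic
incremental send/receive; a minimal sketch (the pool name "tank" and
host "backup" are hypothetical; tools like sysutils/zrep or
sysutils/zxfer wrap this more robustly):

    #!/bin/sh
    # replicate.sh -- run from cron on the active node; assumes ssh keys
    # are in place and an initial full send has already seeded "backup"
    NOW="tank@repl-$(date +%Y%m%d%H%M%S)"
    LAST=$(zfs list -H -t snapshot -o name -S creation -d 1 tank | head -1)
    zfs snapshot -r "$NOW"
    zfs send -R -i "$LAST" "$NOW" | ssh backup zfs receive -F tank

Failover is then a plain pool/NFS start on the replica, at the cost of
losing writes since the last snapshot, which for a 30-person office is
often a reasonable trade.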

Personally I agree. This sort of functionality is incredibly difficult to get right and I wouldn't want to run anything critical relying on a few HAST scripts I'd put together manually.

Matt




