Date:      Wed, 10 Aug 2016 15:10:40 +0200
From:      Julien Cigar <julien@perdition.city>
To:        Ben RUBSON <ben.rubson@gmail.com>
Cc:        freebsd-fs@freebsd.org
Subject:   Re: HAST + ZFS + NFS + CARP
Message-ID:  <20160810131040.GH70364@mordor.lan>
In-Reply-To: <6035AB85-8E62-4F0A-9FA8-125B31A7A387@gmail.com>
References:  <20160630144546.GB99997@mordor.lan> <71b8da1e-acb2-9d4e-5d11-20695aa5274a@internetx.com> <AD42D8FD-D07B-454E-B79D-028C1EC57381@gmail.com> <20160630153747.GB5695@mordor.lan> <63C07474-BDD5-42AA-BF4A-85A0E04D3CC2@gmail.com> <678321AB-A9F7-4890-A8C7-E20DFDC69137@gmail.com> <20160630185701.GD5695@mordor.lan> <6035AB85-8E62-4F0A-9FA8-125B31A7A387@gmail.com>

On Sat, Jul 02, 2016 at 05:04:22PM +0200, Ben RUBSON wrote:
>
> > On 30 Jun 2016, at 20:57, Julien Cigar <julien@perdition.city> wrote:
> >
> > On Thu, Jun 30, 2016 at 11:32:17AM -0500, Chris Watson wrote:
> >>
> >>
> >> Sent from my iPhone 5
> >>
> >>>
> >>>>
> >>>> Yes that's another option, so a zpool with two mirrors (local +
> >>>> exported iSCSI) ?
> >>>
> >>> Yes, you would then have a real-time replication solution (as HAST),
> >>> compared to ZFS send/receive which is not.
> >>> Depends on what you need :)
> >>>
> >>>>
> >>>>> ZFS would then know as soon as a disk is failing.
> >>
> >> So as an aside, but related, for those watching this from the peanut
> >> gallery and for the benefit of the OP, perhaps those that run with
> >> this setup might give some best practices and tips here in this
> >> thread on making this a good reliable setup. I can see someone
> >> reading this thread and tossing two crappy Ethernet cards in a box
> >> and then complaining it doesn't work well.
> >
> > It would be more than welcome indeed..! I have the feeling that HAST
> > isn't that much used (but maybe I am wrong) and it's difficult to
> > find information on its reliability and concrete long-term use
> > cases...
> >
> > Also the pros and cons of HAST vs iSCSI
>
> I did some further testing today.
>
> # serverA, serverB :
> kern.iscsi.ping_timeout=5
> kern.iscsi.iscsid_timeout=5
> kern.iscsi.login_timeout=5
> kern.iscsi.fail_on_disconnection=1
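
(For what it's worth, I intend to make these settings persistent across
reboots via /etc/sysctl.conf on both filers; the lines below simply
mirror your values, so adjust as needed:)

# /etc/sysctl.conf on filer1.prod.lan and filer2.prod.lan
kern.iscsi.ping_timeout=5
kern.iscsi.iscsid_timeout=5
kern.iscsi.login_timeout=5
kern.iscsi.fail_on_disconnection=1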
>
> # Preparation :
> - serverB : let's make 2 iSCSI targets : rem3, rem4.
> - serverB : let's start ctld.
> - serverA : let's create a mirror pool made of 4 disks : loc1, loc2, rem3, rem4.
> - serverA : pool is healthy.
>
> # Test 1 :
> - serverA : put a lot of data into the pool ;
> - serverB : stop ctld ;
> - serverA : put a lot of data into the pool ;
> - serverB : start ctld ;
> - serverA : make all pool disks online : it works, pool is healthy.
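
(To make the steps explicit for the archives: as far as I understand,
"stop/start ctld" and "make all pool disks online" boil down to
something like the following, with "tank", "rem3" and "rem4" being
placeholder pool/device names; the same commands come back in the
tests below:)

# serverB
service ctld stop        # resp. "service ctld start" to bring it back
# serverA, once the iSCSI targets are reachable again
zpool online tank rem3 rem4
zpool status tank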
>
> # Test 2 :
> - serverA : put a lot of data into the pool ;
> - serverA : export the pool ;
> - serverB : import the pool : it does not work, as ctld locks the
>   disks! Good news, nice protection (both servers won't be able to
>   access the same disks at the same time).
> - serverB : stop ctld ;
> - serverB : import the pool : it works, 2 disks missing ;
> - serverA : let's make 2 iSCSI targets : rem1, rem2 ;
> - serverB : make all pool disks online : it works, pool is healthy.
>
> # Test 3 :
> - serverA : put a lot of data into the pool ;
> - serverB : stop ctld ;
> - serverA : put a lot of data into the pool ;
> - serverB : import the pool : it works, 2 disks missing ;
> - serverA : let's make 2 iSCSI targets : rem1, rem2 ;
> - serverB : make all pool disks online : it works, pool is healthy,
>   but of course data written at step 3 is lost.
>
> # Test 4 :
> - serverA : put a lot of data into the pool ;
> - serverB : stop ctld ;
> - serverA : put a lot of data into the pool ;
> - serverA : export the pool ;
> - serverA : let's make 2 iSCSI targets : rem1, rem2 ;
> - serverB : import the pool : it works, pool is healthy, data written
>   at step 3 is here.
>
> # Test 5 :
> - serverA : rsync a huge remote repo into the pool in the background ;
> - serverB : stop ctld ;
> - serverA : 2 disks missing, but rsync still runs flawlessly ;
> - serverB : start ctld ;
> - serverA : make all pool disks online : it works, pool is healthy.
> - serverB : ifconfig <replication_interface> down ;
> - serverA : 2 disks missing, but rsync still runs flawlessly ;
> - serverB : ifconfig <replication_interface> up ;
> - serverA : make all pool disks online : it works, pool is healthy.
> - serverB : power reset !
> - serverA : 2 disks missing, but rsync still runs flawlessly ;
> - serverB : let's wait for server to be up ;
> - serverA : make all pool disks online : it works, pool is healthy.
>
> Quite happy with these tests actually :)

Hello,

So, after testing ZFS replication with zrep (which works more or less
perfectly) I'm now experimenting with a ZFS + iSCSI solution on two
small HP DL20s with 2 disks each. The machines are partitioned the same
way (https://gist.github.com/silenius/d3fdcd52ab35957f37527af892615ca7)
with a ZFS root
(https://gist.github.com/silenius/f347e90ab187495cdea6e3baf64b881b).

On filer2.prod.lan I have exported the two dedicated partitions
(/dev/da0p4 and /dev/da1p4) as iSCSI targets
(https://gist.github.com/silenius/8efda8334cb16cd779efff027ff5f3bd),
which show up on filer1.prod.lan as /dev/da3 and /dev/da4
(https://gist.github.com/silenius/f6746bc02ae1a5fb7e472e5f5334238b).
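
For the archives, the ctld side boils down to something like the
following (the IQNs and the listen address here are made-up examples,
the real ones are in the gist):

# /etc/ctl.conf on filer2.prod.lan
portal-group pg0 {
    discovery-auth-group no-authentication
    listen 192.168.20.2
}

target iqn.2016-08.lan.prod.filer2:rem3 {
    auth-group no-authentication
    portal-group pg0
    lun 0 {
        path /dev/da0p4
    }
}

target iqn.2016-08.lan.prod.filer2:rem4 {
    auth-group no-authentication
    portal-group pg0
    lun 0 {
        path /dev/da1p4
    }
}

and on the initiator side (filer1.prod.lan), either one-shot:

sysrc iscsid_enable="YES"
service iscsid start
iscsictl -A -p 192.168.20.2 -t iqn.2016-08.lan.prod.filer2:rem3
iscsictl -A -p 192.168.20.2 -t iqn.2016-08.lan.prod.filer2:rem4

or the equivalent entries in /etc/iscsi.conf, so that "iscsictl -Aa"
attaches everything at once.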

Then on filer1.prod.lan I made a zpool mirror over those 4 disks
(https://gist.github.com/silenius/eecd61ad07385e16b41b05e6d2373a9a)
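
The pool creation itself is essentially this (the pool name is an
example, and I pair each local partition with its iSCSI counterpart;
check the gist for the exact layout):

zpool create -o cachefile=none storage \
    mirror da0p4 da3 \
    mirror da1p4 da4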

Interfaces are configured as follows:
https://gist.github.com/silenius/4af55df446f82319eaf072049bc9a287 with
"bge1" being the dedicated interface for iSCSI traffic, and "bge0" the
"main" interface through which $clients access the filer (it carries
the floating IP 192.168.10.15). (I haven't made any network
optimizations yet.)
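
The floating IP is handled with CARP; in rc.conf it looks more or less
like this (addresses, vhid and password are examples, and the advskew
is of course higher on filer2.prod.lan):

# /etc/rc.conf on filer1.prod.lan
ifconfig_bge0="inet 192.168.10.11/24"
ifconfig_bge0_alias0="inet vhid 15 advskew 50 pass s3cr3t alias 192.168.10.15/32"
ifconfig_bge1="inet 192.168.20.1/24"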

Preliminary results are encouraging too, although I haven't tested
under heavy write load yet. I did more or less what Ben did above,
trying to corrupt the pool... without success :) I also checked the
integrity of the DIR I copied manually with:
$> md5 -qs "$(find -s DIR -type f -print0|xargs -0 md5 -q)"
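
(i.e. something like this on both sides of a failover, DIR standing for
the copied directory wherever the pool is mounted:)

# on filer1.prod.lan, before the switch
md5 -qs "$(find -s DIR -type f -print0|xargs -0 md5 -q)"
# on filer2.prod.lan, after importing the pool
md5 -qs "$(find -s DIR -type f -print0|xargs -0 md5 -q)"
# the two digests should be identical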

I also tried a basic failover scenario with
https://gist.github.com/silenius/b81e577f0f0a37bf7773ef15f7d05b5d which
seems to work at the moment.
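
(That gist aside, the general idea is a small script triggered by
devd(8) on CARP state changes; the sketch below is not the exact
content of the gist, and the vhid, pool name and paths are examples:)

# /etc/devd.conf snippet on both filers
notify 30 {
    match "system" "CARP";
    match "subsystem" "15@bge0";
    match "type" "MASTER";
    action "/usr/local/sbin/zfs-failover master";
};
notify 30 {
    match "system" "CARP";
    match "subsystem" "15@bge0";
    match "type" "BACKUP";
    action "/usr/local/sbin/zfs-failover backup";
};

#!/bin/sh
# /usr/local/sbin/zfs-failover (sketch)
case "$1" in
master)
    # become active: attach the iSCSI disks (everything listed in
    # /etc/iscsi.conf) and import the pool
    iscsictl -Aa
    zpool import -o cachefile=none -f storage
    ;;
backup)
    # become passive: export the pool and drop the iSCSI sessions
    zpool export storage
    iscsictl -Ra
    ;;
esac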

To avoid a split-brain scenario I think it is also very important that
the pool isn't automatically imported at boot (hence setting
cachefile=none).
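
Concretely, something like (the pool name is an example):

zpool set cachefile=none storage

or "-o cachefile=none" at zpool create / import time, as above.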

Comments? :)

Julien

>
> Ben
>

-- 
Julien Cigar
Belgian Biodiversity Platform (http://www.biodiversity.be)
PGP fingerprint: EEF9 F697 4B68 D275 7B11  6A25 B2BB 3710 A204 23C0
No trees were killed in the creation of this message.
However, many electrons were terribly inconvenienced.



