From: Julien Cigar
To: Palle Girgensohn
Cc: freebsd-fs@freebsd.org, Julian Akehurst
Date: Fri, 11 Nov 2016 16:57:35 +0100
Subject: Re: Best practice for high availability ZFS pool
Message-ID: <20161111155735.GM81247@mordor.lan>

On Fri, Nov 11, 2016 at 04:16:52PM +0100, Palle Girgensohn wrote:
> Hi,
>
> Pinging this old thread.
>
> We have revisited this question:
>
> A simple, stable solution for redundant storage with little or no downtime when a machine breaks. Storage is served using NFS only.
>
> It seems true HA is always complicated. I'd rather go for a simple, understandable solution and accept sub-minute downtime than go for a complicated one. For our needs, the pretty solution lined up in the FreeBSD Magazine seems a bit overly complicated.
>
> So here's what we are pondering:
>
> - one SAS dual port disk box
>
> - connect a master host machine to one port and a slave host machine to the other port
>
> - one host is MASTER, it serves all requests
>
> - one host is SLAVE, doing nothing but waiting for the MASTER to fail
>
> - failover would be handled with zpool export / zpool import, or just zpool import -F if the master dies.
>
> - MASTER/SLAVE election and avoiding split brain using, for example, CARP.
>
> This is not a real HA solution since zpool import takes about a minute. Is this true for a large array?
>
> Would this suggestion work?

I'm using something like this here: a zpool over 2 local disks and 2 iSCSI disks, with the following failover script:
https://gist.github.com/silenius/cb10171498071bdbf6040e30a0cab5c2

It works like a charm, except that I'm having this issue:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211990
(apparently this problem does not appear on 11.0-RELEASE)

>
> Are there better ideas out there?
>
> Cheers,
> Palle
>
> > On 18 May 2016, at 09:58, Sean Chittenden wrote:
> >
> > https://www.freebsdfoundation.org/wp-content/uploads/2015/12/vol2_no4_groupon.pdf
> >
> > mps(4) was good to us. What’s your workload? -sc
> >
> > --
> > Sean Chittenden
> > sean@chittenden.org
> >
> >> On May 18, 2016, at 03:53, Palle Girgensohn wrote:
> >>
> >>> On 17 May 2016, at 18:13, Joe Love wrote:
> >>>
> >>>> On May 16, 2016, at 5:08 AM, Palle Girgensohn wrote:
> >>>>
> >>>> Hi,
> >>>>
> >>>> We need to set up a ZFS pool with redundancy. The main goal is high availability - uptime.
> >>>>
> >>>> I can see a few paths to follow.
> >>>>
> >>>> 1. HAST + ZFS
> >>>>
> >>>> 2. Some sort of shared storage, two machines sharing a JBOD box.
> >>>>
> >>>> 3. ZFS replication (zfs snapshot + zfs send | ssh | zfs receive)
> >>>>
> >>>> 4. Using something other than ZFS, even a different OS if required.
> >>>>
> >>>> My main concern with HAST+ZFS is performance. Google offers some insights here, but I find mainly unsolved problems. Please share any success stories or other experiences.
> >>>>
> >>>> Shared storage still has a single point of failure, the JBOD box. Apart from that, is there even any support for the kind of storage PCI cards that support dual head for a storage box? I cannot find any.
> >>>>
> >>>> We are running with ZFS replication today, but it is just too slow for the amount of data.
> >>>>
> >>>> We prefer to keep ZFS, as we already have a rather big (~30 TB) pool and our tools, scripts, and backups all use ZFS, but if there is no solution using ZFS, we're open to alternatives. Nexenta springs to mind, but I believe it is using shared storage for redundancy, so it does have single points of failure?
> >>>>
> >>>> Any other suggestions? Please share your experience. :)
> >>>>
> >>>> Palle
> >>>
> >>> I don’t know if this falls into the realm of what you want, but BSDMag just released an issue with an article entitled “Adding ZFS to the FreeBSD dual-controller storage concept.”
> >>> https://bsdmag.org/download/reusing_openbsd/
> >>>
> >>> My understanding is that the only single point of failure in this model is the backplanes that the drives connect to. Depending on your controller cards, this could be alleviated by simply using multiple drive shelves and using only one drive per shelf in each vdev (then stripe or whatnot over your vdevs).
> >>>
> >>> It might not be what you’re after, as it’s basically two systems with their own controllers, with a shared set of drives. Some expansion from the virtual world to real physical systems will probably need additional variations.
> >>> I think the TrueNAS system (with HA) is set up similarly to this, only without the split between the drives being primarily handled by separate controllers, but someone with more in-depth knowledge would need to confirm/deny this.
> >>>
> >>> -Jo
> >>
> >> Hi,
> >>
> >> Do you know any specific controllers that work with dual head?
> >>
> >> Thanks,
> >> Palle
> >>
> >> _______________________________________________
> >> freebsd-fs@freebsd.org mailing list
> >> https://lists.freebsd.org/mailman/listinfo/freebsd-fs
> >> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"

-- 
Julien Cigar
Belgian Biodiversity Platform (http://www.biodiversity.be)
PGP fingerprint: EEF9 F697 4B68 D275 7B11  6A25 B2BB 3710 A204 23C0
No trees were killed in the creation of this message.
However, many electrons were terribly inconvenienced.
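
For illustration only, the export/import handover discussed in this thread could be sketched roughly as below. This is not the failover script from the gist linked above. The function name plan_failover, the pool name "tank", and the clean_handover flag are hypothetical. The sketch also assumes that a host taking over from a dead master needs a forced import (zpool import -f) because the pool was never exported; the thread itself mentions -F, which is zpool's recovery mode. The function only returns the commands, it does not run them.

```python
from typing import List


def plan_failover(carp_state: str, pool: str, clean_handover: bool) -> List[List[str]]:
    """Return the zpool commands this host would run after a CARP state change.

    Nothing is executed here; the caller decides how (and whether) to run them.
    """
    if carp_state == "MASTER":
        cmd = ["zpool", "import"]
        if not clean_handover:
            # Assumption: the old master died without exporting the pool,
            # so the pool still looks in use and the import must be forced.
            cmd.append("-f")
        return [cmd + [pool]]
    if carp_state == "BACKUP":
        # Release the pool so that the other head can import it.
        return [["zpool", "export", pool]]
    # INIT or any unexpected state: do nothing rather than guess.
    return []


if __name__ == "__main__":
    # Hypothetical example: this host was just promoted after the master died.
    for argv in plan_failover("MASTER", "tank", clean_handover=False):
        print(" ".join(argv))
```

A forced import is only safe if the other host really is down. Importing the same pool on both heads at once will corrupt it, so a real setup needs some form of fencing on top of CARP; that part is not shown here.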