From: Chris Watson
Subject: Re: HAST + ZFS + NFS + CARP
Date: Fri, 1 Jul 2016 11:54:58 -0500
To: Gary Palmer
Cc: Julien Cigar, freebsd-fs@freebsd.org

Hi Gary!

So I'll add another voice to the KISS camp. I'd rather have two boxes, each with two NICs attached to each other, doing ZFS replication from A to B. Adding more redundant hardware just adds more points of failure. NICs have no moving parts, so as long as they are thermally controlled they won't fail. This is simple and as safe as you can get. As for how to handle an actual failover, I'd really like to try out the ctl-ha option.
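The two-box A-to-B replication mentioned above boils down to periodic incremental snapshots. A rough sketch follows; the dataset name `tank/data` and hostname `hostb` are placeholders rather than values from this thread, and it assumes a common base snapshot `@prev` already exists on both sides:

```shell
# Host A: snapshot, then send only the delta since the common base snapshot.
zfs snapshot tank/data@now
zfs send -i tank/data@prev tank/data@now | ssh hostb zfs receive -F tank/data

# Rotate snapshot names so the next run has a common base again (both sides).
zfs destroy tank/data@prev
zfs rename tank/data@now tank/data@prev
ssh hostb 'zfs destroy tank/data@prev && zfs rename tank/data@now tank/data@prev'
```

Run from cron every minute or so this gets close to realtime; the `-F` on receive rolls the standby back to the last common snapshot before applying the delta.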
Maybe this weekend.

Sent from my iPhone

> On Jul 1, 2016, at 10:46 AM, Gary Palmer wrote:
> 
>> On Fri, Jul 01, 2016 at 05:11:47PM +0200, Julien Cigar wrote:
>>> On Fri, Jul 01, 2016 at 04:44:24PM +0200, InterNetX - Juergen Gotteswinter wrote:
>>> don't get me wrong, what i try to say is that imho you are trying to
>>> reach something which looks great until something goes wrong.
>> 
>> I agree..! :)
>> 
>>> keep it simple, stupid simple, without much moving parts and avoid
>>> automagic voodoo wherever possible.
>> 
>> to be honest I've always been reluctant about "automatic failover", as I
>> think the problem is always not "how" to do it but "when".. and as Rick
>> said "The simpler/reliable way would be done manually by a sysadmin"..
> 
> I agree. They can verify that the situation needs a failover much better
> than any script. In a previous job I heard of a setup where the cluster
> manager software on the standby node decided that the active node was
> down, so it did a forced takeover of the disks. Since the active node was
> still up, it somehow managed to wipe out the partition tables on the disks
> along with the vxvm configuration (Veritas Volume Manager) inside the
> partitions.
> 
> They were restoring the partition tables and vxvm config from backups.
> From what I remember the backups were printouts, which made it slow going
> as they had to be re-entered by hand.
> The system probably had dozens
> of disks (I don't know, but I know what role it was serving so I can
> guess).
> 
> I'd rather not see that happen ever again.
> 
> (this was 15+ years ago FWIW, but the lesson is still applicable today)
> 
> Gary
> 
>>>> On 01.07.2016 at 16:41, InterNetX - Juergen Gotteswinter wrote:
>>>>> On 01.07.2016 at 16:39, Julien Cigar wrote:
>>>>>> On Fri, Jul 01, 2016 at 03:44:36PM +0200, InterNetX - Juergen Gotteswinter wrote:
>>>>>> 
>>>>>>> On 01.07.2016 at 15:18, Joe Love wrote:
>>>>>>> 
>>>>>>>> On Jul 1, 2016, at 6:09 AM, InterNetX - Juergen Gotteswinter wrote:
>>>>>>>> 
>>>>>>>> On 01.07.2016 at 12:57, Julien Cigar wrote:
>>>>>>>>> On Fri, Jul 01, 2016 at 12:18:39PM +0200, InterNetX - Juergen Gotteswinter wrote:
>>>>>>>>> 
>>>>>>>>> of course I'll test everything properly :) I don't have the hardware yet,
>>>>>>>>> so ATM I'm just looking at all the possible "candidates", and I'm
>>>>>>>>> aware that redundant storage is not that easy to implement ...
>>>>>>>>> 
>>>>>>>>> but what solutions do we have? It's either CARP + ZFS + (HAST|iSCSI),
>>>>>>>>> or zfs send | ssh zfs receive as you suggest (but it's
>>>>>>>>> not realtime), or a distributed FS (which I avoid like the plague..)
>>>>>>>> 
>>>>>>>> zfs send/receive can be nearly realtime.
>>>>>>>> 
>>>>>>>> external jbods with cross-cabled SAS + a commercial cluster solution like
>>>>>>>> RSF-1. anything else is a fragile construction which begs for disaster.
>>>>>>> 
>>>>>>> This sounds similar to the CTL-HA code that went in last year, for which I haven't seen any sort of how-to. The RSF-1 stuff sounds like it has more scaling options, though. Which it probably should, given its commercial operation.
>>>>>> 
>>>>>> rsf is what pacemaker / heartbeat tries to be; judge me for linking
>>>>>> whitepapers, but in this case it's not such evil marketing blah
>>>>>> 
>>>>>> http://www.high-availability.com/wp-content/uploads/2013/01/RSF-1-HA-PLUGIN-ZFS-STORAGE-CLUSTER.pdf
>>>>>> 
>>>>>> @ Julien
>>>>>> 
>>>>>> seems like you take availability really seriously, so i guess you also got
>>>>>> plans for how to handle network problems like dead switches, flaky
>>>>>> cables and so on.
>>>>>> 
>>>>>> like using multiple network cards in the boxes, cross cabling between
>>>>>> the hosts (rs232 and ethernet of course), using proven reliable network
>>>>>> switches in a stacked configuration (for example cisco 3750 stacked). not
>>>>>> to forget redundant power feeds to redundant power supplies.
>>>>> 
>>>>> the only thing that is not redundant (yet?) is our switch, an HP
>>>>> ProCurve 2530-24G .. it's the next step :)
>>>> 
>>>> Arubas, okay, a quick view in the spec sheet does not seem to list
>>>> a stacking option.
>>>> 
>>>> what about power?
>>>> 
>>>>>> if not, i would start again from scratch.
>>>>>>> 
>>>>>>> -Joe
>>>>>>> 
>>>>>>> _______________________________________________
>>>>>>> freebsd-fs@freebsd.org mailing list
>>>>>>> https://lists.freebsd.org/mailman/listinfo/freebsd-fs
>>>>>>> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>> 
>> -- 
>> Julien Cigar
>> Belgian Biodiversity Platform (http://www.biodiversity.be)
>> PGP fingerprint: EEF9 F697 4B68 D275 7B11 6A25 B2BB 3710 A204 23C0
>> No trees were killed in the creation of this message.
>> However, many electrons were terribly inconvenienced.
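For reference, the CARP piece of the CARP + ZFS + (HAST|iSCSI) stack discussed in the thread is only a few lines of FreeBSD configuration. A sketch follows; the interface name, addresses, and password are placeholders, not values from this thread:

```shell
# /boot/loader.conf: load the CARP kernel module at boot
carp_load="YES"

# /etc/rc.conf on the master; the standby uses the same vhid and pass
# but a higher advskew (e.g. 100) so it only wins when the master dies.
ifconfig_em0="inet 192.0.2.11/24"
ifconfig_em0_alias0="inet vhid 1 advskew 0 pass mycarppass alias 192.0.2.10/32"
```

The shared address 192.0.2.10 then follows whichever node is CARP master, which is where a service address for NFS would live; the hard part the thread keeps circling back to is deciding when a takeover is actually safe.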