From: Gary Palmer <gpalmer@freebsd.org>
To: Julien Cigar
Cc: jg@internetx.com, freebsd-fs@freebsd.org
Date: Fri, 1 Jul 2016 16:46:58 +0100
Subject: Re: HAST + ZFS + NFS + CARP
Message-ID: <20160701154658.GA70150@in-addr.com>
In-Reply-To: <20160701151146.GD41276@mordor.lan>

On Fri, Jul 01, 2016 at 05:11:47PM +0200, Julien Cigar wrote:
> On Fri, Jul 01, 2016 at 04:44:24PM +0200, InterNetX - Juergen Gotteswinter wrote:
> > don't get me wrong, what I try to say is that imho you are trying to
> > reach something which looks great until something goes wrong.
>
> I agree..! :)
>
> > keep it simple, stupid simple, without many moving parts, and avoid
> > automagic voodoo wherever possible.
>
> To be honest I've always been reluctant about "automatic failover", as I
> think the problem is always not "how" to do it but "when"... and as Rick
> said, "The simpler/reliable way would be done manually by a sysadmin".

I agree.  A sysadmin can verify that the situation really needs a failover
much better than any script can.
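Just to illustrate the kind of thing I have in mind for the setup being
discussed here (ZFS pool + CARP + NFS): a small script that a person runs
by hand, where nothing happens until the operator has answered the
question a heartbeat daemon cannot answer.  This is only a sketch, not
something I have tested; the pool name "tank", the interface "em0",
vhid 1 and the peer name "filer1" are all made up for the example.

  #!/bin/sh
  # manual_failover.sh -- sketch only; run by a human on the standby node,
  # never from cron or a cluster manager.
  # Hypothetical names: ZFS pool "tank", CARP vhid 1 on em0, peer "filer1".

  echo "About to import pool 'tank' and take over the CARP VIP from filer1."
  echo "Have you confirmed (console/IPMI) that filer1 is really down and can"
  echo "no longer write to the shared disks?  Type 'yes' to continue."
  read answer
  if [ "$answer" != "yes" ]; then
          echo "Aborting, nothing touched."
          exit 1
  fi

  # -f is needed because the pool was last imported on the other head;
  # that is exactly why a human should be the one typing this.
  zpool import -f tank || exit 1

  # Resume NFS service and claim the service address.
  service nfsd onestart
  ifconfig em0 vhid 1 state master

The exact commands matter less than the fact that no takeover of the
disks happens until a person has confirmed the other node is really dead.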
In a previous job I heard of a setup where the cluster manager software on
the standby node decided that the active node was down, so it did a forced
takeover of the disks.  Since the active node was in fact still up, it
somehow managed to wipe out the partition tables on the disks, along with
the vxvm (Veritas Volume Manager) configuration inside the partitions.
They had to restore the partition tables and the vxvm config from backups.
From what I remember the backups were printouts, which made it slow going
as they had to be re-entered by hand.

The system probably had dozens of disks (I don't know for sure, but I know
what role it was serving, so I can guess).  I'd rather not see that happen
ever again (this was 15+ years ago FWIW, but the lesson is still applicable
today).

Gary

> > > On 01.07.2016 at 16:41, InterNetX - Juergen Gotteswinter wrote:
> > > On 01.07.2016 at 16:39, Julien Cigar wrote:
> > >> On Fri, Jul 01, 2016 at 03:44:36PM +0200, InterNetX - Juergen Gotteswinter wrote:
> > >>>
> > >>> On 01.07.2016 at 15:18, Joe Love wrote:
> > >>>>
> > >>>>> On Jul 1, 2016, at 6:09 AM, InterNetX - Juergen Gotteswinter wrote:
> > >>>>>
> > >>>>> On 01.07.2016 at 12:57, Julien Cigar wrote:
> > >>>>>> On Fri, Jul 01, 2016 at 12:18:39PM +0200, InterNetX - Juergen Gotteswinter wrote:
> > >>>>>>
> > >>>>>> of course I'll test everything properly :)  I don't have the hardware
> > >>>>>> yet, so ATM I'm just looking at all the possible "candidates", and I'm
> > >>>>>> aware that redundant storage is not that easy to implement ...
> > >>>>>>
> > >>>>>> but what solutions do we have?  It's either CARP + ZFS + (HAST|iSCSI),
> > >>>>>> or zfs send|ssh zfs receive as you suggest (but it's not realtime),
> > >>>>>> or a distributed FS (which I avoid like the plague..)
> > >>>>>
> > >>>>> zfs send/receive can be nearly realtime.
> > >>>>>
> > >>>>> external JBODs with cross-cabled SAS + a commercial cluster solution
> > >>>>> like RSF-1.  anything else is a fragile construction which begs for
> > >>>>> disaster.
> > >>>>
> > >>>> This sounds similar to the CTL-HA code that went in last year, for which
> > >>>> I haven't seen any sort of how-to.  The RSF-1 stuff sounds like it has
> > >>>> more scaling options, though.  Which it probably should, given that it
> > >>>> is a commercial product.
> > >>>
> > >>> RSF-1 is what pacemaker / heartbeat tries to be.  judge me for linking
> > >>> whitepapers, but in this case it's not such evil marketing blah:
> > >>>
> > >>> http://www.high-availability.com/wp-content/uploads/2013/01/RSF-1-HA-PLUGIN-ZFS-STORAGE-CLUSTER.pdf
> > >>>
> > >>> @ Julien
> > >>>
> > >>> seems like you take availability really seriously, so I guess you also
> > >>> have plans for how to handle network problems like dead switches, flaky
> > >>> cables and so on.
> > >>>
> > >>> like using multiple network cards in the boxes, cross-cabling between
> > >>> the hosts (RS-232 and Ethernet of course), and using proven, reliable
> > >>> network switches in a stacked configuration (for example Cisco 3750s,
> > >>> stacked).  not to forget redundant power feeds to redundant power
> > >>> supplies.
> > >>
> > >> the only thing that is not redundant (yet?) is our switch (an HP
> > >> ProCurve 2530-24G) .. it's the next step :)
> > >
> > > Arubas, okay, a quick look at the spec sheet does not seem to list a
> > > stacking option.
> > >
> > > what about power?
> > >
> > >>>
> > >>> if not, I would start again from scratch.
> > >>>>
> > >>>> -Joe
> >
> --
> Julien Cigar
> Belgian Biodiversity Platform (http://www.biodiversity.be)
> PGP fingerprint: EEF9 F697 4B68 D275 7B11 6A25 B2BB 3710 A204 23C0
> No trees were killed in the creation of this message.
> However, many electrons were terribly inconvenienced.