From: InterNetX - Juergen Gotteswinter
Reply-To: jg@internetx.com
To: Julien Cigar, freebsd-fs@freebsd.org
Subject: Re: HAST + ZFS + NFS + CARP
Date: Thu, 30 Jun 2016 17:14:08 +0200

On 30.06.2016 at 16:45, Julien Cigar wrote:
> Hello,
>
> I'm still in the process of setting up redundant low-cost storage for
> our (small, ~30 people) team here.
>
> I read quite a lot of articles/documentation/etc. and I plan to use HAST
> with ZFS for the storage, CARP for the failover and the "good old NFS"
> to mount the shares on the clients.
>
> The hardware is 2xHP Proliant DL20 boxes with 2 dedicated disks for the
> shared storage.
>
> Assuming the following configuration:
> - MASTER is the active node and BACKUP is the standby node.
> - two disks in each machine: ada0 and ada1.
> - two interfaces in each machine: em0 and em1
> - em0 is the primary interface (with CARP setup)
> - em1 is dedicated to the HAST traffic (crossover cable)
> - FreeBSD is properly installed in each machine.
> - a HAST resource "disk0" for ada0p2.
> - a HAST resource "disk1" for ada1p2.
> - a pool is created on MASTER with
>   zpool create zhast mirror /dev/hast/disk0 /dev/hast/disk1
>
> A couple of questions I am still wondering about:
> - If a disk dies on the MASTER I guess that zpool will not see it and
>   will transparently use the one on BACKUP through the HAST resource..

that's right, as long as writes on $anything have been successful, hast
is happy and won't start whining

>   is it a problem?

imho yes, at least from a management point of view

>   could this lead to some corruption?

probably, i never heard of anyone who has used that for a long time in
production

>   At this stage the
>   common sense would be to replace the disk quickly, but imagine the
>   worst case scenario where ada1 on MASTER dies, zpool will not see it
>   and will transparently use the one from the BACKUP node (through the
>   "disk1" HAST resource), later ada0 on MASTER dies, zpool will not
>   see it and will transparently use the one from the BACKUP node
>   (through the "disk0" HAST resource). At this point on MASTER the two
>   disks are broken but the pool is still considered healthy ... What if
>   after that we unplug the em0 network cable on BACKUP? Storage is
>   down..
> - Under heavy I/O the MASTER box suddenly dies (for some reason),
>   thanks to CARP the BACKUP node will switch from standby -> active and
>   execute the failover script which does some "hastctl role primary" for
>   the resources and a zpool import. I wondered if there are any
>   situations where the pool couldn't be imported (= data corruption)?
>   For example what if the pool hasn't been exported on the MASTER before
>   it dies?
> - Is it a problem if the NFS daemons are started at boot on the standby
>   node, or should they only be started in the failover script? What
>   about stale files and active connections on the clients?

sometimes stale mounts recover, sometimes not, sometimes clients even
need reboots

> - A catastrophic power failure occurs and MASTER and BACKUP are suddenly
>   powered down. Later the power returns, is it possible that some
>   problems occur (split-brain scenario ?) regarding the order in which the

sure, you need an exact procedure to recover

>   two machines boot up?

best practice should be to keep everything down after boot

> - Other things I have not thought of?
>
> Thanks!
> Julien
>

imho: leave hast where it is, go for zfs replication. it will save your
butt sooner or later if you avoid this fragile combination
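
To make the "zfs replication" suggestion a bit more concrete, here is a rough sketch of what one replication round could look like. It only builds the command lines and does not run anything. Assumptions that are not from the thread: the pool on each box is a plain local mirror (no HAST underneath) that keeps the name zhast, the standby is reachable over ssh as "backup", and the snapshot names and the helper replication_commands are made up for illustration.

```python
import shlex


def replication_commands(dataset, prev_snap, new_snap, remote_host):
    """Build the shell commands for one snapshot-based replication round.

    Nothing is executed here; the two returned strings are meant to be
    reviewed and then run (e.g. from cron) on the active node.
    All arguments are illustrative; adjust to the real pool and host names.
    """
    # Recursive snapshot of the dataset and everything below it.
    snapshot = ["zfs", "snapshot", "-r", f"{dataset}@{new_snap}"]

    if prev_snap is None:
        # First round: send a full replication stream.
        send = ["zfs", "send", "-R", f"{dataset}@{new_snap}"]
    else:
        # Later rounds: send only the changes since the previous snapshot.
        send = ["zfs", "send", "-R", "-i",
                f"{dataset}@{prev_snap}", f"{dataset}@{new_snap}"]

    # On the standby, roll back to the last received snapshot (-F)
    # and apply the incoming stream.
    receive = ["ssh", remote_host,
               shlex.join(["zfs", "receive", "-F", dataset])]

    return [shlex.join(snapshot),
            shlex.join(send) + " | " + shlex.join(receive)]


if __name__ == "__main__":
    for cmd in replication_commands("zhast", "snap1", "snap2", "backup"):
        print(cmd)
```

The trade-off compared to HAST is that the standby is always one replication interval behind, so anything written since the last snapshot is lost on failover. In exchange each box has an ordinary pool that ZFS can see and report on directly.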