From owner-freebsd-stable@FreeBSD.ORG Wed Apr 6 12:55:52 2011
Date: Wed, 06 Apr 2011 13:55:50 +0100
From: Pete French <petefrench@ingresso.co.uk>
To: daniel@digsys.bg, freebsd-stable@freebsd.org
Subject: Re: ZFS HAST config preference
In-Reply-To: <4D9B2EA2.9020700@digsys.bg>

> My original idea was to set up blades so that they run HAST on pairs of
> disks, and run ZFS in a number of mirror vdevs on top of HAST. The ZFS
> pool will exist only on the master HAST node. Let's call this setup1.

This is exactly how I run things. Personally I think it is the best
solution, as ZFS then knows about the mirroring of the drives, which is
always a good thing, and you get a ZFS filesystem on top. I also enable
compression to reduce the bandwidth to the drives, as that reduces the
data flowing across the network. For what I am doing (running mysql on
top) this is actually faster for both selects and inserts - test with
your own application first though.

> Or, I could use ZFS volumes and run HAST on top of these. This means,
> that on each blade, I will have a local ZFS pool. Let's call this setup2.

...you would need to put a filesystem on top of the HAST device though -
what would that be?

> While setup1 is most straightforward, it has some drawbacks:
> - disks handled by HAST need to be either identical or have matching
> partitions created;

This is true. I run identical machines as primary and secondary.

> - the 'spare' blade would do nothing, as its disk subsystem will be
> gone as long as it is HAST slave. As the blades are quite powerful (4x8
> core AMD) that would be wasteful, at least in the beginning.

If you are keeping a machine as a hot spare then that's something you
just have to live with, in my opinion. I've run this way for several
years - before HAST and ZFS we used gmirror and UFS to do the same
thing. It does work very nicely, but you do end up with a machine idle.

> HAST replication speed should not be an issue, there is 10Gbit network
> between the blade servers.

I actually put separate ethernet interfaces in for each of the HAST
drives. So for two drives there are two spare ethernet ports on each
machine, with a cable between them, dedicated to just that drive. Those
are gigabit cards though, not 10 gig.

> Has anyone already setup something similar? What was the experience?

Very good actually.
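
To give a rough idea of what setup1 looks like, here is a minimal
sketch of the sort of hast.conf I mean - the node names, devices and
addresses here are invented for illustration, not taken from my real
config. Each resource gets its own 'remote' address, which is how each
drive's traffic goes down its own dedicated cable:

  # /etc/hast.conf - sketch only, names/devices/addresses are made up
  resource disk0 {
      on nodeA {
          local /dev/da0
          remote 172.16.0.2
      }
      on nodeB {
          local /dev/da0
          remote 172.16.0.1
      }
  }

  resource disk1 {
      on nodeA {
          local /dev/da1
          remote 172.16.1.2
      }
      on nodeB {
          local /dev/da1
          remote 172.16.1.1
      }
  }

On whichever node is currently primary, the pool is then just an
ordinary ZFS mirror built from the /dev/hast providers, with
compression turned on (pool name made up):

  hastctl role primary disk0
  hastctl role primary disk1
  zpool create tank mirror /dev/hast/disk0 /dev/hast/disk1
  zfs set compression=on tank

That way ZFS handles the mirroring across the two drives locally, and
HAST handles getting each drive replicated over to the other blade.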

One thing I would say is to write and test a set of scripts to do the
failover - it avoids shooting yourself in the foot when trying to do
the commands by hand (which is rather easy to do). I have a script to
make the primary into a secondary, and one to do the reverse. The first
script waits until the HAST data is all flushed before changing role,
and makes sure the services are stopped and the pool exported before
ripping the discs out from underneath. The script also handles removing
a shared IP address from the interface.

-pete.
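
P.S. In case it is useful, the "primary to secondary" script is roughly
the shape below. This is a sketch rather than my actual script - the
pool name, resource names, service name, interface and shared address
are all made up, and the way the dirty count is pulled out of the
hastctl output may need adjusting for your version - but it shows the
order of the steps:

  #!/bin/sh
  # Sketch of demoting this node to HAST secondary.
  # All names below are examples, not a real config.
  POOL=tank
  RESOURCES="disk0 disk1"
  SHARED_IP=192.168.0.10
  IFACE=bge0

  # Stop whatever is using the pool first (mysql in my case).
  service mysql-server stop

  # Take the shared service address off the interface.
  ifconfig ${IFACE} inet ${SHARED_IP} -alias

  # Export the pool so nothing is touching the hast devices.
  zpool export ${POOL}

  # Wait for HAST to flush all dirty data to the peer before
  # switching role (hastctl output format may differ on your version).
  for res in ${RESOURCES}; do
      while [ "$(hastctl status ${res} | awk '/dirty:/ {print $2}')" != "0" ]; do
          sleep 1
      done
  done

  # Now it is safe to hand the discs over.
  for res in ${RESOURCES}; do
      hastctl role secondary ${res}
  done

The "secondary to primary" script is essentially the same steps in
reverse: set the roles to primary, import the pool, bring up the shared
address and start the services.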