From owner-freebsd-fs@FreeBSD.ORG Wed Jun 24 22:35:26 2009
Date: Wed, 24 Jun 2009 15:35:25 -0700
From: Freddie Cash <fjwcash@gmail.com>
To: freebsd-cluster@freebsd.org
Cc: freebsd-fs@freebsd.org
Subject: Fail-over SAN setup: ZFS, NFS, and ...?

[Not exactly sure which ML this belongs on, as it's related to both
clustering and filesystems.  If there's a better spot, let me know and
I'll update the CC:/reply-to.]

We're in the planning stages of building a multi-site, fail-over SAN
setup which will be used to provide redundant storage for a virtual
machine setup.  The setup will be like so:

    [Server Room 1]       .     [Server Room 2]
    -----------------     .     -------------------
                          .
    [storage server]      .     [storage server]
           |              .            |
           |              .            |
    [storage switch]      .     [storage switch]
           \------------fibre----------/
                          .            |
                          .            |
                          .   [storage aggregator]
                          .            |
                          .            |
                          .     /---[switch]---\
                          .     |      |       |
                          .     |  [VM box]    |
                          .     |      |       |
                          . [VM box]   |       |
                          .     |      |   [VM box]
                          .     |      |       |
                          .    [network switch]
                          .            |
                          .            |
                          .        [internet]

Server Room 1 and Server Room 2 are on opposite ends of town (about 3
km apart), with a dedicated, direct fibre link between them.  There
will be a set of VM boxes at each site that use the shared storage and
act as fail-over for each other.  In theory, only one server room
would ever be active at a time, although we may end up migrating VMs
between the two sites for maintenance purposes.

We've got the storage server side of things figured out (5U rackmounts
with 24 drive bays, running FreeBSD 7.x and ZFS).  We've got the
storage switches picked out (HP ProCurve 2800 or 2900, depending on
whether we go with 1 GbE or 10 GbE fibre links between them).

We're stuck on the storage aggregator.  For a single-aggregator setup,
we'd use FreeBSD 7.x with ZFS.  The storage servers would each export
a single zvol using iSCSI.
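For concreteness, here's roughly how we picture the storage-server
half.  This is just a sketch: the pool layout, names, and sizes are
placeholders, and since FreeBSD 7.x has no iSCSI target in the base
system, the target daemon would be something from ports (net/istgt or
net/iscsi-target, say):

    # One pool across the 24 bays (raidz2 shown purely as an example):
    zpool create tank raidz2 da0 da1 da2 da3 da4 da5

    # Carve out the single zvol that gets exported:
    zfs create -V 2T tank/export0

    # The zvol appears as /dev/zvol/tank/export0; the iSCSI target
    # daemon then serves it as a LUN to the aggregator.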
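And a matching sketch for the aggregator half, which the next
paragraph walks through.  Here we assume the initiator (iscontrol(8)
in the base system) attaches the remote zvols as da1/da2, and a later
pair as da3/da4; actual device names will of course vary:

    # Log in to each storage server's target (nicknames defined
    # in /etc/iscsi.conf):
    iscontrol -n storage1
    iscontrol -n storage2

    # Mirror the two remote zvols into one pool:
    zpool create storage mirror da1 da2

    # Expansion: two more storage servers, one more mirrored vdev:
    zpool add storage mirror da3 da4

    # Share with the VM boxes over NFS; sharenfs takes exports(5)
    # flags on FreeBSD, and these are just an illustration:
    zfs create storage/vm
    zfs set sharenfs="-maproot=root -network 10.0.0.0 -mask 255.255.255.0" storage/vm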
The storage aggregator would use ZFS to create a pool from a mirrored
vdev.  To expand the pool, we put in two more storage servers and add
another mirrored vdev to the pool.  No biggie.  The storage aggregator
then uses NFS and/or iSCSI to make storage available to the VM boxes.
This is the easy part.

However, we'd like to remove the single point of failure that the
storage aggregator represents, and have a duplicate of it running at
Server Room 1.  Right now, we can do this using cold spares that rsync
from the live box every X hours/days.  We'd like this to be a live,
fail-over spare, though.  And this is where we're stuck.

What can we use to do this?  CARP?  Heartbeat?  ggate?  Should we look
at Linux with DRBD, Linux-HA, cluster-nfs, or similar?  Perhaps Red
Hat Cluster Suite?  (We'd prefer not to, as storage management then
becomes a nightmare again, requiring mdadm, LVM, and more.)  Would a
cluster filesystem be needed?  AFS or similar?

We have next to no experience with high-availability, fail-over
clustering.  Any pointers to things to read online, or tips, or even
"don't do that, you're insane" comments would be greatly appreciated.
:)  Thanks.

--
Freddie Cash
fjwcash@gmail.com