From owner-freebsd-fs@FreeBSD.ORG Wed Jun 24 22:35:26 2009
Date: Wed, 24 Jun 2009 15:35:25 -0700
From: Freddie Cash <fjwcash@gmail.com>
To: freebsd-cluster@freebsd.org
Cc: freebsd-fs@freebsd.org
Subject: Fail-over SAN setup: ZFS, NFS, and ...?

[Not exactly sure which ML this belongs on, as it's related to both
clustering and filesystems.  If there's a better spot, let me know and
I'll update the CC:/reply-to.]

We're in the planning stages of building a multi-site, fail-over SAN
setup which will be used to provide redundant storage for a virtual
machine setup.  The setup will be like so:

    [Server Room 1]       .     [Server Room 2]
    -----------------     .     -------------------
                          .
    [storage server]      .     [storage server]
           |              .            |
           |              .            |
    [storage switch]      .     [storage switch]
           \------------fibre----------/
                          .            |
                          .            |
                          .   [storage aggregator]
                          .            |
                          .            |
                          .     /---[switch]---\
                          .     |      |       |
                          .     |  [VM box]    |
                          .     |      |       |
                          . [VM box]   |       |
                          .     |      |   [VM box]
                          .     |      |       |
                          .    [network switch]
                          .            |
                          .            |
                          .        [internet]

Server Room 1 and Server Room 2 are on opposite ends of town (about 3
km apart), with a dedicated, direct fibre link between them.  There
will be a set of VM boxes at each site that use the shared storage and
act as fail-over for each other.  In theory, only one server room
would ever be active at a time, although we may end up migrating VMs
between the two sites for maintenance purposes.

We've got the storage server side of things figured out (5U rackmounts
with 24 drive bays, running FreeBSD 7.x and ZFS).  We've got the
storage switches picked out (HP ProCurve 2800 or 2900, depending on
whether we go with 1 GbE or 10 GbE fibre links between them).

We're stuck on the storage aggregator.  For a single-aggregator setup,
we'd use FreeBSD 7.x with ZFS.  The storage servers would each export
a single zvol using iSCSI.
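For concreteness, here's roughly how we picture the storage-server
half.  This is just a sketch: the pool layout, names, and sizes are
placeholders, and since FreeBSD 7.x has no iSCSI target in the base
system, the target daemon would be something from ports (net/istgt or
net/iscsi-target, say):

    # One pool across the 24 bays (raidz2 shown purely as an example):
    zpool create tank raidz2 da0 da1 da2 da3 da4 da5

    # Carve out the single zvol that gets exported:
    zfs create -V 2T tank/export0

    # The zvol appears as /dev/zvol/tank/export0; the iSCSI target
    # daemon then serves it as a LUN to the aggregator.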
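And a matching sketch for the aggregator half, which the next
paragraph walks through.  Here we assume the initiator (iscontrol(8)
in the base system) attaches the remote zvols as da1/da2, and a later
pair as da3/da4; actual device names will of course vary:

    # Log in to each storage server's target (nicknames defined
    # in /etc/iscsi.conf):
    iscontrol -n storage1
    iscontrol -n storage2

    # Mirror the two remote zvols into one pool:
    zpool create storage mirror da1 da2

    # Expansion: two more storage servers, one more mirrored vdev:
    zpool add storage mirror da3 da4

    # Share with the VM boxes over NFS; sharenfs takes exports(5)
    # flags on FreeBSD, and these are just an illustration:
    zfs create storage/vm
    zfs set sharenfs="-maproot=root -network 10.0.0.0 -mask 255.255.255.0" storage/vm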
The storage aggregator would use ZFS to create a pool from a mirrored
vdev.  To expand the pool, we put in two more storage servers and add
another mirrored vdev to the pool.  No biggie.  The storage aggregator
then uses NFS and/or iSCSI to make storage available to the VM boxes.
This is the easy part.

However, we'd like to remove the single point of failure that the
storage aggregator represents, and have a duplicate of it running at
Server Room 1.  Right now, we can do this using cold spares that rsync
from the live box every X hours/days.  We'd like this to be a live,
fail-over spare, though.  And this is where we're stuck.

What can we use to do this?  CARP?  Heartbeat?  ggate?  Should we look
at Linux with DRBD, Linux-HA, cluster-nfs, or similar?  Perhaps Red
Hat Cluster Suite?  (We'd prefer not to, as storage management then
becomes a nightmare again, requiring mdadm, LVM, and more.)  Would a
cluster filesystem be needed?  AFS or similar?

We have next to no experience with high-availability, fail-over
clustering.  Any pointers to things to read online, or tips, or even
"don't do that, you're insane" comments would be greatly appreciated.
:)  Thanks.

--
Freddie Cash
fjwcash@gmail.com