From owner-freebsd-isp@FreeBSD.ORG Mon Sep 26 21:39:36 2005
Date: Mon, 26 Sep 2005 22:43:09 +0100
From: Brian Candler <b.candler@pobox.com>
To: Isaac Levy
Cc: freebsd-isp@freebsd.org, freebsd-cluster@freebsd.org
Subject: Re: Options for synchronising filesystems
Message-ID: <20050926214309.GA766@uk.tiscali.com>
References: <20050924141025.GA1236@uk.tiscali.com>
In-Reply-To:
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
User-Agent: Mutt/1.4.2.1i
List-Id: Internet Services Providers
X-List-Received-Date: Mon, 26 Sep 2005 21:39:36 -0000

On Mon, Sep 26, 2005 at 02:16:31PM -0400, Isaac Levy wrote:
> I just wanted to throw out some quick thoughts on a totally different
> approach which nobody has really explored in this thread

Geom (gmirror plus ggated/ggatec) was what I suggested for
syncing two NFS servers (my option 2) or for direct synchronisation of
the clients' filesystems to the servers (my option 4). The problem occurs
when a client actually *mounts* and uses the mirrored copy, rather than
just keeping a mirrored copy for resilience.

> Geom Gate:
> http://kerneltrap.org/news/freebsd?from=20
>
> Network device-level client/server disk mapping tool.
> (VERY IMPORTANT COMPONENT, it's reportedly faster, and more stable
> than NFS has ever been- so people have immediately and happily
> deployed it in production systems!)

NFS and geom gate are two different things, so you can't really compare
them directly. NFS shares files; geom gate shares a block-level device.
With NFS you can have one server and multiple clients, and the clients
can access the filesystem read-write. With geom gate, you just have
remote access to a disk partition, and essentially can only do what you
could do with a local block device.

Incidentally, NFS has been *hugely* dependable for me in production
environments. However, I've always used expensive and beefy NFS servers
(Netapp), whilst FreeBSD is just the client.

> I know of one web-shop in Canada, which is running 2 machines for
> every virtual cluster, in the following configuration:
>
> 2 servers,
> 4 SATA drives per box,
> quad copper/ethernet gigabit nic on each box
>
> each drive is mirrored using gmirror, over each of the gigabit
> ethernet nics
> each box is running Vinum Raid5 across the 4 mirrored drives
>
> The drives are then sliced appropriately, and server resources are
> distributed across the boxes- with various slices mounted on each box.
> The folks I speak of simply have a suite of failover shell scripts
> prepared, in the event of a machine experiencing total hardware failure.

Right. But unless I'm mistaken, the remote mirrors are just backup
copies of the data. Those remote mirrors are not actually *mounted* as
filesystems. I think you're talking about a master/slave failover
scenario.
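For concreteness, a rough sketch of that kind of gmirror-over-ggate
arrangement might look like the following; the hostnames, device names,
and mount point are illustrative, not taken from the actual deployment:

```shell
# On the remote (slave) box: export a raw partition over the network.
# ggated(8) reads its export list from /etc/gg.exports by default;
# "master.example.net" and /dev/ad4s1d are made-up names.
echo "master.example.net RW /dev/ad4s1d" > /etc/gg.exports
ggated

# On the local (master) box: attach the remote partition as a local
# block device, then mirror a local partition against it.
ggatec create -o rw slave.example.net /dev/ad4s1d   # appears as e.g. /dev/ggate0
gmirror label -v -b round-robin gm0 /dev/ad4s1d /dev/ggate0
newfs /dev/mirror/gm0
mount /dev/mirror/gm0 /data
```

Note that only the master ever mounts /dev/mirror/gm0; on failure of the
master, the failover scripts would fsck and mount the surviving local
copy on the slave.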
With careful arrangement, machine 1 can be master for dataset A and
slave for dataset B, while machine 2 is slave for A and master for B, so
you're not wasting your second machine. If machine 1 fails, machine 2
can take over both datasets. That's fine.

However, what I need is for dataset A to be generated on machine 1 and
identical copies to be available on machines 2, 3, 4, 5...9. Not just
*stored* there, but actually *used* there, as live read-only copies. So
if machine 1 makes a change to the dataset, all the other machines
notice the change properly and start using it immediately.

From what I've heard, I can't use gmirror from machine 1 to machines
2-9, because you can't mount a filesystem read-only while some other
machine magically updates the blocks from under its nose. The filesystem
gets confused because its local caches of blocks and inodes become out
of date when the data in the block device changes.

> Regardless, it's worth tapping into the GEOM dialogues

GEOM is definitely cool, and a strong selling point for moving from 4.x
to 5.x.

Regards,

Brian.