From: Michal <michal@ionic.co.uk>
Date: Wed, 24 Mar 2010 23:45:25 +0000
To: freebsd-stable@freebsd.org
Subject: Re: Multi node storage, ZFS

On 24/03/2010 22:19, Ivan Voras wrote:
>
> For what it's worth - I think this is a good idea! iSCSI and ZFS make it
> extraordinarily flexible to do this. You can have a RAIS - a redundant
> array of inexpensive servers :)
>
> For example: each server box hosts 8-12 drives - use a hardware
> controller with RAID6 and a BBU to create a single volume (if FreeBSD
> booting issues allow, but that can be worked around). Export this volume
> via iSCSI. Repeat for the rest of the servers. Then, on the client,
> create a RAIDZ, or, if you trust your setup that much, a straight striped
> ZFS volume. If you do it the RAIDZ way, one of your storage servers can
> fail completely.
>
> As you need more space, add more servers in batches of three (if you did
> RAIDZ, else the number doesn't matter) and add them to the client as
> usual.
>
> The "client" in this case can be a file server, and you can achieve
> failover between several of those by using e.g. carp, heartbeat, etc. -
> if the master node fails, some other one can reconstitute the ZFS pool
> and make it available.
>
> But you need very fast links between the nodes, and I wouldn't use
> something like this without extensively testing the failure modes.
>

I do as well :D The thing is, I see it two ways: I worked for a huge
online betting company where we had the money for HP MSAs and big,
expensive SANs; then there are a lot of SMBs with nowhere near that
budget but the same problem of lots of data and the need for backend
storage for databases.
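To make Ivan's scheme concrete, here is a rough sketch of the client
side once each storage box is exporting its RAID6 volume over iSCSI
(the pool name and da(4) device numbers are made up for illustration;
the target side would be something like net/istgt):

    # Each iSCSI LUN the client logs in to shows up as an ordinary
    # da(4) disk, say da1..da3. Build a raidz across them - any one
    # whole storage server can then fail:
    zpool create tank raidz da1 da2 da3

    # Growing later: add the next batch of three servers as a
    # second raidz vdev:
    zpool add tank raidz da4 da5 da6

    # Failover: if the master file server dies, a standby node that
    # can log in to the same targets forces an import of the pool:
    zpool import -f tank

The "zpool import -f" is what the "reconstitute" step amounts to: the
pool metadata lives on the iSCSI disks themselves, so any node that can
see all of them can take over.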
It's all well and good having one ZFS server, but it's fragile in the
sense of having no redundancy. Then we have one ZFS server and a second
with DRBD, but that's a waste of money... think 12 TB, and you need to
pay for another 12 TB box just for redundancy, and you are still looking
at one server. I am thinking of a cheap solution, but one that has I/O
throughput and redundancy and is easy to manage and expand across
multiple nodes.

A "NAS"-based solution - one based on single NAS devices with single
targets (//nas1, //nas2, etc.) - is OK, but has many problems. A
"SAN"-based solution can overcome these; it does add cost, but the
amount can be minimised.

I'll work on it over the next few days and get some notes typed up, as
well as run some performance numbers. I'll try to do it modularly -
adding more RAM, then sorting out the ZIL and cache - comparing how
they affect performance.
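For the ZIL and cache part, the sketch below is the kind of thing I
have in mind (the device names are hypothetical - ada1 as a small, fast
SSD for the separate log, ada2 as a bigger SSD for L2ARC):

    # Separate ZFS intent log: moves synchronous writes off the
    # data disks (worth mirroring on anything production):
    zpool add tank log ada1

    # L2ARC read cache: spills the ARC over onto the SSD:
    zpool add tank cache ada2

    # Check the layout and watch how the cache behaves under load:
    zpool status tank
    zpool iostat -v tank 5

Testing with and without each of those, plus different amounts of RAM,
should show where the bottlenecks actually are.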