From owner-freebsd-cluster Thu Dec 12 1:50:24 2002
Delivered-To: freebsd-cluster@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id B770037B401 for ; Thu, 12 Dec 2002 01:50:21 -0800 (PST)
Received: from gate.nentec.de (gate2.nentec.de [194.25.215.66]) by mx1.FreeBSD.org (Postfix) with ESMTP id 5BD6D43EC5 for ; Thu, 12 Dec 2002 01:50:20 -0800 (PST) (envelope-from sporner@nentec.de)
Received: from nenny.nentec.de (root@nenny.nentec.de [153.92.64.1]) by gate.nentec.de (8.11.3/8.9.3) with ESMTP id gBC9oIP05092; Thu, 12 Dec 2002 10:50:18 +0100
Received: from nentec.de (andromeda.nentec.de [153.92.64.34]) by nenny.nentec.de (8.11.3/8.11.3) with ESMTP id gBC9oCt05933; Thu, 12 Dec 2002 10:50:13 +0100
Message-ID: <3DF85BD4.1050200@nentec.de>
Date: Thu, 12 Dec 2002 10:50:12 +0100
From: Andy Sporner
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2a) Gecko/20020910
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: Michael Grant
Cc: freebsd-cluster@FreeBSD.ORG
Subject: Re: sharing files within a cluster
References: <200212112225.gBBMPi411534@splat.grant.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Virus-Scanned: by AMaViS-perl11-milter (http://amavis.org/)
Sender: owner-freebsd-cluster@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.ORG

Michael Grant wrote:

> Anyone know about this "single image linux cluster"? How's that done?
> It seems to me that there would need to be something local to each
> box, minimally /etc, no?

I don't know much about the Linux single-image stuff. I see dependencies on RPCs and I get scared... In a nutshell, what I am looking for in a single image is very NUMA-like, with the exception that instead of being a single image (one OS) it is a cooperative image (many kernels, one execution space).
The only thing that is "single image" is the VM that each machine exports to the single image. I think it is architecturally possible to do this while minimizing the amount of work the master node has to do; i.e., when a process starts on a node, the PID is a multiple of the node number, and the same goes for file handles, sockets, and such.

> Andy, I agree with you that having a common process space and the
> ability to migrate processes across machines would be a big win. If
> that could be done on a freebsd cluster along with a single image,
> that would definitely be a reason to use a freebsd cluster over a
> linux cluster.

Thanks! I agree as well.

> Realistically, how close are you to this?

It's really a matter of motivation ;-) If I am having fun, the time is short; if I am not, things tend to take an eternity. I have my main fun at work developing high-speed networking hardware, and I have little time or tolerance for strange people--especially those who don't reply to email when I could normally expect a reply. I don't necessarily mean our own superstars. Lately I have come to the conclusion that organized computing has no future because of the influx of newbies (which is not bad in itself--it's just that if you are new on a list, the veterans regard you as a newbie, and this is a barrier to normal operation and cooperation). I make some offers only once, and if there are no takers I don't regard it as a worthwhile thing to do. I had intended to look into porting BPROC at some point, but it is now off the radar screen (though not far enough off to learn how they swap processes ;-)).

With that little flame out of the way ;-) here is the rough list of things that must be accomplished to realize the goal. Naturally, the more people working on it, the faster it goes, because I only spend what spare time I have on this.

Here is what has been done:

1. Cluster configuration and monitor (build 114) for failover. (Supports distributed process table views;
that is, there is NO master process table.)

Here is what must be done:

1. Front-end load balancing (later this becomes the gateway for network-based processes that have to move around the cluster with their sockets). Current plan: Dec 31 (will initially use IPFILTER).

2. The message infrastructure. I have so far studied BPROC and DRBD. I think it would be a week's worth of work to make a messaging system that takes both sets of requirements into account. I would like to use the embedded TCP stack that I wrote for my current project, but that's intellectual property :-( It's just the kind of thing we need (event driven, with lots of hooks for callouts), and access to the inside of the stack is pretty good. Current plan: Jan 15.

3. Process swapping engine. I would like to extend the page-swapping architecture (thank god FreeBSD has made this very modular!). I regard this as a better mechanism than what I have seen in, for instance, BPROC, and that way shared memory can be supported too. This is the most difficult part and hard to put a time estimate on. It would be very helpful to team up with people who know the VM architecture well. It could be as early as March or April for this one.

4. Much testing! That, I think, is a distributed affair, so it can perhaps be very short.

Assumptions:

1. Some sort of shared filesystem is used by the machines that are part of the domain, since this would be a distributed image.

2. Network-based (socket) traffic must pass through a front-end device to be directed to the node that owns the process. By nature of the Hi-AV failover stuff, this would NOT be a single point of failure.

Executive summary: probably early summer. But I think it would take that long anyway to port the other stuff that is also available.

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-cluster" in the body of the message