Date:      Sun, 28 Nov 1999 16:20:02 +0100
From:      Eivind Eklund <eivind@FreeBSD.ORG>
To:        Jeroen Ruigrok/Asmodai <asmodai@wxs.nl>
Cc:        cluster@FreeBSD.ORG
Subject:   Re: Homepage
Message-ID:  <19991128162002.F53832@bitbox.follo.net>
In-Reply-To: <19991126225814.I35499@daemon.ninth-circle.org>; from asmodai@wxs.nl on Fri, Nov 26, 1999 at 10:58:14PM +0100
References:  <19991126225814.I35499@daemon.ninth-circle.org>

On Fri, Nov 26, 1999 at 10:58:14PM +0100, Jeroen Ruigrok/Asmodai wrote:
> Hi,
> 
> just a few people on the list yet, but I just wanted to ask for material
> to put on the project's webpages which I am going to maintain.
> 
> I cc:'d two of my co-workers, feel free to drop them from future
> replies.  Patrick and Guillaume, the list is freebsd-cluster,
> subscribable through majordomo@freebsd.org.
> 
> The current URL is: http://lucifer.bart.nl/~asmodai/projects/Hydra
> 
> Current project name has been set to Hydra - The *BSD Cluster Project
> I hope you guys like the subtitle which you can see on the page. ;)

I am of two minds about it.  On the one hand, I think it is extremely
cool.  On the other, I think it doesn't really represent most of what
we'll do.

Another possibility (inspired by yours):
We restart processes faster than you can kill hardware!

> So basically this is a call for papers/submissions and to slowly start
> the gruntwork of defining things we want and not want in this project.
> Some things might end up being defined as a sub-project or a totally
> different project.  It is in the best interest that after the gruntwork
> is done that we gather as much interest from able persons as possible,
> e.g. mail -net, -arch about it.

-net is probably not too relevant.  -hackers is relevant, and -arch is
relevant, at least.

> And possibly ask -security about safe ways to ensure spoofing or
> other attacks can be kept to a minimum.

I don't think security is a large part of this.  When you are building
a cluster, you usually include a so-called System Area Network (SAN)
that runs only inside your cluster.  This is assumed to be secure (just
as your SCSI buses and memory bus are assumed to be secure).

We might want to go beyond that to support different distributed
systems models, but I don't think it needs to be a priority.

> Thoughts and/or opinions thus far?

We need to define the path we think we can take.  I see a number of
things we will want to do (this text is intended to be in such a form
that you can just snarf it, do slight polish, and use it on the web
pages):

1. Give FreeBSD support for process migration.  I think the way to do
   this is to integrate the MOSIX patches, and then go from there.  As
   we have the BSD/OS patches, I think this should be rather simple -
   it will require some work, but not really large amounts.  Guess:
   About three days to get the patches ported over, and quite a bit
   more (couple of weeks, or so) to get the stuff up and running.
   This assumes that the userland side proves easy.

2. Integrate base technology other people have already used for
   clustering.  This includes (at least) a DLM, the v9fs of Ron
   Minnich, and probably one of the shared distributed memory packages
   out there.  It'd be nice to get the SCI drivers, too.

3. Set up a system for maintaining heartbeats throughout the cluster.
   This should use as close as possible to all of the available links
   between the nodes, in order to track which part of a cluster is
   dead when something goes wrong.  We should look at available
   implementations, and see what we can do from there.  A heartbeat
   system normally also provides some way of sending messages to the
   nodes that are alive, but with little guarantee about bandwidth
   (one of the links used is commonly a serial line).

4. Add support for re-writing ARP addresses.  This allows instant
   failover when a machine dies.

5. Add one or more file systems that can be shared between machines -
   this can be replicated filesystems, or shared access filesystems
   (like HAS-FS).

6. Design a system for tracking services, including what resources
   they need and how they are supposed to fail over.  Some of the
   requirements for such a system to be used as the startup system for
   FreeBSD are available at http://www.freebsd.org/~eivind/newrc.html.
   Another good resource is the TruCluster documentation, as
   TruCluster has a very simple implementation of services.  This
   shows the minimum we need to support in order to get it up and
   working in practice.
   The third thing we're fairly certain we should look at is the
   implementation of service groups in IBM's RS/6000 Cluster support
   ("Phoenix").

7. Implement any other technology necessary to make it easy to create
   clusters; this will probably involve defining as we go, and looking
   a lot at what other people have done.  I believe the parts
   mentioned above are mostly mandatory, at least.
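
To make the heartbeat idea in item 3 a bit more concrete, here is a
rough sketch of the bookkeeping side only: remembering when each node
was last heard from on each link, and treating a node as dead only when
every link to it has gone quiet.  The class name, the method names, the
link labels, and the timeout value are all made up for illustration;
none of this comes from an existing heartbeat implementation, and the
actual sending and receiving of heartbeats is left out entirely.

```python
import time
from dataclasses import dataclass, field


@dataclass
class HeartbeatTracker:
    """Hypothetical tracker: last heartbeat seen per (node, link) pair."""

    timeout: float  # seconds of silence after which a link counts as dead
    last_seen: dict = field(default_factory=dict)  # (node, link) -> timestamp

    def record(self, node, link, now=None):
        """Note that a heartbeat from `node` arrived over `link`."""
        self.last_seen[(node, link)] = time.monotonic() if now is None else now

    def live_links(self, node, now=None):
        """Return the sorted links on which `node` was heard recently."""
        now = time.monotonic() if now is None else now
        return sorted(
            link
            for (n, link), seen in self.last_seen.items()
            if n == node and now - seen <= self.timeout
        )

    def is_alive(self, node, now=None):
        """A node is alive while at least one link still carries heartbeats."""
        return bool(self.live_links(node, now))

    def dead_nodes(self, now=None):
        """Return the sorted known nodes that are silent on every link."""
        nodes = {n for (n, _link) in self.last_seen}
        return sorted(n for n in nodes if not self.is_alive(n, now))


if __name__ == "__main__":
    tracker = HeartbeatTracker(timeout=3.0)
    tracker.record("node1", "ethernet", now=0.0)
    tracker.record("node1", "serial", now=4.0)
    tracker.record("node2", "ethernet", now=0.0)
    print(tracker.live_links("node1", now=5.0))
    print(tracker.dead_nodes(now=5.0))
```

Keeping the state per link rather than per node is what lets the
tracker tell a dead machine apart from a dead cable: in the usage
example, node1 has lost its ethernet link but is still reachable over
the serial line, while node2 is silent everywhere.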

Eivind.


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-cluster" in the body of the message



