Date: Wed, 19 Mar 2008 21:33:11 +0100
From: Erik Trulsson
To: Chuck Robey
Cc: FreeBSD-Hackers, Jeremy Chadwick
Subject: Re: remote operation or admin

On Wed, Mar 19, 2008 at 02:03:54PM -0400, Chuck Robey wrote:
> Jeremy Chadwick wrote:
> > On Wed, Mar 19, 2008 at 01:01:45PM -0400, Chuck Robey wrote:
> >> What is most important in my considerations is this: how might it be
> >> possible to stretch our present SMP software to extend the management
> >> domains to cover multiple computers? Some sort of a bridge here,
> >> because there is no software today (that I'm aware of, and that sure
> >> leaves a huge set of holes) that lets you manage the cores as separate
> >> computers, so that maybe today I might be able to have an 8 or 10 core
> >> system, and maybe tomorrow look at the economic and software
> >> possibility of having a 256 core system. I figure that there would
> >> need to be some tight reins on latency, and you would want some
> >> BIGTIME comm links, I dunno, maybe not be able to use even Gigabit
> >> ethernet, maybe needing some sort of SCSI bus linkage, something on
> >> that scale? Or, is fiber getting to that range yet?
> >>
> >> Anyhow, is it even remotely possible for us to stretch our present SMP
> >> software (even with its limitation on word size, which limits the
> >> range to 32 processors) to be able to jump across machines? That would
> >> be one hell of a huge thing to consider, now wouldn't it?
> >
> > Ahh, you're talking about parallel computing, "clustering", or "grid
> > computing". The Linux folks often refer to an implementation called
> > Beowulf:
> >
> > http://en.wikipedia.org/wiki/Beowulf_%28computing%29
> >
> > I was also able to find these, more specific to the BSDs:
> >
> > http://www.freebsd.org/advocacy/myths.html#clustering
> > http://lists.freebsd.org/pipermail/freebsd-cluster/2006-June/000292.html
> > http://people.freebsd.org/~brooks/papers/bsdcon2003/fbsdcluster/
>
> Well, I am, and I'm not. If you could answer me one question, then I
> would probably know for sure. What is the difference between our SMP and
> the general idea of clustering, as typified by Beowulf? I was under the
> impression I was talking about seeing the possibility of moving the two
> closer together, but maybe I'm confused in the meanings?

The short version is that software written for SMP and software written
for clusters make very different assumptions about which operations are
available and what they cost relative to each other. Software written for
one of them will typically either not run at all, or run very
inefficiently, on the other.

Longer version:

SMP (Symmetric Multiprocessing) refers to a situation where all the CPUs
involved are 'equal' and all use the same shared physical memory. In an
SMP system it does not really matter which CPU you run a program on, since
they are all equal and all have the same access to memory. One important
feature of such a system is that when one CPU writes to memory, all the
others can see that write.

A close relative of SMP is NUMA (Non-Uniform Memory Access), the most
popular variant being ccNUMA (cache-coherent NUMA). Here all CPUs still
share the memory, but different parts of memory can be more or less
expensive to access, depending on which CPU is involved. (For example:
CPU 1 might have fast access to memory area A and slow access to memory
area B, while CPU 2 has fast access to B and slow access to A.) (Many
multi-CPU machines in use today are actually ccNUMA, even if they are
often called SMP. Software written for SMP will typically run unmodified
on a ccNUMA system, although perhaps with somewhat suboptimal
performance.)

In a cluster, on the other hand, the CPUs do not share physical memory.
They cannot automatically see each other's memory operations. Any
communication between the CPUs must take place over the network, which is
much slower than the internal buses inside a computer.

So in an SMP system communication between CPUs is fast, switching which
CPU is running a given program is a cheap and simple operation, and much
of the work of synchronizing the CPUs is taken care of automatically by
the hardware. In a cluster, communication between nodes is expensive,
transferring a program from one node to another is slow and complicated,
and one needs to do extra work to keep each CPU aware of what the others
are doing.

In a clustered system one usually also has to handle the case where one
node crashes or the network connection to it is broken. In typical SMP
systems one simply ignores the possibility of a single CPU crashing. (The
exception is some highly redundant mainframe systems, where one can
replace almost any component, including CPUs, while the machine is
running. There is a reason why these cost lots and lots of money.)

A system that is written to work in a clustered environment can fairly
easily be moved to run on an SMP machine, but it will do a lot of work
that is not necessary under SMP and thus not make very good use of the
hardware.
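To make the contrast concrete, here is a rough sketch (plain C with
pthreads, nothing FreeBSD-specific, and the value 42 is just made up) of
CPU-to-CPU communication on an SMP system: the producer thread performs an
ordinary store to shared memory, and the hardware's cache coherence makes
it visible to whichever CPU the main thread runs on. The mutex and
condition variable only order the two accesses; no data is ever copied
over any link.

#include <pthread.h>
#include <stdio.h>

/* Ordinary memory, shared by every CPU in an SMP system. */
static int shared_value;
static int ready;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;

static void *
producer(void *arg)
{
	pthread_mutex_lock(&lock);
	shared_value = 42;		/* a plain store ... */
	ready = 1;
	pthread_cond_signal(&cond);
	pthread_mutex_unlock(&lock);
	return (NULL);
}

int
main(void)
{
	pthread_t t;

	pthread_create(&t, NULL, producer, NULL);

	pthread_mutex_lock(&lock);
	while (!ready)
		pthread_cond_wait(&cond, &lock);
	/* ... is all it takes for another CPU to see the data. */
	printf("%d\n", shared_value);
	pthread_mutex_unlock(&lock);

	pthread_join(t, NULL);
	return (0);
}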
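On a cluster, the same exchange has to be an explicit message. As a rough
stand-in for the network link between two nodes (a real cluster would use
TCP sockets or a message-passing library such as MPI), here is the
equivalent with two processes that share no memory at all:

#include <sys/types.h>
#include <sys/socket.h>
#include <stdio.h>
#include <unistd.h>

int
main(void)
{
	int fds[2], value;

	/* A socketpair standing in for the network between two "nodes". */
	if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) == -1)
		return (1);

	if (fork() == 0) {
		/* "Node" 1 cannot just store the value somewhere ... */
		int v = 42;
		write(fds[1], &v, sizeof(v));	/* ... it must send it. */
		_exit(0);
	}

	/* "Node" 2 must explicitly receive it. */
	read(fds[0], &value, sizeof(value));
	printf("%d\n", value);
	return (0);
}

Every piece of state the nodes need to agree on has to move through such a
channel, which is where the extra cost and complexity of a cluster comes
from.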
Moving from SMP to a cluster is more difficult. One can emulate the
missing hardware support in software, but this has a very high overhead.
Or one can rewrite the software completely, which is a lot of work.

FreeBSD is written for SMP systems and makes many assumptions about the
capabilities of the underlying hardware. Modifying FreeBSD to run
efficiently and transparently on top of a clustered system would be a
*huge* undertaking.

--
Erik Trulsson
ertr1013@student.uu.se