Date: Thu, 27 Jun 2002 12:00:33 -0700 (PDT)
From: Matthew Dillon
To: "Gary Thorpe"
Cc: arch@FreeBSD.ORG
Subject: Re: Larry McVoy's slides on cache coherent clusters

:I think this ties in to Mr. Lambert's question about the future of FreeBSD
:very much. I think the NUMA model will eventually dominate all future large
:systems in the next 10 years (and SMP will come to be standard on small
:systems) and FreeBSD will probably have to run efficiently on them to compete
:with Linux etc. Having seamless clusters (by this I mean clusters that work
:as a single system with one system image and identity) would probably be
:an interesting problem also, since only a few OSes have made any serious
:attempt at implementing them. PVM, MPI, and MOSIX cannot, for example, migrate
:I/O among machines (network load balancing maybe?).

Well, I'm not so sure. I think partitioning will come to dominate large
machines in the future. They may well be NUMA, but NUMA will be relegated
to the role of being simply a faster communications medium.

We will certainly see cache-coherent shared memory across the network
used in major ways (we see primitive versions of this now) because its
relative cost will be lower than NUMA's (and always will be).

The distinction is important from the point of view of OS design. Even
in NUMA systems the difference between local and remote memory is too
great for a non-deterministic implementation (which is essentially what
Linux has), and it does not mesh well with the uniform cache architecture
implemented by Linux, Solaris, BSD, and most other modern OSs.

The natural conclusion is to partition instead, and to develop more
formalized, deterministic, network-transportable mechanisms for sharing
data that can be abstracted out using mmap(). NUMA then becomes just
another, faster transport mechanism.

That is the direction I believe the BSDs will take: transparent
clustering with NUMA transport, network transport, or a hybrid of both.

-Matt
Matthew Dillon