Message-ID: <19980312174747.57249@follo.net>
Date: Thu, 12 Mar 1998 17:47:47 +0100
From: Eivind Eklund
To: shimon@simon-shapiro.org, "Robert A. Bruce", freebsd-database@FreeBSD.ORG
Subject: Re: Fault tolerance issues
References: <199803121428.GAA20032@pike.cdrom.com>

On Thu, Mar 12, 1998 at 08:16:47AM -0800, Simon Shapiro wrote:
> 2. High Availability. AKA HAS, High Availability Server. A set of
> features that allow a computer to continue and provide service with no
> loss of data and only a brief interruption of service, in the face of a
> single failure.
>
> HAS are typically said to be SPOF (Single Point Of Failure) free. They are
> designed to have the ability to tolerate any single component failure.

I think that definition is fairly useless, because it depends entirely
on what you count as a single component. Define a city as a single
component, hit the city with an atomic bomb - boom, failure. OTOH,
cvsup is to some degree SPOF-free - if any single continent is wiped
out, my changes to FreeBSD will still persist :-) Higher Availability
than that is probably only of academic interest.

I actually think it would be better to talk about a system having
features for High Availability than to talk about it 'being HA'.

Some features:

Redundancy - anything that can fail with an interesting probability
(and remember, you often have many deployed systems) should have
solutions that can automatically take over its functionality.

Non-interference - anything that is temporarily disconnected must not
interfere with the parts that take over.

Quick switchover - switchover to backup solutions should happen
automatically and quickly once an error is detected.

Error detection - errors should be detected quickly, automatically,
and consistently.

Quick restart - if something goes down, it should not need a long time
to restart. fsck is right out.

At a theoretical level, I think these are probably the main points. We
can then start discussing what interesting subsystems FreeBSD has, and
what can be done to provide these features for each of the subsystems.

> If there is interest, we can start a discussion on what such a computer
> looks like.

That could be interesting, though if we really want this to be
fruitful we should (at some point not too far into the future) start
focusing on making at least a TODO list, and probably a few designs.

Eivind.
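
To make the error detection, switchover and non-interference points
above a bit more concrete, here is a minimal sketch in Python. It is
not taken from any FreeBSD subsystem: the FailoverMonitor class, the
node names and the timeout value are all hypothetical, and time is
passed in explicitly so the behaviour is easy to follow.

```python
class FailoverMonitor:
    """Hypothetical heartbeat monitor: detects a silent node and switches over."""

    def __init__(self, nodes, timeout):
        self.nodes = list(nodes)   # preference order; nodes[0] starts as active
        self.timeout = timeout     # seconds of silence before a node counts as failed
        self.last_seen = {n: 0.0 for n in self.nodes}
        self.fenced = set()        # failed nodes that must not interfere any more
        self.active = self.nodes[0]

    def heartbeat(self, node, now):
        # Non-interference: a fenced node is ignored even if it reappears.
        if node not in self.fenced:
            self.last_seen[node] = now

    def check(self, now):
        # Error detection: the active node is healthy if heard from recently.
        if now - self.last_seen[self.active] <= self.timeout:
            return self.active
        # Quick switchover: fence the failed node, promote the first healthy one.
        self.fenced.add(self.active)
        for node in self.nodes:
            if node not in self.fenced and now - self.last_seen[node] <= self.timeout:
                self.active = node
                return node
        raise RuntimeError("no healthy node left")


monitor = FailoverMonitor(["primary", "backup"], timeout=3.0)
monitor.heartbeat("backup", now=4.0)   # primary has been silent since t=0
print(monitor.check(now=5.0))          # primary is fenced, backup takes over
```

The fencing set is the interesting design choice here: once the
primary has been declared dead, a late heartbeat from it is dropped
rather than letting it reclaim the active role behind the backup's
back. A real system would need an explicit, checked way to readmit it.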