From owner-freebsd-database Thu Mar 12 16:06:37 1998
Return-Path:
Received: (from majordom@localhost) by hub.freebsd.org (8.8.8/8.8.8) id QAA03623 for freebsd-database-outgoing; Thu, 12 Mar 1998 16:06:37 -0800 (PST) (envelope-from owner-freebsd-database@FreeBSD.ORG)
Received: from sendero.simon-shapiro.org (sendero-fddi.Simon-Shapiro.ORG [206.190.148.2]) by hub.freebsd.org (8.8.8/8.8.8) with SMTP id QAA03616 for ; Thu, 12 Mar 1998 16:06:27 -0800 (PST) (envelope-from shimon@sendero-fxp0.simon-shapiro.org)
Received: (qmail 19261 invoked by uid 1000); 12 Mar 1998 20:14:09 -0000
Message-ID:
X-Mailer: XFMail 1.3-alpha-030698 [p0] on FreeBSD
X-Priority: 3 (Normal)
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 8bit
MIME-Version: 1.0
In-Reply-To: <19980312174747.57249@follo.net>
Date: Thu, 12 Mar 1998 12:14:09 -0800 (PST)
Reply-To: shimon@simon-shapiro.org
Organization: The Simon Shapiro Foundation
From: Simon Shapiro
To: Eivind Eklund
Subject: Re: Fault tolerance issues
Cc: freebsd-database@FreeBSD.ORG, "Robert A.Bruce"
Sender: owner-freebsd-database@FreeBSD.ORG
Precedence: bulk

On 12-Mar-98 Eivind Eklund wrote:
> On Thu, Mar 12, 1998 at 08:16:47AM -0800, Simon Shapiro wrote:
>> 2. High Availability. AKA HAS, High Availability Server. A set of
>>    features that allow a computer to continue to provide service with
>>    no loss of data and only a brief interruption of service, in the
>>    face of a single failure.
>>
>> HAS are typically said to be SPOF (Single Point Of Failure) free. They
>> are designed to have the ability to tolerate any single component
>> failure.
>
> I think that definition is fairly useless. Define city as single
> component, hit city with atomic bomb - booom, failure.

Not at all. An atom bomb tends to damage more than one component in the
system :-)  You have to separate operational availability from disaster
recovery. For most FreeBSD users, a nuclear bomb will signal the end of
their interest in the system, or its data.
A phone company may want to survive a conventional weapons attack by having
a second database mirrored in another facility. Regardless of that issue,
each facility still wants resistance to single component failure. It is
pretty well agreed in the industry what non-SPOF means.

> OTOH, cvsup is to some degree SPOF-free - if any single continent is
> wiped out, my changes to FreeBSD will still persist :-) Higher
> Availability than that is probably only of academic interest.

Cvsup is not a general purpose computer system. It is not even a general
purpose database. It is a concept implemented as an application. This
leads to the first ``Y'' in the road: Application Level vs. System Level
HAS. Application level HAS is a very viable solution, no doubt, but it
suffers from some drawbacks:

a. The mechanisms are not directly usable by other applications; the ways
   and means by which cvsup distributes and protects the data are of no
   use to an airline ticket reservation system. Only the human
   understanding of some of the issues may be transferable.

b. Typically, poor atomicity/resolution plagues such solutions. Cvsup is
   an exception, but the cvsup model is totally unacceptable for OLTP-type
   work.

c. Very long and unreliable checkpoint/restart delays. Again, for cvsup it
   matters not. For an on-line ordering/credit-card processing system, it
   may not be acceptable.

> I actually think it would be better to talk about a system having
> features for High Availability than talking about it 'being HA'.

I tend to disagree here too. These are two separate, but equally valid
discussions:

a. What characterizes a High Availability Server (as opposed to a Fault
   Tolerant one)?

b. What features are there to implement a HAS? Some of these features can
   be adopted by non-HAS systems to various degrees.
> Some features:
>  Redundancy - anything that can fail with an interesting
>  probability (and remember, you often have many deployed
>  systems) should have solutions that can automatically take
>  over functionality.

This directly contradicts your atomic bomb statement from above. Besides,
this is an implementation feature. Redundancy is not part of the high
availability definition. It is one part of an implementation solution.
Maybe the only one we can think of at this moment.

>  Non-interference - anything that is temporarily disconnected
>  will not interfere with the part that take over.

This is part of SPOF. If a disk drive fails and takes the SCSI bus with
it, we now have TWO failed components: a disk and a bus. But you are
right in this requirement.

>  Quick switchover - switchover to backup solutions should
>  happen automatically and quickly once an error is detected.

Not necessarily. What if I do not need to switch over at all? If our
solution to HA is to switch over, then it needs to be quick, automatic,
and transparent. But what (how long) is quick?

>  Error detection - errors should be detected quickly,
>  automatically, and consistently.

We need this one above the other one. First we detect the error, then we
decide to switch over :-)

>  Quick restart - if something goes down, it should not need a
>  long time to restart. fsck is right out.

We do not need quick re-start if we have quick (or no) switchover. If a
filesystem can continue to be accessed in the face of a system crash,
then who cares how long it takes the system to re-boot? We do care, as in
most HAS we will be in a degraded mode until repairs are performed.
Degraded mode means that we continue processing (probably at a reduced
rate), but the next failure will cause disruption of service.

Fsck assumes Unix filesystems. I am still at a lower and perhaps broader
level. Think about raw disks, tapes, networking, etc.

> At a theoretical level, I think these are probably the main points.
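The detect-then-switch ordering argued for above (first detect the error,
then decide whether to switch over) can be sketched as follows. This is a
minimal illustration, not anything from the original discussion; all class
and function names here are hypothetical:

```python
import time


class HeartbeatMonitor:
    """Track the last heartbeat seen per node; a node whose heartbeat
    is older than the timeout is declared failed (error detection)."""

    def __init__(self, timeout_s=3.0):
        self.timeout_s = timeout_s
        self.last_seen = {}  # node name -> timestamp of last heartbeat

    def heartbeat(self, node, now=None):
        # Record a heartbeat; `now` may be injected for testing.
        self.last_seen[node] = time.monotonic() if now is None else now

    def failed_nodes(self, now=None):
        now = time.monotonic() if now is None else now
        return [n for n, t in self.last_seen.items()
                if now - t > self.timeout_s]


def maybe_switch_over(monitor, primary, backup, now=None):
    """First detect the error, THEN decide to switch over.
    If the primary is healthy there is no switchover at all."""
    if primary in monitor.failed_nodes(now):
        return backup   # promote the backup: quick, automatic switchover
    return primary      # no failure detected: keep serving from primary
```

The timeout here is exactly the "how long is quick?" question: a short
timeout gives fast switchover but risks false failure declarations on a
slow network.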
> We can then start discussing what interesting subsystems FreeBSD has,
> and what can be done to provide these features for each of the
> subsystems.
>
>> If there is interest, we can start a discussion on what such a computer
>> looks like.
>
> That could be interesting, though if we really want this to be
> fruitful we should (at some point not too far into the future) start
> focusing on making at least TODO-list, and probably a few designs.
>
> Eivind.

Absolutely agree. The only exception I take here is that we may want to
define the service levels, the interruption modes, etc., before we think
solutions. For example, I do not want to assume a Unix filesystem of any
kind when I think data storage. You may believe that UFS is just fine. I
may think continuous service, you may think restart. We have to define
the services we want to support, their level of ``reliability'', failure
modes, etc. Then we come up with a TODO list, then we do it.

My bias is to push the HA as far down the stack as I can reasonably get
away with. I want to be able to drop as much ``off the shelf'' stuff on
top of it. To give you an example: it took the ufs filesystem almost 1/4
of a century to stabilize. It still goes through gyrations (soft updates)
and is still incapable of surviving a software crash (panic) with
absolute certainty of 100% instant recovery. Veritas can pretty much
deliver that, but not UFS. UFS is totally incapable of recovering from
ANY hardware failure. Veritas offers a facility that can somewhat survive
hardware failures.

My point? If I can ``give'' FreeBSD a reliable ``disk'' that looks,
tastes, smells, and sounds like a ``normal'' disk, and this ``disk''
guarantees: a) no data loss with a single component failure, and b)
transparent and continual availability through the loss of one Unix
instance, then I can put ANY disk access method (not only a filesystem)
on it, and this method will automagically be reliable, non-lossy,
resilient, and highly available.
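The ``reliable disk'' idea above (mirror every write so that a single
component failure loses no data, and service continues in degraded mode)
can be sketched in a few lines. This is a toy in-memory model under my own
assumptions, not the actual mechanism being proposed; the class and its
methods are hypothetical:

```python
class MirroredDisk:
    """Toy model of a non-SPOF 'disk': every block is written to two
    replicas, so any single replica failure loses no data. After one
    failure the device keeps serving in degraded mode -- the NEXT
    failure disrupts service, exactly as described for a HAS."""

    def __init__(self):
        self.replicas = [{}, {}]   # block number -> data, per replica
        self.alive = [True, True]

    def write(self, block, data):
        wrote = 0
        for i, rep in enumerate(self.replicas):
            if self.alive[i]:
                rep[block] = data
                wrote += 1
        if wrote == 0:
            # Second failure: the single-failure guarantee is exhausted.
            raise IOError("both replicas failed")

    def read(self, block):
        for i, rep in enumerate(self.replicas):
            if self.alive[i] and block in rep:
                return rep[block]
        raise IOError("block unavailable")

    def fail_replica(self, i):
        self.alive[i] = False   # enter degraded mode; repair is pending
```

Anything layered on top (a filesystem, a raw-device database) sees an
ordinary block read/write interface and inherits the availability
guarantee for free, which is the point of pushing HA down the stack.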
----------

Sincerely Yours,

Simon Shapiro                            Shimon@Simon-Shapiro.ORG
Voice: 503.799.2313

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-database" in the body of the message