From owner-freebsd-questions Tue May 25 4:56:39 1999
Delivered-To: freebsd-questions@freebsd.org
Received: from www.inx.de (www.inx.de [195.21.255.251])
	by hub.freebsd.org (Postfix) with ESMTP id 4F51514D8E
	for ; Tue, 25 May 1999 04:56:35 -0700 (PDT)
	(envelope-from jnickelsen@acm.org)
Received: from n33-71.berlin.snafu.de ([195.21.33.71] helo=goting.jn.berlin.snafu.de)
	by www.inx.de with esmtp (Exim 2.12 #2)
	id 10mFow-00015r-00; Tue, 25 May 1999 13:56:35 +0200
Received: from ockholm.jn.berlin.snafu.de (ockholm.jn.berlin.snafu.de [10.0.0.3])
	by goting.jn.berlin.snafu.de (Postfix) with ESMTP id BA26613D;
	Tue, 25 May 1999 12:29:20 +0200 (CEST)
Date: Tue, 25 May 1999 12:29:29 +0200
From: Juergen Nickelsen
To: Alex Heiphetz
Cc: freebsd-questions@FreeBSD.ORG
Subject: Re: 100% dependability/failsafe/security/hardware
Message-ID: <388916.3136624169@ockholm.jn.berlin.snafu.de>
In-Reply-To: <3.0.6.32.19990524185242.009583e0@cvzoom.net>
Originator-Info: login-id=nickel; server=goting.jn.berlin.snafu.de
X-Mailer: Mulberry (MacOS) [1.4.2.1, s/n U-301240]
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Sender: owner-freebsd-questions@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

--On Mon, 24. May 1999 18:52 -0400 Alex Heiphetz wrote:

> 3. How to provide 100% failsafe system?

*All* hardware redundant: CPUs, RAM, secondary storage, data paths,
power supplies, fans, UPSs, etc.; proactive hardware monitoring
facilities (including CPU results, see below) with hot failover and
automatic notification of field service in case of problems; the
ability to replace all parts (except the case, perhaps, but probably
including bus backplanes) while the system is running; and an
operating system that manages all that. There are a handful of vendors
making such machines with 100% guaranteed reliability, and these
machines do *not* come cheap.

I once saw a Stratus Continuum system at the German weather service
(DWD), which was (and still is) the regional telecommunications hub in
central Europe for the Global Telecommunications System of the World
Meteorological Organization. You could pull out a CPU module and plug
it back in while the machine was running, but the guy who demonstrated
it dared to pull out only a fan module. Still, the machine noticed and
the modem dialled to notify tech support.

Once, the guy told us, there was a hard disk in the mail, and nobody
knew why, because it hadn't been ordered (and the DWD is a *big*
organization). Finally they found out that the machine itself had
ordered the disk, because it had noticed an increase in soft errors on
one of its disks.

The CPUs of this machine come in packs of four (two pairs). One of the
pairs is active, the other in hot standby, all four running the same
instructions synchronously. All signals of the two CPUs of a pair are
compared by hardware comparators, and if there is a difference, the
pair is taken out of service; if it was the active one, the other
takes over, without, literally, missing a beat.

Well, this is what you need for 100% reliability. Be prepared to pay a
lot for it. And, sorry to say, but you won't run FreeBSD on such a
system.

Greetings, Juergen.
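
For readers who want the paired-CPU scheme above in concrete terms, here is a rough software sketch of the idea. It is only an illustration: the real machine does the comparison in hardware on every signal, and the names used here (LockstepPair, PairAndSpare, good, faulty) are made up for the example, not anything from Stratus.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# Hypothetical model: a "CPU" is just a function from an input word to a result.
Cpu = Callable[[int], int]


@dataclass
class LockstepPair:
    """Two CPUs that execute the same step; their outputs are compared."""
    name: str
    cpu_a: Cpu
    cpu_b: Cpu
    in_service: bool = True

    def step(self, word: int) -> Optional[int]:
        """Run one step on both CPUs; return the result, or None on mismatch."""
        a, b = self.cpu_a(word), self.cpu_b(word)
        if a != b:
            # The comparator saw a difference: take the whole pair out of service.
            self.in_service = False
            return None
        return a


@dataclass
class PairAndSpare:
    """An active pair plus hot-standby pairs, all stepping together."""
    pairs: List[LockstepPair]
    log: List[str] = field(default_factory=list)

    def step(self, word: int) -> int:
        results = []
        # Every pair still in service runs the step, so a standby is always in sync.
        for pair in self.pairs:
            if pair.in_service:
                result = pair.step(word)
                if result is None:
                    self.log.append(f"{pair.name} removed from service")
                else:
                    results.append(result)
        if not results:
            raise RuntimeError("no CPU pair left in service")
        # The first healthy pair counts as the active one.
        return results[0]


def good(x: int) -> int:
    return x + 1


def faulty(x: int) -> int:
    # Simulated fault: gives a wrong answer for one particular input.
    return 0 if x == 3 else x + 1


system = PairAndSpare([
    LockstepPair("pair0", good, faulty),  # active pair with one bad CPU
    LockstepPair("pair1", good, good),    # hot standby
])
outputs = [system.step(w) for w in range(5)]
```

In this toy run the faulty CPU disagrees with its partner on the fourth step, pair0 is dropped, and pair1 supplies the result for that same step, so the sequence of outputs has no gap.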