From owner-freebsd-hardware  Tue May 21 13:35:59 1996
Return-Path: owner-hardware
Received: (from root@localhost)
          by freefall.freebsd.org (8.7.3/8.7.3) id NAA27277
          for hardware-outgoing; Tue, 21 May 1996 13:35:59 -0700 (PDT)
Received: from persprog.com (persprog.com [204.215.255.203])
          by freefall.freebsd.org (8.7.3/8.7.3) with ESMTP id NAA27267
          for <hardware@freebsd.org>; Tue, 21 May 1996 13:35:52 -0700 (PDT)
Received:  by persprog.com (8.7.5/4.10)
	id PAA24820; Tue, 21 May 1996 15:25:45 -0500
Received: from novell(192.2.2.201) by cerberus.ppi.com via smap (V1.3)
	id smab24809; Tue May 21 16:25:21 1996
Received: from NOVELL/SpoolDir by novell.persprog.com (Mercury 1.12);
    Tue, 21 May 96 16:21:40 +0500
Received: from SpoolDir by NOVELL (Mercury 1.12); Mon, 20 May 96 13:27:22 +0500
From: "David Alderman" <dave@persprog.com>
Organization: Personalized Programming, Inc.
To: Joe Greco <jgreco@brasil.moneng.mei.com>
Date: Mon, 20 May 1996 13:27:21 EST
Subject: Re: Triton chipset with 256k cache caches 32M only?
CC: hardware@freebsd.org
Priority: normal
X-mailer: Pegasus Mail for Windows (v2.31)
Message-ID: <1E75A1F6F@novell.persprog.com>
Sender: owner-hardware@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

> From:          Joe Greco <jgreco@brasil.moneng.mei.com>
Joe Greco said (with establishing info deleted):

> Agreed, however, it's much more likely that some other component (think:
> disks, cpu fans, etc) will exhibit errors.  15% is a fairly hefty price to
> pay for a relatively small return.
> 
> On a well built RAID system that needs the extra reliability, perhaps it is
> warranted.  It is probably _not_ warranted on your average run of the mill
> server class system to lose 15% just to gain one correction every ten years
> rather than one crash every ten years, unless you have purposely over-spec'd
> the machine to account for the 15% loss.
> 
> Obviously it is a matter of how paranoid (or silly?) you want to be..  I am
> perfectly confident that my disks will puke before my RAM.
> 
> ... Joe
> 

You may be right about ECC.  But the loss of parity checking in the 
Triton I should not be underestimated, because of the potential for 
disaster when a machine is first put into service.

A few years back, we put a Unix box into service which had a DTC 
caching disk controller in it.  Although the controller did a RAM 
test on power up, it did not make any use of the parity bit on the 
RAM that was installed.  It took a few weeks of service before 
someone noticed small bit errors showing up in their data.  Since 
the backups went through the same controller, you can guess how 
reliable they were.  It took quite a while to restore the data 
integrity on this system, even with the original backups that the 
controller had not corrupted (a lot of data can be generated in a 
few weeks on a server!).  I wish the computer had crashed!  It would 
have saved a lot of time and effort.

I think Intel did a real disservice by ever producing a chipset that 
did not check parity.  I know that Triton chipset motherboards are 
being used as servers, and that even a rigorous burn-in may not 
reveal some forms of memory failure that occur in a 32-bit operating 
system under the stresses of real usage.  At least with parity, the 
machine goes down and the administrator is alerted to the problem.  
Nothing is worse than gradual data corruption.
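
To make the difference concrete, here is a rough Python sketch of 
what a parity check buys you.  It is only an illustration: the 
function names (store, load, flip_bit) are made up for this example, 
and it models a single byte with one even-parity bit, not how any 
real chipset or controller is wired.

```python
from typing import Tuple

Cell = Tuple[int, int]  # (data byte, stored parity bit) -- hypothetical model


def parity_bit(byte: int) -> int:
    """Even-parity bit: 1 when the byte has an odd number of set bits."""
    return bin(byte & 0xFF).count("1") % 2


def store(byte: int) -> Cell:
    """Write a byte together with its parity bit."""
    return byte & 0xFF, parity_bit(byte)


def flip_bit(cell: Cell, bit: int) -> Cell:
    """Simulate a memory fault: one data bit changes, parity bit does not."""
    data, stored = cell
    return data ^ (1 << bit), stored


def load(cell: Cell, check_parity: bool) -> int:
    """Read a byte back; with checking on, a mismatch stops the machine."""
    data, stored = cell
    if check_parity and parity_bit(data) != stored:
        raise RuntimeError("parity error: halting instead of returning bad data")
    return data


good = store(0x5A)
bad = flip_bit(good, 3)

# Without parity checking the wrong value is handed back silently.
silent = load(bad, check_parity=False)

# With parity checking the same fault is caught at read time.
try:
    load(bad, check_parity=True)
    caught = False
except RuntimeError:
    caught = True
```

Note that a single parity bit only catches an odd number of flipped 
bits; two flips in the same byte slip through, which is part of the 
argument for ECC.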

Maybe Intel is overcompensating a bit by giving us the option of ECC, 
but anyone who was burned in the Triton I era might want that option.
Will I use it? I don't know, but I am glad it is there.
======================================
When philosophy conflicts with reality, choose reality.
Dave Alderman  -- dave@persprog.com
======================================