Date: Fri, 28 Sep 2001 16:21:53 +0200
From: Bernd Walter
To: Bsdguru@aol.com
Cc: peter@wemm.org, hackers@FreeBSD.ORG
Subject: Re: ecc on i386
Message-ID: <20010928162153.E11634@cicely20.cicely.de>

On Fri, Sep 28, 2001 at 09:46:19AM -0400, Bsdguru@aol.com wrote:
> In a message dated 9/25/01 1:05:21 PM Eastern Daylight Time, peter@wemm.org
> writes:
>
> > > Well, at least we take the machine down, which is a heck of a lot
> > > better than ignoring the problem, which is really all that I was
> > > hoping for.
>
> I don't think this is "good". Back in the XT days we used to get a false
> parity error every once in a while on an ISA card...taking the machine down
> on a bit error (which XTs used to do) was completely wrong and unnecessary. If

Huh? If your memory content has been changed behind your back, you can only
hope that it hasn't trashed some important metadata and won't trash the
whole system.
Well, it's much better if you check what the memory region is used for and
do some intelligent handling.
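
As a rough sketch of what such handling could look like: the region classes,
the actions, and the handle_memory_error() name below are all hypothetical,
chosen only for illustration, and do not describe what FreeBSD actually does.
The idea is only that the reaction depends on what the damaged memory was
being used for, and that anything unknown is treated as fatal.

```python
from enum import Enum, auto


class Region(Enum):
    """Hypothetical classification of the page that reported the error."""
    KERNEL_METADATA = auto()  # bookkeeping the whole system depends on
    CLEAN_CACHE = auto()      # unmodified copy of data still present on disk
    USER_PROCESS = auto()     # private memory of a single process
    FREE = auto()             # page not currently in use


class Action(Enum):
    """Hypothetical reactions, from most to least drastic."""
    PANIC = auto()               # take the machine down
    KILL_PROCESS = auto()        # only the owning process is affected
    DISCARD_AND_RELOAD = auto()  # drop the page and read it again from disk
    RETIRE_PAGE = auto()         # stop using the page, nothing was lost


# Assumed policy table; a real system would need far more detail.
_POLICY = {
    Region.KERNEL_METADATA: Action.PANIC,
    Region.USER_PROCESS: Action.KILL_PROCESS,
    Region.CLEAN_CACHE: Action.DISCARD_AND_RELOAD,
    Region.FREE: Action.RETIRE_PAGE,
}


def handle_memory_error(region):
    """Pick a reaction for a memory error; unknown usage is treated as fatal."""
    return _POLICY.get(region, Action.PANIC)


if __name__ == "__main__":
    for region in Region:
        print(region.name, "->", handle_memory_error(region).name)
```

Note that none of the branches ignores the error: even the mildest one takes
the affected page out of use.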
But ignoring it is definitely a very dangerous thing.
I never understood why computers are built without at least parity.
DRAMs have a so-called soft error rate and may flip a bit no matter how
good the memory is; quality only changes how likely that is.
Thus using DRAM implies using ECC if you don't want any surprises.

> you are using the box as a router, you don't want the machine to go down
> because of a memory error, or in this case, a non-error. It should certainly

Memory corruption is not a "non-error".

> be optional. If you are running a R/O or flash system there is no harm in
> keeping the machine running if possible.

Even in the R/O case the system can trust corrupted data and may even
distribute it.
Broken hardware needs to be replaced.

--
B.Walter              COSMO-Project         http://www.cosmo-project.de
ticso@cicely.de         Usergroup           info@cosmo-project.de