From: Andrew Gallatin
Date: Mon, 24 Sep 2001 21:13:42 -0400 (EDT)
To: Matt Dillon
Cc: freebsd-hackers@FreeBSD.ORG
Subject: Re: ecc on i386
Message-ID: <15279.55878.110154.650940@grasshopper.cs.duke.edu>
In-Reply-To: <200109250058.f8P0wx998146@earth.backplane.com>
References: <15279.54029.454089.299807@grasshopper.cs.duke.edu> <200109250058.f8P0wx998146@earth.backplane.com>

Matt Dillon writes:
>
> :What happens on an ECC equipped PC when you have a multi-bit memory
> :error that hardware scrubbing can't fix?  Will there be some sort of
> :NMI or something that will panic the box?
> :
> :I'm used to alphas (where you'll get a fatal machine check panic) and
> :I am just wondering if PCs are as safe.
> :
> :Thanks,
> :
> :Drew
>
> ECC can typically detect and correct single bit errors and detect
> double bit errors.  Anything beyond that is problematic... it may or
> may not detect the problem or may mis-correct a multi-bit error.
> An NMI is generated if an uncorrectable error is detected.
>
> On PC's, ECC is optional.
> Desktops typically do not ship with ECC
> memory.  Branded servers typically do.  A year or two ago I would
> have been happy to use non-ECC rams (finding bad RAM through trial
> and error), but now with capacities as they are and memory prices down
> ECC is definitely the way to go.

My sentiments exactly.

> Bit errors can come from many sources, memory being only one.  Bit errors
> can occur inside the cpu chip, in the L1 and L2 caches, in memory, in
> controller chips... all over the place.  Many modern processors implement
> parity on their caches to try to cover the problem areas.  I'm not sure
> how Pentium III's and IV's are setup.
>
> -Matt

Hmm.. Well, it turns out that the box I'm interested in (Thunder K7) can
be set to send an SERR on multiple-bit errors.  I wonder what happens
when a PC gets an SERR?  (That's another machine check on alpha.)

Drew
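
For illustration only (this is not from the thread): a minimal sketch of the
behaviour Matt describes, where single-bit errors are corrected, double-bit
errors are detected, and anything beyond that may be mis-corrected. It uses an
extended Hamming (8,4) code on a 4-bit value. Real ECC memory uses wider codes
(commonly 72 bits protecting 64) implemented in the memory controller, and the
names encode, decode and flip are hypothetical.

```python
DATA_POSITIONS = (3, 5, 6, 7)   # Hamming(7,4) data positions
PARITY_POSITIONS = (1, 2, 4)    # Hamming(7,4) parity positions


def encode(nibble):
    """Encode 4 data bits as an 8-bit SECDED codeword (list of 0/1).

    Index 0 holds the overall parity bit; indices 1..7 are Hamming(7,4).
    """
    bits = [0] * 8
    for shift, pos in enumerate(DATA_POSITIONS):
        bits[pos] = (nibble >> shift) & 1
    for p in PARITY_POSITIONS:
        # Parity bit p covers every position whose index has bit p set.
        bits[p] = sum(bits[i] for i in range(1, 8) if i & p and i != p) % 2
    bits[0] = sum(bits[1:]) % 2
    return bits


def decode(codeword):
    """Return (status, nibble); status is 'ok', 'corrected' or 'uncorrectable'."""
    bits = list(codeword)
    syndrome = 0
    for i in range(1, 8):
        if bits[i]:
            syndrome ^= i
    odd_flips = sum(bits) % 2           # 1 if an odd number of bits changed
    if syndrome == 0 and not odd_flips:
        status = "ok"
    elif odd_flips:
        # Assume exactly one bit flipped; the syndrome names its position
        # (syndrome 0 means the overall parity bit itself).
        bits[syndrome] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome with even parity: a double-bit error.
        # This is the case where hardware would raise an NMI or SERR.
        status = "uncorrectable"
    nibble = sum(bits[pos] << shift for shift, pos in enumerate(DATA_POSITIONS))
    return status, nibble


def flip(codeword, *positions):
    """Return a copy of codeword with the given bit positions inverted."""
    bits = list(codeword)
    for pos in positions:
        bits[pos] ^= 1
    return bits


if __name__ == "__main__":
    cw = encode(0b1010)
    print(decode(cw))                   # clean read
    print(decode(flip(cw, 5)))          # one flipped bit
    print(decode(flip(cw, 5, 6)))       # two flipped bits
    print(decode(flip(cw, 3, 5, 6)))    # three flipped bits
```

The last case is the "may mis-correct" one: three flipped bits look like a
single-bit error to the decoder, so it reports a correction and hands back the
wrong data without raising anything.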