From owner-freebsd-questions@FreeBSD.ORG Mon Dec 20 22:43:24 2004
Message-Id: <6.2.0.14.2.20041220153733.061c6830@localhost>
Date: Mon, 20 Dec 2004 15:43:17 -0700
To: Charles Swiger
From: Brett Glass
In-Reply-To: <137C9E12-52D6-11D9-9340-003065ABFD92@mac.com>
References: <6.2.0.14.2.20041220135549.05fdaa88@localhost>
 <137C9E12-52D6-11D9-9340-003065ABFD92@mac.com>
cc: questions@freebsd.org
Subject: Re: ECC status in FreeBSD

At 03:25 PM 12/20/2004, Charles Swiger wrote:

>However, your RAM isn't a hard drive, so the bad-sector remapping used
>by hard drives is not fully applicable. Your machine is expected not
>to have any part of memory fail reproducibly, but if you do, it's time
>to use the warranty and replace the entire chip.

It's true that RAM is not a hard drive. However, if the problem is
with certain memory cells rather than, say, the row or column drivers,
the rest of the chip is usable.
And if you did want to scuttle the entire module on which the chip
resided, you'd probably want to disable that module in the meantime by
telling the system not to use it. Certainly, you'd at least want to
know which module was failing. There's nothing to tell you that right
now.

>ECC is a fine idea, but the motherboard chipset pretty much does
>everything that is required (except for the reporting/syslogging), so
>the kernel doesn't need to be specially involved for the system to
>benefit from ECC protection.

Alas, right now there's no way to KNOW that you need to deal with a
failing RAM module until you start experiencing random and possibly
destructive system panics or crashes. It'd be nice, at least, to see
something in the logs or be able to collect statistics from the
motherboard.

--Brett