Date: Thu, 7 Feb 2008 23:06:16 +0100
From: Bernd Walter <ticso@cicely12.cicely.de>
To: Dieter <freebsd@sopwith.solgatos.com>
Cc: freebsd-alpha@freebsd.org
Subject: Re: DS10L - "processor correctable error"
Message-ID: <20080207220616.GJ24583@cicely12.cicely.de>
In-Reply-To: <200802071752.RAA13888@sopwith.solgatos.com>
References: <20080207162120.GG24583@cicely12.cicely.de> <200802071752.RAA13888@sopwith.solgatos.com>

On Thu, Feb 07, 2008 at 09:52:54AM +0000, Dieter wrote:
> > > > > "Warning: received processor correctable error."
>
> > > > This is an ECC memory correction.
>
> > > Can I know which DIMM (DS10L has 2 DIMMs) is faulty?
>
> > > The message appears approx. once every other pass.
> > > The address is always the same.
>
> If you decide to replace one, you could pick one at random and
> see if the error goes away. Since your error is repeatable
> you can run the test to see if you guessed right. Since there
> are only 2 DIMMs you have a 50% chance of guessing right the 1st
> time and 100% the 2nd time.
>
> > Alphas are using the memory in pairs and can correct multiple faulty
> > bits in a single dataword.
>
> Really? I've always assumed that it was the standard single error
> correction double error detection.

Alphas use data words of at least 128 bits; some even use 256-bit
words. That leaves 16 or even 32 bits for ECC, which allows correcting
more than just a single bit.
(Some?) Alphas use a multi-stage correction mechanism.
The first stage is done in hardware, and if the hardware fails to
handle the error, it is corrected in software by a PALcode handler.
Maybe some non-Alpha systems do multi-bit correction as well, since a
modern i386/amd64 has at least 8 bits for ECC, but I only know it for
sure with Alphas.

-- 
B.Walter                http://www.bwct.de      http://www.fizon.de
bernd@bwct.de           info@bwct.de            support@fizon.de
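
To put rough numbers on the check-bit counts mentioned in the message above, here is a minimal sketch that computes how many check bits a textbook Hamming SEC-DED code needs for a given data width and compares that with the budgets quoted. The function name `secded_check_bits` is made up for illustration, and pairing the PC's 8 ECC bits with a 64-bit data word is an assumption (the message only says "at least 8bit"). Nothing here describes the code the Alpha hardware or PALcode actually uses.

```python
def secded_check_bits(data_bits: int) -> int:
    """Minimum check bits for a Hamming SEC-DED code over data_bits data bits."""
    r = 1
    # Single-error correction needs 2**r >= data_bits + r + 1, so that every
    # single-bit error (and "no error") maps to a distinct syndrome.
    while 2 ** r < data_bits + r + 1:
        r += 1
    # One extra overall parity bit adds double-error detection.
    return r + 1


# Budgets: 16 and 32 ECC bits for 128- and 256-bit words are from the message;
# 8 ECC bits per 64-bit word on a PC is an assumption.
for data_bits, available in [(64, 8), (128, 16), (256, 32)]:
    needed = secded_check_bits(data_bits)
    print(f"{data_bits:3d} data bits: SEC-DED needs {needed}, "
          f"{available} available, {available - needed} spare")
```

Under these assumptions the 64-bit case has no spare check bits, while the 128-bit and 256-bit words have 7 and 22 spare. Spare bits are only a precondition for stronger correction; whether they are used that way depends on the code the memory controller implements.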