Date: Fri, 28 Aug 2009 21:34:31 +0200
From: Willem Jan Withagen <wjw@withagen.nl>
To: hardware@freebsd.org
Subject: Re: enable ECC in OS code?
Message-ID: <4A983147.8080906@withagen.nl>
In-Reply-To: <864ort1lw0.fsf@ds4.des.no>
References: <200908262253.n7QMrauP063683@wattres.watt.com> <200908271130.18073.erich@apsara.com.sg> <20090827112229.GB14987@britannica.bec.de> <864ort1lw0.fsf@ds4.des.no>
Dag-Erling Smørgrav wrote:
> Joerg Sonnenberger <joerg@britannica.bec.de> writes:
>> Erich Dollansky <erich@apsara.com.sg> writes:
>>> how should it be done at OS level at all when the OS is loaded
>>> into RAM?
>> Copy the kernel to the video RAM, jump to it, enable ECC, copy back.
>
> Not just the kernel - you have to copy all the memory that is currently
> in use, including interrupt tables, the BIOS configuration space, shadow
> copies of various ROMs... The CPU will probably not look too kindly on
> having interrupt descriptors, segment descriptors, page tables etc. in
> memory accessed through the I/O controller instead of the memory
> controller.
>
> The machine might not even have video RAM!
>
> On systems that support ECC, I suspect that the BIOS enables it at the
> same time as it configures the memory controller, which is one of the
> very first things it does - literally within a few dozen (or perhaps a
> few hundred) instructions from CPU reset - using only CPU registers, ROM
> code, and configuration variables loaded from NVRAM.

The way we did it when we were building Unix systems in the '80s was that
the output of the ECC checker on the memory boards was gated onto a
high-priority interrupt on an open-collector line to the m68k. That
open-collector line had a gate that was only enabled once we were sure
that the whole memory had been test-written.

Remember that the memory in extension systems usually doesn't have ECC
either. This holds for I/O registers, video memory, memory on network
controllers etc., unless we built the devices ourselves... These memories
also connect to the gate, to allow access without triggering an ECC error.

I have never looked into the HW of a current ECC controller, but I expect
there is a big chance that the ECC parity always gets written. Not doing
so would require extra gates in an already too long path from the XOR
devices to the ECC cells (thus slowing down the write cycle).
ECC errors will only be flagged if ECC is enabled, but very likely ECC is
always tested and/or written. So I would expect it to be "relatively
simple": read/write all the physical memory, then enable the ECC intr/trap
mechanism.

--WjW
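
A minimal sketch of that sequence, written as a small C simulation rather
than real controller code. Everything in it is an assumption made for
illustration: the names (sim_mem, sim_scrub_and_enable, MEM_WORDS), the
single parity bit standing in for a real ECC syndrome, and the premise that
the check bits are written on every write even while error reporting is
gated off. Real hardware would do this through chipset-specific registers
that are not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MEM_WORDS 1024u  /* hypothetical size of the simulated memory */

/* Simulated memory board: data words plus one stored check bit per word. */
struct sim_mem {
    uint32_t data[MEM_WORDS];
    uint8_t  check[MEM_WORDS];  /* stored check bit, not trustworthy at power-on */
    bool     report_enabled;    /* the "gate" on the error interrupt line */
    unsigned errors_flagged;    /* how many errors got through the gate */
};

/* Parity of a 32-bit word; a stand-in for a real ECC computation. */
uint8_t parity32(uint32_t w)
{
    w ^= w >> 16;
    w ^= w >> 8;
    w ^= w >> 4;
    w ^= w >> 2;
    w ^= w >> 1;
    return (uint8_t)(w & 1u);
}

/* Power-on state: arbitrary data, and every odd word has a wrong check bit. */
void sim_power_on(struct sim_mem *m, uint32_t seed)
{
    assert(m != NULL);
    for (size_t i = 0; i < MEM_WORDS; i++) {
        seed = seed * 1664525u + 1013904223u;  /* simple LCG for filler data */
        m->data[i] = seed;
        m->check[i] = (uint8_t)(parity32(seed) ^ (i & 1u));
    }
    m->report_enabled = false;
    m->errors_flagged = 0;
}

/* Assumption: the check bit is written on every write, gate open or not. */
void sim_write(struct sim_mem *m, size_t i, uint32_t v)
{
    m->data[i] = v;
    m->check[i] = parity32(v);
}

/* The check is always made, but only flagged when the gate is open. */
uint32_t sim_read(struct sim_mem *m, size_t i)
{
    if (m->report_enabled && parity32(m->data[i]) != m->check[i])
        m->errors_flagged++;
    return m->data[i];
}

/* Read and write back every word with the gate closed, then open the gate. */
void sim_scrub_and_enable(struct sim_mem *m)
{
    for (size_t i = 0; i < MEM_WORDS; i++)
        sim_write(m, i, sim_read(m, i));
    m->report_enabled = true;
}

/* Read the whole memory and return how many new errors were flagged. */
unsigned sim_read_all(struct sim_mem *m)
{
    unsigned before = m->errors_flagged;
    for (size_t i = 0; i < MEM_WORDS; i++)
        (void)sim_read(m, i);
    return m->errors_flagged - before;
}
```

Calling sim_power_on() and then opening the gate straight away makes
sim_read_all() flag every word whose check bit was never initialised.
Calling sim_scrub_and_enable() first leaves the data untouched and makes
the same pass come back clean, which is the ordering argued for above.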