Date:      Tue, 25 Sep 2001 10:06:14 -0400 (EDT)
From:      Andrew Gallatin <gallatin@cs.duke.edu>
To:        Peter Wemm <peter@wemm.org>
Cc:        freebsd-hackers@FreeBSD.ORG
Subject:   Re: ecc on i386 
Message-ID:  <15280.36694.786500.622681@grasshopper.cs.duke.edu>
In-Reply-To: <20010925012041.CC9613808@overcee.netplex.com.au>
References:  <15279.54029.454089.299807@grasshopper.cs.duke.edu> <20010925012041.CC9613808@overcee.netplex.com.au>


Peter Wemm writes:


Thanks for your description of how ECC is reported on PCs.  That was
very, very helpful.

 > The Tyan Thunder 2510 BIOS even disables ECC -> NMI routing so you have to
 > go to quite a bit of trouble to reprogram the serverworks chipset to
 > actually generate NMI's so that you can find out if something got trashed.

Is that the HE-SL or the LE-3 chipset?  Is that code available?
I have some LE-3 based boxes which I'd like to be certain DTRT.

Unlike my wife's Dual Athlon, these boxes have nothing in their
BIOS pertaining to ECC error reporting. (Supermicro 370-DLE)

 > Our NMI / ECC handling really really sucks in FreeBSD. Consider:
 > - i686_pagezero - reads before writing in order to minimize cache snooping
 > traffic in SMP systems.  However, if it gets an NMI while trying to check
 > if the cache line is already zero, it will take the entire machine down
 > instead of just zeroing the line.
 > - NFS / VM / bio:  when they get an NMI while trying to copy data that is
 > clean and backed by storage, they take the machine down instead of trying
 > to recover and re-read the page.
 > - userland.. If userland gets an NMI, the machine dies instead of killing
 > the process (or rereading a text page etc if possible)
 > - our NMI handlers are a festering pile of excrement.  They don't have
 > the code to 'ack' the NMI so it isn't possible to return after recovery.
 > - and so on.
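
For anyone following along, the read-before-write idea in the first
item can be sketched in C roughly as below.  This is only an
illustration, not the kernel's actual i686_pagezero routine; the name
sketch_pagezero, the SKETCH_PAGE_SIZE constant and the returned write
count are all made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed page size for the illustration (4 KB, as on i386). */
#define SKETCH_PAGE_SIZE 4096

/*
 * Hypothetical sketch of read-before-write page zeroing.
 * Each word is read first and written only if it is non-zero, so
 * lines that are already zero are never dirtied.  Returns the number
 * of words actually written, purely so the behaviour can be observed.
 */
static size_t
sketch_pagezero(void *page)
{
	uintptr_t *p = page;
	size_t nwords = SKETCH_PAGE_SIZE / sizeof(*p);
	size_t written = 0;
	size_t i;

	assert(page != NULL);

	for (i = 0; i < nwords; i++) {
		/* The read: this is the access that can hit bad memory. */
		if (p[i] != 0) {
			p[i] = 0;
			written++;
		}
	}
	return (written);
}
```

The point of the quoted complaint is visible in the loop: the load
that checks for zero can trip over an uncorrectable error even though
the old contents are about to be thrown away anyway.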

Well, at least we take the machine down, which is a heck of a lot
better than ignoring the problem, which is really all that I was
hoping for. 

Thanks again,

Drew
