Date: Fri, 15 Oct 2004 14:56:21 -0700 (PDT)
From: Don Lewis <truckman@FreeBSD.org>
To: bob@boulderlabs.com
Cc: freebsd-current@FreeBSD.org
Subject: Re: 5.3-BETA7 install cd: kernel trap 12 with interrupts disabled (fwd)
Message-ID: <200410152156.i9FLuLkf082072@gw.catspoiler.org>
In-Reply-To: <200410151741.i9FHfMxQ036620@vec.boulderlabs.com>

On 15 Oct, Robert Gray wrote:
> Yes, a built-in memory tester would be helpful. However, my experience
> is that many memory problems pass the x86memtest, but cause
> "seg faults" during buildworld, partly due to the extra load and heat
> from disk activity.
>
> I'm a firm believer that we should encourage our users to
> buy ECC systems - motherboards that support ECC, and the
> more expensive SIMMs/DIMMs that have redundant bits.

So do I, but we don't have any support for reporting ECC errors.
Hardware ECC support will paper over defective memory that has bad
bits, but it won't be reliable. Frequent correctable ECC errors are a
good indication that there is a hardware problem that needs to be
fixed. Blindly turning ECC on will make hardware problems harder to
detect and fix.

I have one motherboard/memory combo (ECC on both) that sets the memory
timing incorrectly (the memory is rated CL 2.5, but the BIOS configures
it as CL 2 when it is configured to set the timing automatically). I
was seeing files occasionally get corrupted when they were cached in
RAM (/usr/src and /usr/obj would get hit), and some of the longer
running tests in memtest86 would detect the problem. The problem went
away when I manually set the memory timing to the correct value.