Date:      Mon, 18 Aug 1997 23:37:07 -0600 (MDT)
From:      Kenneth Merry <ken@plutotech.com>
To:        asami@cs.berkeley.edu (Satoshi Asami)
Cc:        hardware@FreeBSD.ORG
Subject:   Re: parity errors
Message-ID:  <199708190537.XAA11729@pluto.plutotech.com>
In-Reply-To: <199708182317.QAA11642@vader.cs.berkeley.edu> from Satoshi Asami at "Aug 18, 97 04:17:53 pm"

Satoshi Asami wrote...
> What am I supposed to see when there is a parity error on the main
> memory?  I have a few memory modules I suspect to be bad, so I put
> them (256MB total) in our package building machine and tried a "make
> world", and got one "kernel page fault" (or something like that) and
> two lockups (no message on console).  I disabled parity check in the
> BIOS, and world aborted once with a sh seg-faulting and once with
> a syntax error from make.
> 
> At this point, I think it is pretty clear that the memory's at fault,
> but shouldn't I see some "NMI" type messages?  (If I grepped
> correctly, it should be the "NMI indicates hardware failure" at line
> 265 in /sys/i386/i386/trap.c.)
> 
> This is with a P6-200 (not overclocked) on an Intel Venus motherboard.

	I agree, you should see some sort of error.  I had some RAM trouble
on one of my machines (ASUS P/I-XP6NP5 MB), and I got a message that
specifically said "RAM parity error".  It must have come from
/sys/i386/isa/intr_machdep.c.  (I grepped for "parity error".)

	Of course I had parity memory in there, and I had either parity or
ECC checking turned on.

	I doubt you'd get any NMI messages unless you have parity checking
turned on in the BIOS.  If you still see no message with parity checking
on, try turning on ECC support instead.

	I would put the SIMMs in the machine two at a time and swap them
around until you isolate the bad ones.  Of course that'll take a while, I
imagine, with a make world test.  One test that sometimes worked for me was
to crank up a ton of xv processes with big pictures -- enough to eat up all
the RAM and some of the swap.  That usually crashed the machine with a
"RAM parity error".  Of course you'd want to display the xv processes on a
remote machine so that the console stays free and you can catch the panic
message.
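
	The xv trick is really just a way of touching a lot of memory.  A
minimal sketch of the same idea as a small user-space script is below.  It
is an illustration only, not something from the original test: the name
pattern_test, the bit patterns, and the 64MB buffer size are all
assumptions.  A user process cannot pick which physical SIMM it lands on,
so this may well miss a bad module, and on a machine with parity checking
enabled a bad read would more likely show up as an NMI or panic than as a
reported mismatch.

```python
def pattern_test(size_bytes, passes=1, patterns=(0x55, 0xAA, 0x00, 0xFF)):
    """Fill a buffer with each bit pattern and verify the read-back.

    Returns a list of (pass_number, offset, expected, actual) tuples,
    one per mismatching byte.  An empty list means no mismatch was seen.
    """
    buf = bytearray(size_bytes)
    errors = []
    for n in range(passes):
        for p in patterns:
            # Write the pattern over the whole buffer.
            buf[:] = bytes([p]) * size_bytes
            # Fast check first: every byte should still equal the pattern.
            if buf.count(p) != size_bytes:
                # Slow path: record each byte that differs.
                errors.extend(
                    (n, i, p, b) for i, b in enumerate(buf) if b != p
                )
    return errors


if __name__ == "__main__":
    # Hypothetical size; raise it until the machine starts to swap.
    bad = pattern_test(64 * 1024 * 1024)
    print("mismatches:", len(bad))
```

	Running several copies at once, sized so that together they exceed
physical RAM, gets closer to what the pile of xv processes did.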

Hope this helps,

Ken
-- 
Kenneth Merry
ken@plutotech.com


