Date:      Fri, 28 Apr 2006 16:02:31 -0400 (EDT)
From:      brad miele <bmiele@ipnstock.com>
To:        freebsd-proliant@freebsd.org
Subject:   memory errors
Message-ID:  <20060428155252.E879@shanty.ipnstock.com>

Hi,

We have been dealing with phantom reboots on one of our DL380s for a few 
months. After finally getting hpasmd running, I started seeing the 
following in /var/log/messages just prior to the reboots:

Apr 26 11:38:59 bwayipn02 hpasmd[669]: WARNING: hpasmd: Corrected Memory 
Error threshold exceeded (System Memory, Memory Module 2)
Apr 26 11:39:00 bwayipn02 kernel: pid 669 (hpasmd), uid 0: exited on 
signal 11 (core dumped)
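
For anyone wanting to check their own logs for the same pattern, here is a minimal sketch in Python. The regular expression is an assumption modeled only on the single warning quoted above, and the function name find_memory_warnings is made up for illustration; other hpasmd versions may word the message differently.

```python
import re

# Assumed pattern, based solely on the one hpasmd warning quoted above.
WARNING_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+[\d:]{8})\s+(?P<host>\S+)\s+hpasmd\[\d+\]:\s+"
    r"WARNING: hpasmd: Corrected Memory Error threshold exceeded\s+"
    r"\(System Memory, Memory Module (?P<module>\d+)\)"
)


def find_memory_warnings(lines):
    """Return (timestamp, host, module number) for each matching syslog line."""
    hits = []
    for line in lines:
        match = WARNING_RE.match(line)
        if match:
            hits.append(
                (match.group("stamp"), match.group("host"), int(match.group("module")))
            )
    return hits
```

It could be fed the lines of /var/log/messages (unwrapped, one entry per line) and the resulting timestamps compared by hand against the reboot times.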

In reviewing the management log in iLO, the errors did correspond with the 
phantom reboots. HP sent new RAM and we replaced it. The server was up 
under minimal load for two days, and then today the same thing happened, 
with the same error.

HP is sending a new board and more new RAM, and we are going to try that.

My question is: is there anything at the OS or software level that could 
cause this behavior, or is it most likely bad hardware? I am concerned 
because I noted an instance of the same error in the iLO logs of our other 
DL380, although on module 1, and I thought that the odds of both having 
bad RAM/boards might be slim.

Thanks,

Brad
---------------------
Brad Miele
bmiele@ipnstock.com
