Date: Tue, 27 Mar 2007 23:02:54 +0100 (BST)
From: Gavin Atkinson <gavin.atkinson@ury.york.ac.uk>
To: Eirik Øverby <ltning@anduin.net>
Cc: stable@freebsd.org
Subject: Re: Weird messages output
Message-ID: <20070327224208.E64587@ury.york.ac.uk>
In-Reply-To: <2B277B33-A56F-4E77-9E57-7F4777B22D2F@anduin.net>
References: <1D425A90-3619-48B7-8171-13ECC9A31087@anduin.net> <1175002427.44767.41.camel@buffy.york.ac.uk> <2B277B33-A56F-4E77-9E57-7F4777B22D2F@anduin.net>

On Tue, 27 Mar 2007, Eirik Øverby wrote:

> On 27. mar. 2007, at 15.33, Gavin Atkinson wrote:
>> On Tue, 2007-03-27 at 15:00 +0200, Eirik Øverby wrote:
>>> Hi all,
>>>
>>> running 6.1-RELEASE on several HP DL385 servers (identically
>>> configured), one of them has recently spat the following out in the
>>> /var/log/messages file:
>>>
>>> ..........
>>> Mar 10 03:51:24 apphost02 ntpd[445]: kernel time sync enabled 2001
>>> Mar 10 05:02:01 apphost02 kernel: NMI ISA 30, EISA ff
>>> ..........
>>
>> I suspect you'll find your (ECC) memory has problems.
>
> You are absolutely correct. Further investigation using the ProLiant
> management tools for FreeBSD revealed serious RAM trouble. Two banks were
> degraded, so we have now had the modules replaced on-site.

Glad to be of help!

> Thanks for the tip!
> Do you happen to know if there are any "generic" tools/daemons available to
> decipher such NMIs? Perhaps be able to send SNMP traps or something?

I don't, to be honest. There is some code in /usr/src/sys/i386/isa/nmi.c
that tries to detect the cause of an NMI, although I don't remember ever
seeing the messages when a parity error was detected. I guess it's
possible that (to some chipset vendor at least) 0x20 and 0x30 indicate
parity error, but neither our code nor Linux's (see
http://fxr.watson.org/fxr/source/arch/i386/kernel/traps.c?v=linux-2.6#L743 )
know those codes to mean parity error.

Gavin
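
Editorial sketch, not part of the original message: the point about 0x20 and 0x30 can be illustrated with a small user-space decoder for the "NMI ISA xx, EISA yy" log line. It assumes the conventional PC/AT layout of the ISA status byte, where bit 7 (0x80) flags a RAM parity error and bit 6 (0x40) an I/O channel check; these are believed to be the two bits the code in nmi.c and the Linux traps.c test, but that is an assumption here and should be checked against the source. The function names are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Assumed PC/AT status bits for the ISA NMI status byte (port 0x61). */
#define NMI_PARITY  0x80u  /* bit 7: RAM parity error */
#define NMI_CHANCHK 0x40u  /* bit 6: I/O channel check */

/*
 * Hypothetical helper: extract the two hex values from a syslog line
 * such as "... kernel: NMI ISA 30, EISA ff".
 * Returns 1 on success, 0 if the line does not contain that pattern.
 */
int
parse_nmi_line(const char *line, unsigned int *isa, unsigned int *eisa)
{
	const char *p = strstr(line, "NMI ISA ");

	if (p == NULL)
		return (0);
	return (sscanf(p, "NMI ISA %x, EISA %x", isa, eisa) == 2);
}

/* Hypothetical helper: describe the ISA status byte using only the two known bits. */
const char *
describe_isa_nmi(unsigned int isa)
{
	if ((isa & NMI_PARITY) && (isa & NMI_CHANCHK))
		return ("RAM parity error and I/O channel check");
	if (isa & NMI_PARITY)
		return ("RAM parity error");
	if (isa & NMI_CHANCHK)
		return ("I/O channel check");
	return ("unknown cause");
}
```

Under these assumptions, the value 0x30 from the quoted log line has neither bit set, so describe_isa_nmi() reports an unknown cause, which is consistent with no parity message having been logged alongside the NMI.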