Date:      Mon, 09 Apr 2012 12:04:27 +0200
From:      "O. Hartmann" <ohartman@zedat.fu-berlin.de>
To:        Miroslav Lachman <000.fbsd@quip.cz>
Cc:        Nikolay Denev <ndenev@gmail.com>, freebsd-performance@freebsd.org, Current FreeBSD <freebsd-current@freebsd.org>
Subject:   Re: ECC memory driver in FreeBSD 10?
Message-ID:  <4F82B42B.1050900@zedat.fu-berlin.de>
In-Reply-To: <4F818A3B.5040904@quip.cz>
References:  <4F7ED7F4.5060509@zedat.fu-berlin.de> <687BFFD7-1456-4D7B-AFB2-356EE9B0D1DD@gmail.com> <4F818A3B.5040904@quip.cz>

On 04/08/12 14:53, Miroslav Lachman wrote:
> Nikolay Denev wrote:
>> On Apr 6, 2012, at 2:48 PM, O. Hartmann wrote:
>>
>>> I'm looking for a way to force FreeBSD 10 to maintain/watch ECC errors
>>> reported by UEFI (or BIOS).
>>> Since ECC is said to be essential for server systems both in business
>>> and science and I do not question this, I was wondering if I can not
>>> report ECC errors via a watchdog or UEFI (ACPI?) report to syslog
>>> facility on FreeBSD.
>>> FreeBSD is supposed to be a server operating system, as far as I know,
>>> so I believe there must be something which has not revealed itself
>>> to me yet.
> 
>>
>> If the hardware supports it, such errors should be logged as MCEs
>> (Machine Check Exceptions).
>> I can say for sure it works pretty well with Dell servers, as I had 
>> one with failing RAM module, and
>> it reported the corrected ECC errors in dmesg.
> 
> Memory ECC errors are logged to messages and you can decode them with
> sysutils/mcelog. I did it in the past on one of our Sun Fire X2100 M2
> with FreeBSD 8.x.
> 
> Miroslav Lachman

Seems that I have been blessed with non-faulty memory over the past
three or four years. Last time I saw errors was around 2000. All of our
24/7 servers do have ECC RAM.

So your replies all imply that if I log the system's messages via syslog
properly (as we do remotely on a centralized server), ECC errors should
be reported by the FreeBSD kernel in a canonical way, as the UEFI/BIOS
reports them?
Without special drivers or tools, scripts which scan for those errors
should report occurrences?
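
A minimal sketch of such a scanning script, assuming machine-check
records show up in the logged messages with an "MCA:" tag. That tag is
an assumption about the log format, not something confirmed in this
thread, so adjust the pattern to whatever your kernel actually prints.
The name find_mca_lines is illustrative only.

```python
import re

# Assumption: machine-check records in the log carry an "MCA:" tag.
MCA_PATTERN = re.compile(r"\bMCA:")


def find_mca_lines(lines):
    """Return the log lines that look like machine-check records."""
    return [line.rstrip("\n") for line in lines if MCA_PATTERN.search(line)]


# Example use against a log file (the path is an assumption):
#
#     with open("/var/log/messages", errors="replace") as log:
#         for hit in find_mca_lines(log):
#             print(hit)
```

Run from cron on the central log server, something like this would only
flag candidate lines; decoding them is still a job for a tool such as
sysutils/mcelog, as mentioned above.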

That my (FreeBSD) boxes haven't shown errors of that kind - Linux
boxes of a colleague did once! - doesn't imply missing capabilities.
This is nice to hear/read.

Thanks a lot,

Oliver


