FreeBSD Mail Archives

Date:      Wed, 16 Sep 2015 08:51:53 +0100
From:      Bob Bishop <rb@gid.co.uk>
To:        freebsd-hardware@freebsd.org
Cc:        Andriy Gapon <avg@freebsd.org>, freebsd-hackers@freebsd.org, Dieter BSD <dieterbsd@gmail.com>, Konstantin Belousov <kostikbel@gmail.com>
Subject:   Re: ECC support
Message-ID:  <93871ADA-EDA3-481C-9959-1D371AB44479@gid.co.uk>
In-Reply-To: <20150916035904.GE67105@kib.kiev.ua>
References:  <CAA3ZYrBXZn1WpHWYGJYWJDPsk7iDahCas8RhnHC4w%2Babf4w4hA@mail.gmail.com> <55F88A18.6090504@FreeBSD.org> <20150916035904.GE67105@kib.kiev.ua>

Hi,

Arriving late to this thread, a few observations:

- Obviously the more RAM you have, the more errors you are going to see. In other words, ECC makes increasing sense as RAM sizes get larger. All server-class hardware should have it.

- DRAM has to be refreshed. In sensible designs, ECC scrub is integrated with refresh to minimise overhead. Scrubbing doesn’t have to be very frequent, maybe once every 24 hours.

- On server-class hardware, the platform management (BMC or whatever) should be picking up, logging, and possibly alarming on ECC errors regardless of the OS.

- You might think that as memory density increases (i.e. bit cell size shrinks), error rates would increase. Apparently this wasn’t so, at least up to 2009; see:

 http://www.cs.toronto.edu/~bianca/papers/sigmetrics09.pdf

which reports on a study of these issues across Google’s estate at the time. I don’t know of any more recent similar work.
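
[Editorial sketch, not part of the original message.] The first two points above are simple arithmetic: if errors arrive at a roughly constant rate per unit of memory, the expected count grows linearly with capacity, and a full scrub pass spread over 24 hours needs only a small average read rate. The Python below illustrates this. The function names are hypothetical, the linear per-GiB model is an assumption, and the error rate is a parameter the caller must supply; no measured rate is implied here.

```python
# Back-of-envelope sketch. The linear error model and all names are
# illustrative assumptions, not taken from the message or the cited paper.

GIB = 1024 ** 3
SECONDS_PER_DAY = 24 * 60 * 60


def scrub_bandwidth_bytes_per_s(ram_bytes, period_s=SECONDS_PER_DAY):
    """Average read rate needed to touch every byte once per period."""
    return ram_bytes / period_s


def expected_errors(ram_gib, errors_per_gib_per_year, years=1.0):
    """Expected error count, assuming a constant rate per GiB (an assumption)."""
    return ram_gib * errors_per_gib_per_year * years


if __name__ == "__main__":
    for ram_gib in (8, 64, 512):
        bw = scrub_bandwidth_bytes_per_s(ram_gib * GIB)
        print(f"{ram_gib:4d} GiB: one scrub pass per day needs "
              f"about {bw / 1024 ** 2:.2f} MiB/s on average")
```

Under this model, doubling the RAM doubles both the expected error count and the average scrub read rate, but the scrub rate stays tiny next to normal memory bandwidth.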

--
Bob Bishop
rb@gid.co.uk
