Date: Wed, 16 Sep 2015 08:51:53 +0100
From: Bob Bishop <rb@gid.co.uk>
To: freebsd-hardware@freebsd.org
Cc: Andriy Gapon <avg@freebsd.org>, freebsd-hackers@freebsd.org, Dieter BSD <dieterbsd@gmail.com>, Konstantin Belousov <kostikbel@gmail.com>
Subject: Re: ECC support
Message-ID: <93871ADA-EDA3-481C-9959-1D371AB44479@gid.co.uk>
In-Reply-To: <20150916035904.GE67105@kib.kiev.ua>
References: <CAA3ZYrBXZn1WpHWYGJYWJDPsk7iDahCas8RhnHC4w%2Babf4w4hA@mail.gmail.com> <55F88A18.6090504@FreeBSD.org> <20150916035904.GE67105@kib.kiev.ua>
Hi,

Arriving late to this thread, a few observations:

- Obviously the more RAM you have, the more errors you are going to see. In other words, ECC makes increasing sense as RAM sizes get larger. All server-class hardware should have it.

- DRAM has to be refreshed. In sensible designs, ECC scrub is integrated with refresh to minimise overhead. It doesn’t have to be very frequent, maybe every 24 hours.

- On server-class hardware, the platform management (BMC or whatever) should be picking up, logging, and possibly alarming on ECC errors regardless of the OS.

- You might think that as memory density increases (ie bit cell size shrinks), error rates would increase. Apparently this wasn’t so up to 2009 at least, see:

http://www.cs.toronto.edu/~bianca/papers/sigmetrics09.pdf

which reports on a study of these issues across Google’s estate at the time. I don’t know of any more recent similar work.

-- 
Bob Bishop
rb@gid.co.uk
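
As a rough illustration of the first point above (more RAM, more errors): if each GB of DRAM is assumed to fail independently at some fixed rate, the expected error count grows linearly with capacity, and the chance of seeing at least one error approaches certainty quite quickly. The rate used here is a made-up placeholder, not a figure from the paper linked above, and the function names are invented for this sketch. Real errors may well be correlated rather than independent, so treat this as a toy model only.

```python
import math

# HYPOTHETICAL rate, chosen only for illustration; not a measured value.
ERRORS_PER_GB_PER_YEAR = 0.5


def expected_errors(ram_gb, years=1.0, rate=ERRORS_PER_GB_PER_YEAR):
    """Expected error count, assuming a constant, independent per-GB rate."""
    return ram_gb * years * rate


def prob_at_least_one(ram_gb, years=1.0, rate=ERRORS_PER_GB_PER_YEAR):
    """P(at least one error), assuming errors arrive as a Poisson process."""
    return 1.0 - math.exp(-expected_errors(ram_gb, years, rate))


if __name__ == "__main__":
    # Compare a few RAM sizes under the same assumed rate.
    for ram_gb in (4, 32, 256):
        print(f"{ram_gb:4d} GB: expected {expected_errors(ram_gb):6.1f} errors/yr, "
              f"P(>=1) = {prob_at_least_one(ram_gb):.3f}")
```

Under this model, doubling the RAM doubles the expected number of errors whatever rate is plugged in, which is the sense in which ECC matters more as memory sizes grow.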