From owner-freebsd-hardware@freebsd.org  Wed Sep 16 07:52:07 2015
Return-Path: <owner-freebsd-hardware@freebsd.org>
Delivered-To: freebsd-hardware@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id C90A19C270F;
 Wed, 16 Sep 2015 07:52:07 +0000 (UTC) (envelope-from rb@gid.co.uk)
Received: from mx0.gid.co.uk (mx0.gid.co.uk [194.32.164.250])
 (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 748D71E14;
 Wed, 16 Sep 2015 07:52:07 +0000 (UTC) (envelope-from rb@gid.co.uk)
Received: from [194.32.164.24] (80-46-130-69.static.dsl.as9105.com
 [80.46.130.69])
 by mx0.gid.co.uk (8.14.2/8.14.2) with ESMTP id t8G7pw6u085964;
 Wed, 16 Sep 2015 08:51:58 +0100 (BST) (envelope-from rb@gid.co.uk)
Content-Type: text/plain; charset=utf-8
Mime-Version: 1.0 (Mac OS X Mail 8.2 \(2104\))
Subject: Re: ECC support
From: Bob Bishop <rb@gid.co.uk>
In-Reply-To: <20150916035904.GE67105@kib.kiev.ua>
Date: Wed, 16 Sep 2015 08:51:53 +0100
Cc: Andriy Gapon <avg@freebsd.org>, freebsd-hackers@freebsd.org,
 Dieter BSD <dieterbsd@gmail.com>, Konstantin Belousov <kostikbel@gmail.com>
Content-Transfer-Encoding: quoted-printable
Message-Id: <93871ADA-EDA3-481C-9959-1D371AB44479@gid.co.uk>
References: <CAA3ZYrBXZn1WpHWYGJYWJDPsk7iDahCas8RhnHC4w+abf4w4hA@mail.gmail.com>
 <55F88A18.6090504@FreeBSD.org> <20150916035904.GE67105@kib.kiev.ua>
To: freebsd-hardware@freebsd.org
X-Mailer: Apple Mail (2.2104)
X-BeenThere: freebsd-hardware@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: General discussion of FreeBSD hardware <freebsd-hardware.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hardware>, 
 <mailto:freebsd-hardware-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hardware/>
List-Post: <mailto:freebsd-hardware@freebsd.org>
List-Help: <mailto:freebsd-hardware-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hardware>, 
 <mailto:freebsd-hardware-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 16 Sep 2015 07:52:07 -0000

Hi,

Arriving late to this thread, a few observations:

- Obviously the more RAM you have, the more errors you are going to see.
In other words, ECC makes increasing sense as RAM sizes get larger. All
server-class hardware should have it.

- DRAM has to be refreshed. In sensible designs, ECC scrub is integrated
with refresh to minimise overhead. It doesn’t have to be very frequent,
maybe every 24 hours.

- On server-class hardware, the platform management (BMC or whatever)
should be picking up, logging, and possibly alarming on ECC errors
regardless of the OS.

- You might think that as memory density increases (i.e. bit cell size
shrinks), error rates would increase. Apparently this wasn’t so, up to
2009 at least; see:

 http://www.cs.toronto.edu/~bianca/papers/sigmetrics09.pdf

which reports on a study of these issues across Google’s estate at the
time. I don’t know of any more recent similar work.
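
As a rough back-of-envelope sketch of the first two points above (not
taken from the paper): if errors are assumed to scale linearly with
capacity, the expected count grows with RAM size, while a full scrub
pass once every 24 hours needs only a small average read rate. The
function names and the 0.5 errors per GiB per year figure are
hypothetical placeholders, not measured values.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def scrub_bandwidth_mib_per_s(ram_gib, period_s=SECONDS_PER_DAY):
    """Average read rate (MiB/s) needed to touch all of RAM once per period."""
    return ram_gib * 1024 / period_s


def expected_errors(ram_gib, errors_per_gib_year, years=1.0):
    """Expected error count, ASSUMING errors scale linearly with capacity."""
    return ram_gib * errors_per_gib_year * years


if __name__ == "__main__":
    # Hypothetical placeholder rate, for illustration only.
    rate = 0.5
    for ram_gib in (16, 64, 256):
        bw = scrub_bandwidth_mib_per_s(ram_gib)
        errs = expected_errors(ram_gib, rate)
        print(f"{ram_gib:4d} GiB: scrub ~{bw:.2f} MiB/s, ~{errs:.0f} errors/year")
```

Even at 256 GiB, a once-a-day pass averages only about 3 MiB/s, which
is tiny next to normal memory bandwidth. The linear error model is a
simplification.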

--
Bob Bishop
rb@gid.co.uk