From owner-svn-src-head@FreeBSD.ORG Thu Jul 24 10:14:52 2014 Return-Path: Delivered-To: svn-src-head@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 8052D895; Thu, 24 Jul 2014 10:14:52 +0000 (UTC) Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:1900:2254:2068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 617E82E78; Thu, 24 Jul 2014 10:14:52 +0000 (UTC) Received: from svn.freebsd.org ([127.0.1.70]) by svn.freebsd.org (8.14.8/8.14.8) with ESMTP id s6OAEq7I048268; Thu, 24 Jul 2014 10:14:52 GMT (envelope-from marius@svn.freebsd.org) Received: (from marius@localhost) by svn.freebsd.org (8.14.8/8.14.8/Submit) id s6OAEqKE048267; Thu, 24 Jul 2014 10:14:52 GMT (envelope-from marius@svn.freebsd.org) Message-Id: <201407241014.s6OAEqKE048267@svn.freebsd.org> From: Marius Strobl Date: Thu, 24 Jul 2014 10:14:52 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: svn commit: r269052 - head/sys/x86/x86 X-SVN-Group: head MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-head@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: SVN commit messages for the src tree for head/-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 24 Jul 2014 10:14:52 -0000 Author: marius Date: Thu Jul 24 10:14:51 2014 New Revision: 269052 URL: http://svnweb.freebsd.org/changeset/base/269052 Log: Intel desktop Haswell CPUs may report benign corrected parity errors (see HSD131 erratum in [1]) at a considerable rate. So filter these (default), unless logging is enabled. Unfortunately, there really is no better way to reasonably implement suppressing these errors than to just skipping them in mca_log(). Given that they are reported for bank 0, they'd need to be masked in MSR_MC0_CTL. However, P6 family processors require that register to be set to either all 0s or all 1s, disabling way more than the one error in question when using all 0s there. Alternatively, it could be masked for the corresponding CMCI, but that still wouldn't keep the periodic scanner from detecting these spurious errors. Apart from that, register contents of MSR_MC0_CTL{,2} don't seem to be publicly documented, neither in the Intel Architectures Developer's Manual nor in the Haswell datasheets. Note that while HSD131 actually is only about C0-stepping as of revision 014 of the Intel desktop 4th generation processor family specification update, these corrected errors also have been observed with D0-stepping aka "Haswell Refresh". 1: http://www.intel.de/content/dam/www/public/us/en/documents/specification-updates/4th-gen-core-family-desktop-specification-update.pdf Reviewed by: jhb MFC after: 3 days Sponsored by: Bally Wulff Games & Entertainment GmbH Modified: head/sys/x86/x86/mca.c Modified: head/sys/x86/x86/mca.c ============================================================================== --- head/sys/x86/x86/mca.c Thu Jul 24 10:12:22 2014 (r269051) +++ head/sys/x86/x86/mca.c Thu Jul 24 10:14:51 2014 (r269052) @@ -99,6 +99,10 @@ static int amd10h_L1TP = 1; SYSCTL_INT(_hw_mca, OID_AUTO, amd10h_L1TP, CTLFLAG_RDTUN, &amd10h_L1TP, 0, "Administrative toggle for logging of level one TLB parity (L1TP) errors"); +static int intel6h_HSD131; +SYSCTL_INT(_hw_mca, OID_AUTO, intel6h_HSD131, CTLFLAG_RDTUN, &intel6h_HSD131, 0, + "Administrative toggle for logging of spurious corrected errors"); + int workaround_erratum383; SYSCTL_INT(_hw_mca, OID_AUTO, erratum383, CTLFLAG_RD, &workaround_erratum383, 0, "Is the workaround for Erratum 383 on AMD Family 10h processors enabled?"); @@ -242,12 +246,34 @@ mca_error_mmtype(uint16_t mca_error) return ("???"); } +static int __nonnull(1) +mca_mute(const struct mca_record *rec) +{ + + /* + * Skip spurious corrected parity errors generated by desktop Haswell + * (see HSD131 erratum) unless reporting is enabled. + * Note that these errors also have been observed with DO-stepping, + * while the revision 014 desktop Haswell specification update only + * talks about CO-stepping. + */ + if (rec->mr_cpu_vendor_id == CPU_VENDOR_INTEL && + rec->mr_cpu_id == 0x306c3 && rec->mr_bank == 0 && + rec->mr_status == 0x90000040000f0005 && !intel6h_HSD131) + return (1); + + return (0); +} + /* Dump details about a single machine check. */ static void __nonnull(1) mca_log(const struct mca_record *rec) { uint16_t mca_error; + if (mca_mute(rec)) + return; + printf("MCA: Bank %d, Status 0x%016llx\n", rec->mr_bank, (long long)rec->mr_status); printf("MCA: Global Cap 0x%016llx, Status 0x%016llx\n",