From: John Baldwin
To: freebsd-current@freebsd.org
Cc: Steve Kargl
Date: Fri, 23 Apr 2010 09:48:28 -0400
Subject: Re: MCA messages in /var/log/message?
Message-Id: <201004230948.29096.jhb@freebsd.org>
In-Reply-To: <20100422222834.GA93197@troutmask.apl.washington.edu>

On Thursday 22 April 2010 6:28:34 pm Steve Kargl wrote:
> How does one interpret the following MCA message?
>
> MCA: Bank 4, Status 0x945a4000d6080a13
> MCA: Global Cap 0x0000000000000105, Status 0x0000000000000000
> MCA: Vendor "AuthenticAMD", ID 0xf5a, APIC ID 0
> MCA: CPU 0 COR BUSLG Responder RD Memory
> MCA: Address 0x70c42280
> MCA: Bank 4, Status 0x942140012a080813
> MCA: Global Cap 0x0000000000000105, Status 0x0000000000000000
> MCA: Vendor "AuthenticAMD", ID 0xf5a, APIC ID 1
> MCA: CPU 1 COR BUSLG Source RD Memory
> MCA: Address 0x1b97ca578
>
> It appears that these messages coincide with a 15 to 30
> second period where my USB mouse inexplicably loses a
> large number of button clicks, (which is quite noticeable
> with firefox3).

If you have access to p4, you can download a patched version of mcelog from
//depot/projects/mcelog/... (you have to build it with 'make FREEBSD=yes'),
which will parse these for you.  Hmm, I ran it and here is what it said:

HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 0 4 northbridge
ADDR 70c42280
  Northbridge RAM Chipkill ECC error
  Chipkill ECC syndrome = d6b4
       bit46 = corrected ecc error
  bus error 'local node response, request didn't time out
      generic read mem transaction
      memory access, level generic'
STATUS 945a4000d6080a13 MCGSTATUS 0
MCGCAP 105 APICID 0 SOCKETID 0
CPUID Vendor AMD Family 15 Model 5
HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 1 4 northbridge
ADDR 1b97ca578
  Northbridge RAM Chipkill ECC error
  Chipkill ECC syndrome = 2a42
       bit32 = err cpu0
       bit46 = corrected ecc error
  bus error 'local node origin, request didn't time out
      generic read mem transaction
      memory access, level generic'
STATUS 942140012a080813 MCGSTATUS 0
MCGCAP 105 APICID 1 SOCKETID 0
CPUID Vendor AMD Family 15 Model 5

Note that both are corrected errors, so the RAM may not actually be bad; these
may just be transient failures.

-- 
John Baldwin
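
For anyone without p4 access, most of what mcelog reports can be read straight out of the raw Status word. What follows is only a rough sketch in Python, not the mcelog or kernel code. It assumes the architectural MCi_STATUS flag layout (bit 63 valid, 61 uncorrected, 58 address valid, and so on) and the compound bus-error code format in the low 16 bits. The bit-46 "corrected ECC" flag and the Chipkill syndrome split across bits 31:24 and 54:47 are assumed from the K8 northbridge (bank 4) layout. The function name and dictionary keys are made up for illustration.

```python
# Lookup tables for the bus-error fields of the 16-bit MCA error code.
PARTICIPATION = ["Source", "Responder", "Observer", "Generic"]
REQUEST = ["ERR", "RD", "WR", "DRD", "DWR", "IRD", "PREFETCH", "EVICT", "SNOOP"]
MEM_IO = ["Memory", "Reserved", "I/O", "Other"]
LEVEL = ["L0", "L1", "L2", "LG"]


def decode_mca_status(status):
    """Decode one 64-bit MCi_STATUS value into a dict (illustrative helper)."""

    def bit(n):
        return bool((status >> n) & 1)

    result = {
        "valid": bit(63),
        "overflow": bit(62),
        "uncorrected": bit(61),
        "enabled": bit(60),
        "misc_valid": bit(59),
        "addr_valid": bit(58),
        "context_corrupt": bit(57),
    }

    # Low 16 bits: MCA error code. Bus errors look like 0000 1PPT RRRR IILL.
    code = status & 0xFFFF
    result["error_code"] = code
    if (code & 0xF800) == 0x0800:
        rrrr = (code >> 4) & 0xF
        result["kind"] = "BUS"
        result["participation"] = PARTICIPATION[(code >> 9) & 0x3]
        result["timeout"] = bool((code >> 8) & 1)
        result["request"] = REQUEST[rrrr] if rrrr < len(REQUEST) else "Reserved"
        result["mem_io"] = MEM_IO[(code >> 2) & 0x3]
        result["level"] = LEVEL[code & 0x3]

    # Assumption: K8 northbridge (bank 4) fields, model-specific.
    result["corrected_ecc"] = bit(46)
    result["syndrome"] = (((status >> 24) & 0xFF) << 8) | ((status >> 47) & 0xFF)
    return result


if __name__ == "__main__":
    for raw in (0x945A4000D6080A13, 0x942140012A080813):
        d = decode_mca_status(raw)
        print(hex(raw), d["participation"], d["request"], d["mem_io"],
              "syndrome", hex(d["syndrome"]),
              "corrected" if not d["uncorrected"] else "UNCORRECTED")
```

Fed the two Status values from the log, this gives Responder/RD/Memory and Source/RD/Memory with syndromes 0xd6b4 and 0x2a42, which agrees with both the kernel's one-line summary and the mcelog output. The syndrome and bit-46 decoding only make sense for bank 4 on this CPU family; other banks and vendors use those bits differently.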