Date: Wed, 11 Aug 2010 13:38:44 -0400
From: John Baldwin <jhb@FreeBSD.org>
To: Dan Langille <dan@langille.org>
Cc: Andrew Heybey <ath@niksun.com>, freebsd-hackers@freebsd.org
Subject: Re: 8.1-STABLE amd64 machine check
Message-ID: <4C62E024.2090702@FreeBSD.org>
In-Reply-To: <4C627FB1.5060007@langille.org>
References: <4C627FB1.5060007@langille.org>

Dan Langille wrote:
> I am encountering a situation similar to one reported by Andrew Heybey
> at http://docs.freebsd.org/cgi/mid.cgi?6E83197B-9DD5-4C7E-846D-AD176C25464D
>
> This morning I found this in my /var/log/messages:
>
> Aug 11 01:59:48 kraken kernel: MCA: Bank 4, Status 0x94614c62001c011b
> Aug 11 01:59:48 kraken kernel: MCA: Global Cap 0x0000000000000106,
> Status 0x0000000000000000
> Aug 11 01:59:48 kraken kernel: MCA: Vendor "AuthenticAMD", ID 0x100f42,
> APIC ID 0
> Aug 11 01:59:48 kraken kernel: MCA: CPU 0 COR GCACHE LG RD error
> Aug 11 01:59:48 kraken kernel: MCA: Address 0x5d0fe8c
>
>
> from /var/run/dmesg.boot
>
> Copyright (c) 1992-2010 The FreeBSD Project.
> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
> The Regents of the University of California. All rights reserved.
> FreeBSD is a registered trademark of The FreeBSD Foundation.
> FreeBSD 8.1-STABLE #0: Sun Jul 25 19:18:56 EDT 2010
> dan@kraken.example.org:/usr/obj/usr/src/sys/KRAKEN amd64
> Timecounter "i8254" frequency 1193182 Hz quality 0
> CPU: AMD Phenom(tm) II X4 945 Processor (3010.17-MHz K8-class CPU)
> Origin = "AuthenticAMD" Id = 0x100f42 Family = 10 Model = 4
> Stepping = 2
>
> Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
>
> Features2=0x802009<SSE3,MON,CX16,POPCNT>
> AMD
> Features=0xee500800<SYSCALL,NX,MMX+,FFXSR,Page1GB,RDTSCP,LM,3DNow!+,3DNow!>
> AMD
> Features2=0x37ff<LAHF,CMP,SVM,ExtAPIC,CR8,ABM,SSE4A,MAS,Prefetch,OSVW,IBS,SKINIT,WDT>
>
> TSC: P-state invariant
> real memory = 4294967296 (4096 MB)
> avail memory = 4100710400 (3910 MB)
> ACPI APIC Table: <111909 APIC1708>
> FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
> FreeBSD/SMP: 1 package(s) x 4 core(s)
> cpu0 (BSP): APIC ID: 0
> cpu1 (AP): APIC ID: 1
> cpu2 (AP): APIC ID: 2
> cpu3 (AP): APIC ID: 3
>
>
> Andrew: You posted about this on July 14. Anything new since then?
>
> John: Is it time for me to get a new CPU?

Hmm, this is what mcelog says:

HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 0 4 northbridge
ADDR 5d0fe8c
  Northbridge NB Array Error
       bit33 = err cpu1
       bit42 = L3 subcache in error bit 0
       bit43 = L3 subcache in error bit 1
       bit46 = corrected ecc error
  memory/cache error 'generic read mem transaction, generic transaction, level generic'
STATUS 94614c62001c011b MCGSTATUS 0
MCGCAP 106 APICID 0 SOCKETID 0
CPUID Vendor AMD Family 16 Model 4

It was a corrected ECC error.  If you get more than one then perhaps the
CPU is busted, but if you only get one, an isolated bit flip may not be
worth worrying about.

-- 
John Baldwin
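
For anyone who wants to check the mcelog reading by hand, here is a minimal Python sketch that pulls the same fields out of the Bank 4 status word from the log quoted above. The flag positions in ARCH_FLAGS and the RRRR/TT/LL layout are the standard x86 machine-check encoding. The entries in AMD_NB_BITS are copied straight from the mcelog output in this message and are not a complete list for this CPU family. All names (STATUS, decode_flags, decode_memory_hierarchy, and so on) are made up for illustration, and the format check is simplified.

```python
# Sketch: decode an x86 machine-check bank status word (MCi_STATUS).
STATUS = 0x94614C62001C011B  # "Bank 4, Status" value from the quoted log

# Architectural flag bits in the upper part of the status word.
ARCH_FLAGS = {
    63: "VAL",    # register holds a valid error
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # error was NOT corrected
    60: "EN",     # reporting enabled for this error
    59: "MISCV",  # MISC register is valid
    58: "ADDRV",  # ADDR register is valid
    57: "PCC",    # processor context corrupt
}

# Vendor-specific bits, taken only from the mcelog output in this thread.
AMD_NB_BITS = {
    33: "err cpu1",
    42: "L3 subcache in error bit 0",
    43: "L3 subcache in error bit 1",
    46: "corrected ecc error",
}

# Fields of a memory-hierarchy error code: 0000 0001 RRRR TTLL.
RRRR = {0: "ERR", 1: "RD", 2: "WR", 3: "DRD", 4: "DWR",
        5: "IRD", 6: "PREFETCH", 7: "EVICT", 8: "SNOOP"}
TT = {0: "I", 1: "D", 2: "G"}
LL = {0: "L0", 1: "L1", 2: "L2", 3: "LG"}


def bit(value, n):
    """Return True if bit n of value is set."""
    return (value >> n) & 1 == 1


def decode_flags(status):
    """List the architectural flags that are set, highest bit first."""
    return [name for n, name in ARCH_FLAGS.items() if bit(status, n)]


def decode_memory_hierarchy(code):
    """Decode the low 16 bits if they match 0000 0001 RRRR TTLL, else None."""
    if code & 0xFF00 != 0x0100:
        return None  # some other error-code format; not handled here
    return (RRRR.get((code >> 4) & 0xF, "?"),
            TT.get((code >> 2) & 0x3, "?"),
            LL[code & 0x3])


if __name__ == "__main__":
    print("flags:", decode_flags(STATUS))
    print("corrected:", not bit(STATUS, 61))
    print("error code:", decode_memory_hierarchy(STATUS & 0xFFFF))
    for n, text in AMD_NB_BITS.items():
        if bit(STATUS, n):
            print("bit%d = %s" % (n, text))
```

With UC clear and the low 16 bits decoding to a generic-level, generic-cache read, this lines up with the kernel's "COR GCACHE LG RD error" line and with mcelog's "corrected ecc error".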