From owner-freebsd-stable@FreeBSD.ORG Mon May 16 16:23:24 2011
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
    by hub.freebsd.org (Postfix) with ESMTP id 580761065670;
    Mon, 16 May 2011 16:23:24 +0000 (UTC)
    (envelope-from jhay@meraka.csir.co.za)
Received: from zibbi.meraka.csir.co.za (zibbi.meraka.csir.co.za [IPv6:2001:4200:7000:2::1])
    by mx1.freebsd.org (Postfix) with ESMTP id AF56F8FC1F;
    Mon, 16 May 2011 16:23:22 +0000 (UTC)
Received: by zibbi.meraka.csir.co.za (Postfix, from userid 3973)
    id B073D3982C; Mon, 16 May 2011 18:23:19 +0200 (SAST)
Date: Mon, 16 May 2011 18:23:19 +0200
From: John Hay
To: alc@freebsd.org
Message-ID: <20110516162319.GA58581@zibbi.meraka.csir.co.za>
References: <20110510125220.GA88338@zibbi.meraka.csir.co.za>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.4.2.3i
Cc: freebsd-stable@freebsd.org
Subject: Re: MCA: CPU 0 UNCOR PCC DTLB L1 error
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code
X-List-Received-Date: Mon, 16 May 2011 16:23:24 -0000

On Wed, May 11, 2011 at 05:26:50PM -0500, Alan Cox wrote:
> On Tue, May 10, 2011 at 7:52 AM, John Hay wrote:
>
> > Hi,
> >
> > I have seen this panic a few times on a Gigabyte E350N-USB3 running 8-STABLE.
> > I have only seen it while in X, but then the machine is always in X. At first,
> > I just got these hangs, so bought a PCI-express RS232 card and could see these
> > at last. For some reason it does not go past this, so I have not been able to
> > get a dump yet.
> >
> > Have anybody an idea of why this is or how to debug it further? I searched
> > the archives and found something similar about a year ago, but it looks
> > like it was solved with a fix that got committed.
> >
> > http://www.freebsd.org/cgi/query-pr.cgi?pr=140338
> >
> > I have now disabled mca in loader.conf with 'hw.mca.enabled="0"' and I have
> > not seen that panic again. I do occasionally see a panic in devfs_open(),
> > but I guess that should be handled in another thread.
> >
> > The kernel is basically a GENERIC kernel with puc uncommented and the
> > following in loader.conf
> >
> > vm.kmem_size="12G"
> > hw.mca.enabled="0"
> > zfs_load="YES"
> > ahci_load="YES"
> > xhci_load="YES"
> > amdtemp_load="YES"
> > ng_ubt_load="YES"
> > uplcom_load="YES"
> >
> > Here is the panic message and after that dmesg.
> >
> > John
> > --
> > John Hay -- jhay@meraka.csir.co.za / jhay@FreeBSD.org
> >
> > ####################################################
> > MCA: Bank 0, Status 0xb600000000010015
> > MCA: Global Cap 0x0000000000000106, Status 0x0000000000000004
> > MCA: Vendor "AuthenticAMD", ID 0x500f10, APIC ID 0
> > MCA: CPU 0 UNCOR PCC DTLB L1 error
> > MCA: Address 0x8016c4000
> >
> >
> > Fatal trap 28: machine check trap while in user mode
> > cpuid = 0; apic id = 00
> > instruction pointer     = 0x43:0x80156af85
> > stack pointer           = 0x3b:0x7fffffffcb18
> > frame pointer           = 0x3b:0x80fe87800
> > code segment            = base 0x0, limit 0xfffff, type 0x1b
> >                         = DPL 3, pres 1, long 1, def32 0, gran 1
> > processor eflags        = interrupt enabled, IOPL = 0
> > current process         = 2484 (initial thread)
> > trap number             = 28
> > panic: machine check trap
> > cpuid = 0
> > KDB: stack backtrace:
> > #0 0xffffffff80608d5e at kdb_backtrace+0x5e
> > #1 0xffffffff805d6707 at panic+0x187
> > #2 0xffffffff808bf4c0 at trap_fatal+0x290
> > #3 0xffffffff808bfaa9 at trap+0x109
> > #4 0xffffffff808a7d94 at calltrap+0x8
> > ####################################################
> >
> >
> Please try the following patch:
>
> Index: x86/x86/mca.c
> ===================================================================
> --- x86/x86/mca.c (revision 219060)
> +++ x86/x86/mca.c (working copy)
> @@ -665,7 +665,8 @@
mca_setup(uint64_t mcg_cap)
>           * for Erratum 383.
>           */
>          if (cpu_vendor_id == CPU_VENDOR_AMD &&
> -            CPUID_TO_FAMILY(cpu_id) == 0x10 && amd10h_L1TP)
> +            (CPUID_TO_FAMILY(cpu_id) == 0x10 ||
> +            CPUID_TO_FAMILY(cpu_id) == 0x14) && amd10h_L1TP)
>                  workaround_erratum383 = 1;
>
>          mtx_init(&mca_lock, "mca", NULL, MTX_SPIN);
> Index: i386/i386/pmap.c
> ===================================================================
> --- i386/i386/pmap.c (revision 219060)
> +++ i386/i386/pmap.c (working copy)
> @@ -758,7 +758,8 @@ pmap_init(void)
>           * machine monitor.
>           */
>          if (vm_guest == VM_GUEST_VM && cpu_vendor_id == CPU_VENDOR_AMD &&
> -            CPUID_TO_FAMILY(cpu_id) == 0x10)
> +            (CPUID_TO_FAMILY(cpu_id) == 0x10 ||
> +            CPUID_TO_FAMILY(cpu_id) == 0x14))
>                  workaround_erratum383 = 1;
>
>          /*
> Index: amd64/amd64/pmap.c
> ===================================================================
> --- amd64/amd64/pmap.c (revision 219060)
> +++ amd64/amd64/pmap.c (working copy)
> @@ -727,7 +727,8 @@ pmap_init(void)
>           * machine monitor.
>           */
>          if (vm_guest == VM_GUEST_VM && cpu_vendor_id == CPU_VENDOR_AMD &&
> -            CPUID_TO_FAMILY(cpu_id) == 0x10)
> +            (CPUID_TO_FAMILY(cpu_id) == 0x10 ||
> +            CPUID_TO_FAMILY(cpu_id) == 0x14))
>                  workaround_erratum383 = 1;
>
>          /*

I have applied the patch, but got another panic today. I still do not get
a prompt or a dump. :-( It just gets stuck right after frame #4 of the
backtrace.

If there is anything more that I can try, just ask.
#####################################################################
MCA: Bank 0, Status 0xb600000000010015
MCA: Global Cap 0x0000000000000106, Status 0x0000000000000004
MCA: Vendor "AuthenticAMD", ID 0x500f10, APIC ID 0
MCA: CPU 0 UNCOR PCC DTLB L1 error
MCA: Address 0x808ace000

Fatal trap 28: machine check trap while in user mode
cpuid = 1; apic id = 01
instruction pointer     = 0x43:0x80af206d5
stack pointer           = 0x3b:0x7fffffffb8e8
frame pointer           = 0x3b:0x809b92450
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 3, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, IOPL = 0
current process         = 22228 (initial thread)
trap number             = 28
panic: machine check trap
cpuid = 1
KDB: stack backtrace:
#0 0xffffffff80608f6e at kdb_backtrace+0x5e
#1 0xffffffff805d6917 at panic+0x187
#2 0xffffffff808bf7c0 at trap_fatal+0x290
#3 0xffffffff808bfda9 at trap+0x109
#4 0xffffffff808a8084 at calltrap+0x8
#####################################################################

John
-- 
John Hay -- jhay@meraka.csir.co.za / jhay@FreeBSD.org