Date: Tue, 18 Jan 2011 17:02:30 -0500
From: "Michael Jung" <mikej@paymentallianceintl.com>
To: "John Baldwin" <jhb@freebsd.org>, <freebsd-current@freebsd.org>
Subject: Re: unknown mtx_assert at /usr/src/sys/x86/x86/io_apic.c:161
Message-ID: <C95B7826.2B40F%mikej@paymentallianceintl.com>
In-Reply-To: <C95668AF.2769D%mikej@paymentallianceintl.com>

On 1/14/11 8:55 PM, "Michael Jung" <mikej@paymentallianceintl.com> wrote:

> John:
>
> Thanks, I actually didn't see the MCA errors on the screen as the system
> has reloaded but noted them in the ddb.txt file last night.
>
> The Motherboard, CPU, Memory and PS were replaced today. I'll post back
> whether or not this has corrected the problem, but I suspect you are on
> target in that the hardware was defective. This machine was remote and I
> found the fan in the power supply not working, so I'm suspecting that the
> CPU or other logic was damaged.
>
> Thanks for your reply.
>
> --mikej
>
>
> On 1/14/11 4:13 PM, "John Baldwin" <jhb@freebsd.org> wrote:
>
> > On Thursday, January 13, 2011 11:26:46 am Michael Jung wrote:
> > > Links to crash info below.
> > > http://216.26.153.6/msgbuf.txt
> >
> > This might be a hardware problem. The panic you got is a "should never
> > happen" panic. Note that in the code line sourced, the second argument
> > to mtx_assert() is MA_OWNED. The panic is saying that it is some invalid
> > value (i.e. something other than MA_OWNED). Given that is a constant,
> > that's not very likely at all barring some hardware glitch.
> >
> > You do have a somewhat scary looking machine check logged before your
> > panic:
> >
> > MCA: Bank 1, Status 0xd000000000000171
> > MCA: Global Cap 0x0000000000000105, Status 0x0000000000000000
> > MCA: Vendor "AuthenticAMD", ID 0x20fc2, APIC ID 0
> > MCA: CPU 0 COR OVER ICACHE L1 EVICT error
> >
> > It is a correctable error, but given the nature of the panic I'd suspect
> > a hardware problem.
> >
> > mcelog doesn't provide many more details:
> >
> > HARDWARE ERROR. This is *NOT* a software problem!
> > Please contact your hardware vendor
> > CPU 0 1 instruction cache
> > bit62 = error overflow (multiple errors)
> > memory/cache error 'evict mem transaction, instruction transaction,
> > level 1'
> > STATUS d000000000000171 MCGSTATUS 0
> > MCGCAP 105 APICID 0 SOCKETID 0
> > CPUID Vendor AMD Family 15 Model 44
> >
> > --
> > John Baldwin
> >
>
> The box has run fine since hardware was replaced. Thanks for your help.
>
> ---mikej

CONFIDENTIALITY NOTE: This message is intended only for the use of the
individual or entity to whom it is addressed and may contain information
that is privileged, confidential, and exempt from disclosure under
applicable law. If the reader of this message is not the intended
recipient, you are hereby notified that any dissemination, distribution or
copying of this communication is strictly prohibited. If you have received
this transmission in error, please notify us by telephone at (502) 212-4001
or notify us at PAI, Dept. 99, 11857 Commonwealth Drive, Louisville, KY
40299. Thank you.
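
For anyone reading this thread later: the "COR OVER ICACHE L1 EVICT" line quoted above can be reproduced by hand from the status word 0xd000000000000171. The sketch below is not the kernel's or mcelog's code. It assumes the architectural x86 MCi_STATUS layout (VAL/OVER/UC flags in bits 63..61) and the cache-hierarchy compound error code format 0000 0001 RRRR TTLL in the low 16 bits. The function name decode_mca_status and the lookup tables are made up for illustration, and other error-code families are deliberately left undecoded.

```python
# Illustrative decoder for an x86 MCi_STATUS value (hypothetical helper,
# not taken from FreeBSD or mcelog).

# Architectural flag bits in the upper part of MCi_STATUS.
MC_STATUS_VAL = 1 << 63   # register holds a valid error record
MC_STATUS_OVER = 1 << 62  # another error arrived before this one was cleared
MC_STATUS_UC = 1 << 61    # set when the error was NOT corrected by hardware

# Fields of the cache-hierarchy error code: 0000 0001 RRRR TTLL.
REQUEST = {0: "ERR", 1: "RD", 2: "WR", 3: "DRD", 4: "DWR",
           5: "IRD", 6: "PREFETCH", 7: "EVICT", 8: "SNOOP"}
CACHE_TYPE = {0: "ICACHE", 1: "DCACHE", 2: "GCACHE"}
LEVEL = {0: "L0", 1: "L1", 2: "L2", 3: "LG"}


def decode_mca_status(status: int) -> str:
    """Return a short summary of a machine-check status word."""
    if not (status & MC_STATUS_VAL):
        return "no valid error"

    words = ["UNCOR" if (status & MC_STATUS_UC) else "COR"]
    if status & MC_STATUS_OVER:
        words.append("OVER")

    code = status & 0xFFFF  # low 16 bits carry the MCA error code
    if (code & 0xFF00) == 0x0100:
        # Cache-hierarchy error: split out RRRR, TT and LL.
        request = (code >> 4) & 0xF
        cache_type = (code >> 2) & 0x3
        level = code & 0x3
        words.append(CACHE_TYPE.get(cache_type, "UNKNOWN"))
        words.append(LEVEL[level])
        words.append(REQUEST.get(request, "UNKNOWN"))
    else:
        # Other error-code families are out of scope for this sketch.
        words.append(f"code 0x{code:04x}")
    return " ".join(words)


print(decode_mca_status(0xD000000000000171))
```

Bit 61 (UC) is clear in this value, which is why the error is reported as corrected, and bit 62 is the overflow flag that mcelog describes as "multiple errors".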
