Date: Fri, 9 Jul 2010 16:03:31 -0400
From: John Baldwin <jhb@freebsd.org>
To: freebsd-stable@freebsd.org
Cc: Markus Gebert <markus.gebert@hostpoint.ch>
Subject: Re: 8.1-RC2 - PCI fatal error or MCE triggered by USB/ehci on Sun X4100M2?
Message-ID: <201007091603.31843.jhb@freebsd.org>
In-Reply-To: <6B57591F-9FA2-45EB-825F-1DB025C0635D@hostpoint.ch>
References: <6B57591F-9FA2-45EB-825F-1DB025C0635D@hostpoint.ch>

On Friday, July 09, 2010 11:26:00 am Markus Gebert wrote:
> --
> MCA: Bank 4, Status 0xb400004000030c2b
> MCA: Global Cap 0x0000000000000105, Status 0x0000000000000007
> MCA: Vendor "AuthenticAMD", ID 0x40f13, APIC ID 2
> MCA: CPU 2 UNCOR BUSLG Observer WR I/O
> MCA: Address 0xfd00000000

Using my local port of mcelog, this is what I get for this check:

CPU 2 4 northbridge
ADDR fd00000000
  Northbridge Master abort
  link number = 4
  bit61 = error uncorrected
  bus error 'local node observed, request didn't time out
      generic write mem transaction
      i/o access, level generic'
STATUS b400004000030c2b MCGSTATUS 7
MCGCAP 105 APICID 2 SOCKETID 0
CPUID Vendor AMD Family 15 Model 65

I don't know what to tell you offhand. Did you buy this hardware from Sun
directly? If so, I would try bugging them about this, especially given the
error that the BIOS is logging. It does sound like a hardware issue, but in
the chipset, not in the RAM, so you might need to swap out the main board
rather than the RAM.

I'm curious whether disabling USB legacy support in the BIOS causes it to
still die even with ehci not loaded. If so, then the SMI# for the ehci
controller must somehow prevent the issue, perhaps by triggering frequently
enough to slow the rate of I/O requests down?

-- 
John Baldwin
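
For readers who want to check the decode by hand, here is a minimal Python sketch that unpacks the bank status word from the report above. It is not mcelog or the FreeBSD MCA code. It assumes the architectural x86 MCi_STATUS layout (VAL in bit 63, OVER in 62, UC in 61, EN in 60, MISCV in 59, ADDRV in 58, PCC in 57) and the bus/interconnect compound error code format `0000 1PPT RRRR IILL` in the low 16 bits. The function name `decode_mci_status` and the table names are made up for illustration, and the AMD-specific fields (extended error code, link number) are deliberately left out.

```python
# Hypothetical helper, not part of mcelog or FreeBSD.
# Flag bits of the architectural MCi_STATUS register.
FLAGS = {63: "VAL", 62: "OVER", 61: "UC", 60: "EN",
         59: "MISCV", 58: "ADDRV", 57: "PCC"}

# Field encodings of the bus/interconnect compound error code.
PARTICIPATION = ["Source", "Responder", "Observer", "Generic"]   # PP
REQUEST = ["ERR", "RD", "WR", "DRD", "DWR", "IRD",
           "PREFETCH", "EVICT", "SNOOP"]                         # RRRR
MEMIO = ["Memory", "Reserved", "I/O", "Generic"]                 # II
LEVEL = ["L0", "L1", "L2", "LG"]                                 # LL


def decode_mci_status(status):
    """Split a 64-bit MCi_STATUS value into flags and error-code fields."""
    flags = [name for bit, name in sorted(FLAGS.items(), reverse=True)
             if (status >> bit) & 1]
    code = status & 0xFFFF
    result = {"flags": flags, "error_code": code}
    # Bus/interconnect errors have the top five code bits equal to 00001.
    if code & 0xF800 == 0x0800:
        rrrr = (code >> 4) & 0xF
        result.update(
            kind="BUS",
            participation=PARTICIPATION[(code >> 9) & 0x3],
            timeout=bool((code >> 8) & 0x1),
            request=REQUEST[rrrr] if rrrr < len(REQUEST) else "Reserved",
            memio=MEMIO[(code >> 2) & 0x3],
            level=LEVEL[code & 0x3],
        )
    return result


if __name__ == "__main__":
    decoded = decode_mci_status(0xB400004000030C2B)
    for key, value in decoded.items():
        print(key, "=", value)
```

Under those assumptions the status value decodes to an uncorrected (UC), address-valid (ADDRV) bus error with participation Observer, request WR, type I/O, level LG and no timeout, which agrees with the "UNCOR BUSLG Observer WR I/O" line in the kernel output.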
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?201007091603.31843.jhb>