Date: Mon, 18 May 2009 11:41:26 -0400
From: John Baldwin <jhb@freebsd.org>
To: Alan Cox <alc@cs.rice.edu>
Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org
Subject: Re: svn commit: r192050 - in head/sys: amd64/amd64 amd64/include conf i386/i386 i386/include
Message-ID: <200905181141.27355.jhb@freebsd.org>
In-Reply-To: <4A0F085D.6000202@cs.rice.edu>
References: <200905131753.n4DHr4YL063065@svn.freebsd.org> <4A0F085D.6000202@cs.rice.edu>

On Saturday 16 May 2009 2:39:25 pm Alan Cox wrote:
> John Baldwin wrote:
> > Author: jhb
> > Date: Wed May 13 17:53:04 2009
> > New Revision: 192050
> > URL: http://svn.freebsd.org/changeset/base/192050
> >
> > Log:
> >   Implement simple machine check support for amd64 and i386.
> >   - For CPUs that only support MCE (the machine check exception) but not MCA
> >     (i.e. Pentium), all this does is print out the value of the machine check
> >     registers and then panic when a machine check exception occurs.
> >   - For CPUs that support MCA (the machine check architecture), the support is
> >     a bit more involved.
> >     - First, there is limited support for decoding the CPU-independent MCA
> >       error codes in the kernel, and the kernel uses this to output a short
> >       description of any machine check events that occur.
> >     - When a machine check exception occurs, all of the MCx banks on the
> >       current CPU are scanned and any events are reported to the console
> >       before panic'ing.
> >     - To catch events for correctable errors, a periodic timer kicks off a
> >       task which scans the MCx banks on all CPUs. The frequency of these
> >       checks is controlled via the "hw.mca.interval" sysctl.
> >     - Userland can request an immediate scan of the MCx banks by writing
> >       a non-zero value to "hw.mca.force_scan".
> >     - If any correctable events are encountered, the appropriate details
> >       are stored in a 'struct mca_record' (defined in <machine/mca.h>).
> >       The "hw.mca.count" is a count of such records and each record may
> >       be queried via the "hw.mca.records" tree by specifying the record
> >       index (0 .. count - 1) as the next name in the MIB similar to using
> >       PIDs with the kern.proc.* sysctls. The idea is to export machine
> >       check events to userland for more detailed processing.
> >   - The periodic timer and hw.mca sysctls are only present if the CPU
> >     supports MCA.
> >
> >   Discussed with: emaste (briefly)
> >   MFC after: 1 month
> >
> > Added:
> >   head/sys/amd64/amd64/mca.c (contents, props changed)
> >   head/sys/amd64/include/mca.h (contents, props changed)
> >   head/sys/i386/i386/mca.c (contents, props changed)
> >   head/sys/i386/include/mca.h (contents, props changed)
> > Modified:
> >   head/sys/amd64/amd64/machdep.c
> >   head/sys/amd64/amd64/mp_machdep.c
> >   head/sys/amd64/amd64/trap.c
> >   head/sys/amd64/include/specialreg.h
> >   head/sys/conf/files.amd64
> >   head/sys/conf/files.i386
> >   head/sys/i386/i386/machdep.c
> >   head/sys/i386/i386/mp_machdep.c
> >   head/sys/i386/i386/trap.c
> >   head/sys/i386/include/specialreg.h
> >
>
> After this change my Phenom II locks up hard within minutes of booting.
> There are no messages, and I am unable to break into the debugger from a
> serial console.
>
> The same exact kernel is running fine on a Core 2 Quad.

I will probably add a tunable to enable machine checks and disable them by
default then.

-- 
John Baldwin
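
For readers following the quoted log: the split between events that lead to a panic and correctable events that the periodic scan records comes down to a few flag bits in each bank's MCi_STATUS register. The sketch below is not taken from the committed mca.c. It is a minimal illustration that assumes the architectural bit layout documented in the Intel SDM (VAL in bit 63, OVER in bit 62, UC in bit 61, ADDRV in bit 58, PCC in bit 57, and the CPU-independent MCA error code in bits 15:0). The EX_MC_STATUS_* macros and mc_status_* helpers are hypothetical names chosen for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Architectural MCi_STATUS flag bits (layout assumed from the Intel SDM).
 * Macro names are illustrative only, not the kernel's.
 */
#define EX_MC_STATUS_VAL   (UINT64_C(1) << 63) /* register holds a valid event */
#define EX_MC_STATUS_OVER  (UINT64_C(1) << 62) /* an earlier event was overwritten */
#define EX_MC_STATUS_UC    (UINT64_C(1) << 61) /* error was not corrected */
#define EX_MC_STATUS_ADDRV (UINT64_C(1) << 58) /* MCi_ADDR contents are valid */
#define EX_MC_STATUS_PCC   (UINT64_C(1) << 57) /* processor context corrupt */

/* True if the bank has logged an event at all. */
static bool
mc_status_valid(uint64_t status)
{
	return ((status & EX_MC_STATUS_VAL) != 0);
}

/* True if the bank holds a valid event that the hardware corrected. */
static bool
mc_status_correctable(uint64_t status)
{
	return (mc_status_valid(status) && (status & EX_MC_STATUS_UC) == 0);
}

/* Extract the CPU-independent MCA error code (low 16 bits). */
static uint16_t
mc_status_error_code(uint64_t status)
{
	return ((uint16_t)(status & 0xffff));
}
```

A periodic scan in this style would skip banks where mc_status_valid() is false and record only those where mc_status_correctable() is true. How the committed code actually structures that check is not shown in this message.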