From: Ed Schouten
To: Suleiman Souhlal
Cc: current@freebsd.org
Date: Tue, 26 Jun 2007 08:55:20 +0200
Subject: Re: [PATCH] Machine Check Architecture on amd64

* Suleiman Souhlal wrote:
> Hi,
>
> I have a simple patch for amd64 that uses the Machine Check
> Architecture/Exceptions on most recent x86 CPUs to detect memory errors:
>
> http://people.freebsd.org/~ssouhlal/testing/mce-20070621.diff
>
> It will report uncorrected and corrected errors (the latter, only if sysctl
> machdep.mce.log_corrected=1).
> You can ask the kernel to panic if it gets an uncorrected error by setting
> machdep.mce.panic_on_uc=1.
> All this can be disabled by setting the machdep.mce.enable tunable to 0. I'm
> still not sure if I want this enabled by default, as I don't have any Intel
> machines to test this on, but I have tested it on Opteron (both corrected
> and uncorrected errors).
>
> I would appreciate it if someone would try this, especially if you have
> Intel machines with bad RAM.
>
> Comments are welcome.

| /*
|  * Uncorrected MCEs will generate a #MC, while corrected
|  * don't, so we have to periodically poll for them.
|  */

What about adding an option to only print uncorrected MCEs? That's the
most interesting data and we can get that without using a kthread,
right?

Nice work! :-)

-- 
Ed Schouten
WWW: http://g-rave.nl/
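
For anyone following the thread, the corrected/uncorrected split discussed above shows up in each bank's MCi_STATUS register: bit 63 (VAL) says the bank holds an error, and bit 61 (UC) says the hardware could not correct it. Below is a minimal sketch in C of how such a value could be classified and filtered. The helper names mce_classify and mce_should_log are hypothetical and not taken from the patch, and the log_corrected parameter only mimics the described behaviour of machdep.mce.log_corrected; the real code would read the status via rdmsr in the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions in an IA32_MCi_STATUS register, per the x86 MCA definition. */
#define MC_STATUS_VAL	(1ULL << 63)	/* bank holds a valid error record */
#define MC_STATUS_UC	(1ULL << 61)	/* error was not corrected by hardware */

enum mce_kind { MCE_NONE, MCE_CORRECTED, MCE_UNCORRECTED };

/* Classify one bank's status value (hypothetical helper, not from the patch). */
static enum mce_kind
mce_classify(uint64_t status)
{
	/* Without VAL the other bits carry no meaning. */
	if ((status & MC_STATUS_VAL) == 0)
		return (MCE_NONE);
	return ((status & MC_STATUS_UC) != 0 ? MCE_UNCORRECTED : MCE_CORRECTED);
}

/*
 * Decide whether an error should be reported: uncorrected errors always,
 * corrected errors only when the (assumed) log_corrected knob is set.
 */
static bool
mce_should_log(uint64_t status, bool log_corrected)
{
	switch (mce_classify(status)) {
	case MCE_UNCORRECTED:
		return (true);
	case MCE_CORRECTED:
		return (log_corrected);
	default:
		return (false);
	}
}
```

With log_corrected set to false, only the values that would also have raised a #MC pass the filter, which is the "uncorrected only" mode asked about above; the periodic poll is needed solely for the MCE_CORRECTED case.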