From: Paul Civati
Reply-To: paul@xciv.org
Organization: XCIV, London UK
To: freebsd-stable@freebsd.org
Subject: Re: SMP and NMI errors (4.10)
Date: Mon, 12 Jul 2004 00:19:20 +0100
Message-ID: <57409.1089587960@xciv.org>
In-reply-to: Your message of "Sun, 11 Jul 2004 15:34:52 PDT." <20040711153233.D76940@carver.gumbysoft.com>

Doug White wrote:

> NMIs sometimes get triggered for ECC corrections, which is why a
> memtester wouldn't see it.

Doh, yeah.

> I think there is a kernel option somewhere that hooks
> NMI and attempts to get information from the platform as to what DIMM
> triggered it.

Can't see anything for that, and there is only one DIMM, I have
a second one to go in that I can swap to test.

> Otherwise you might check the Event Log in the BIOS for ECC events.

Alas no BIOS event log for this mobo.

> > If I boot a uniprocessor kernel this problem doesn't occur.
> It might be temperature related then :)

I was hoping no-one would say that :)

This is a 1U rack mount and currently seems to run at ~40 deg. C
on CPU1, ~30 deg. C on CPU2 and ~33 deg. C system temp.

First CPU is a little hot perhaps, but I don't think it's too high
to be a problem?

-Paul-