From: "Yuri Lukin"
To: freebsd-hardware@freebsd.org
Date: Thu, 23 Mar 2006 11:46:35 -0500 (EST)
Subject: Re: FreeBSD shutting down unexpectedly
In-Reply-To: <200603180107.16268.soralx@cydem.org>
Message-Id: <1143132394.16617@swaggi.com>

soralx@cydem.org wrote ..
>
> However, if you measure the temperature with the case open and CPU idle,
> and cooler's performance is same or better than assumed, you'd better
> not rely on this processor.
> In fact, 55*C is somewhat too high in any
> case, considering that there exists additional heat dissipation path
> through mainboard.

I checked mbmon as well as the BIOS, and both reported a drop of at least 10
degrees with the case open and the CPU running idle.

> I'd check the thermal interface btw CPU and cooler first. Is the heatsink
> sitting level on the core? Is there a nice thin layer of clean thermal
> compound between them? Fan turning at good RPM? Then I'd check Vdd with
> a scope (or at least a DMM). Is it at the right level and clean? At this
> point I would think twice before replacing the CPU. Overheating it could
> have created some kind of permanent latchups (shorts from Vdd to Vss directly),
> which would result in higher power consumption, but this isn't likely, plus
> you'd definitely see some instability or errors in CPU operation. So
> I personally don't think that CPU damaged by overheating can consume
> more power, but be stable, and then suddenly die some day; correct
> me if I'm wrong.

When I ordered a replacement fan, I also ordered replacement heatsinks (this
is a dual-CPU motherboard). So I discarded the old heatsinks, installed new
fan/heatsink combos, and applied a drop of Arctic Silver to each CPU (after
cleaning off the old thermal grease with isopropyl alcohol).

> It is not very likely that your CPU was damaged by overheating too. It might
> not have been stable when overheated (no kidding!), but I believe the
> mainboard should power it down before it reaches temperature at which
> permanent damage results.

Agreed, I believe this is exactly what was happening with cpu1 when the fan
seized. Unfortunately for me, I did not have SMP compiled into the kernel, so
the system would just shut off.

I am still, however, a bit confused about what mbmon is outputting for me.
This is what I am currently seeing:

Temp.= 30.8, 28.6, 22.0; Rot.= 5818, 5113, 0
Vcore = 1.50, 1.50; Volt. = 3.35, 3.27, 7.93, 0.00, 0.00

I am assuming that 30.8 is the Tcpu of cpu1.
But which one is the Tcpu of cpu2? Here's the chipset mbmon is using to
probe the values:

su-2.05b# mbmon -D
Probe Request: none
>>> Testing Reg's at VIA686 HWM <<<
Probing VIA686A/B chip:
CR40:0x01, CR41:0xD0, CR42:0x9C, CR43:0xFF
CR44:0xFF, CR47:0xF0, CR49:0x7D, CR4B:0x40
CR3F:0xA2, CR14:0x7E, CR1F:0x7F, CR20:0x93
CR21:0x8E, CR22:0x79, CR23:0x79, CR24:0xCE
CR25:0x7F, CR26:0x7F, CR29:0x1D, CR2B:0xFF
Using VIA686 HWM directly!!
* VIA Chip VT82C686A/B found.

I read the doc for mbmon but still couldn't really understand it. Do I need
to recompile the kernel with SMP in order for mbmon to read the values from
the second CPU? I didn't think that would be necessary.

By the way, before anyone asks, I do plan to compile SMP in the near future
to utilize the second processor.

Thanks.

-Yuri
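
For anyone who wants to log these readings over time, here is a minimal
sketch in Python that splits the mbmon text quoted in this message into
labelled lists. It assumes the two-line "Label = v1, v2, ..." format shown
in the message; the function name parse_mbmon is made up for illustration.
It deliberately does not guess which channel belongs to which sensor, since
that depends on how the board is wired.

```python
import re


def parse_mbmon(text):
    """Split mbmon-style text into {label: [values]} (hypothetical helper).

    Assumes fields of the form "Label = v1, v2, ..." separated by ';'
    or newlines, as in the output quoted in this message.
    """
    readings = {}
    for field in re.split(r"[;\n]", text):
        if "=" not in field:
            continue  # skip blank or unrelated fragments
        label, values = field.split("=", 1)
        # "Temp." -> "Temp", "Volt. " -> "Volt"
        label = label.strip().rstrip(".")
        readings[label] = [float(v) for v in values.split(",") if v.strip()]
    return readings


# Sample text copied from the readings in this message.
sample = """Temp.= 30.8, 28.6, 22.0; Rot.= 5818, 5113, 0
Vcore = 1.50, 1.50; Volt. = 3.35, 3.27, 7.93, 0.00, 0.00"""

readings = parse_mbmon(sample)
for label, values in readings.items():
    print(label, values)
```

Logging the three Temp values while the machine is idle and again under
load is one way to see which channels actually move, but the script itself
makes no claim about which one is a CPU sensor.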