Date:      Thu, 23 Mar 2006 11:46:35 -0500 (EST)
From:      "Yuri Lukin" <freebsd@swaggi.com>
To:        freebsd-hardware@freebsd.org
Subject:   Re: FreeBSD shutting down unexpectedly
Message-ID:  <1143132394.16617@swaggi.com>
In-Reply-To: <200603180107.16268.soralx@cydem.org>

soralx@cydem.org wrote ..
> 
> However, if you measure the temperature with the case open and CPU idle,
> and cooler's performance is same or better than assumed, you'd better
> not rely on this processor. In fact, 55*C is somewhat too high in any
> case, considering that there exists additional heat dissipation path
> through mainboard.

I checked mbmon as well as the BIOS, and both reported a drop of at least 10 degrees
with the case open and the CPU running idle.
 
> I'd check the thermal interface btw CPU and cooler first. Is the heatsink
> sitting level on the core? Is there a nice thin layer of clean thermal
> compound between them? Fan turning at good RPM? Then I'd check Vdd with
> a scope (or at least a DMM). Is it at the right level and clean? At this
> point I would think twice before replacing the CPU. Overheating it could
> have created some kind of permanent latchups (shorts from Vdd to Vss directly),
> which would result in higher power consumption, but this isn't likely, plus
> you'd definitely see some instability or errors in CPU operation. So
> I personally don't think that CPU damaged by overheating can consume
> more power, but be stable, and then suddenly die some day; correct
> me if I'm wrong. 

When I ordered a replacement fan, I also ordered replacement heatsinks
(this is a dual-CPU motherboard). So I discarded the old heatsinks, installed
new fan/heatsink combos, and applied a drop of Arctic Silver to each CPU
(after cleaning off the old thermal grease with isopropyl alcohol).
 
> It is not very likely that your CPU was damaged by overheating too. It might
> not have been stable when overheated (no kidding!), but I believe the
> mainboard should power it down before it reaches temperature at which
> permanent damage results.

Agreed, I believe this is exactly what was happening with cpu1 when the fan seized. 
Unfortunately for me, I did not have SMP compiled into the kernel so the system
would just shut off. 

I am still, however, a bit confused about what mbmon is outputting for me. This
is what I am currently seeing:

Temp.= 30.8, 28.6, 22.0; Rot.= 5818, 5113,    0
Vcore = 1.50, 1.50; Volt. = 3.35, 3.27,  7.93,   0.00,  0.00
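
For anyone who wants to log or compare these readings, the two summary lines
above can be split into labelled fields with a short script. This is only a
minimal sketch, assuming the output layout shown in this message; the function
name parse_mbmon is hypothetical, and the script does not say which physical
sensor each value belongs to.

```python
import re


def parse_mbmon(text):
    """Split mbmon summary output into {label: [values]} (hypothetical helper)."""
    fields = {}
    # Fields are separated by ';' or a newline and look like "Label = n, n, n".
    for chunk in re.split(r"[;\n]", text):
        if "=" not in chunk:
            continue
        label, values = chunk.split("=", 1)
        # "Temp." and "Volt. " become "Temp" and "Volt".
        key = label.strip().rstrip(".")
        fields[key] = [float(v) for v in values.split(",") if v.strip()]
    return fields


# Sample copied from the output quoted in this message.
sample = """Temp.= 30.8, 28.6, 22.0; Rot.= 5818, 5113,    0
Vcore = 1.50, 1.50; Volt. = 3.35, 3.27,  7.93,   0.00,  0.00"""

readings = parse_mbmon(sample)
for name, values in readings.items():
    print(name, values)
```

The values keep the order in which mbmon prints them, so the first, second, and
third temperature stay distinguishable even though their sensor mapping is
still the open question here.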

I am assuming that 30.8 is the Tcpu of cpu1. But which one is the Tcpu of cpu2?
Here's the chipset mbmon is using to probe the values:

su-2.05b# mbmon -D
Probe Request: none
>>> Testing Reg's at VIA686 HWM <<<
Probing VIA686A/B chip:
  CR40:0x01,  CR41:0xD0,  CR42:0x9C,  CR43:0xFF
  CR44:0xFF,  CR47:0xF0,  CR49:0x7D,  CR4B:0x40
  CR3F:0xA2,  CR14:0x7E,  CR1F:0x7F,  CR20:0x93
  CR21:0x8E,  CR22:0x79,  CR23:0x79,  CR24:0xCE
  CR25:0x7F,  CR26:0x7F,  CR29:0x1D,  CR2B:0xFF
Using VIA686 HWM directly!!
* VIA Chip VT82C686A/B found.

I read the mbmon documentation but still couldn't really understand it. Do I need
to recompile the kernel with SMP in order for mbmon to read the values from
the second CPU? I didn't think that would be necessary. By the way, before anyone
asks, I do plan to compile SMP in the near future to make use of the second processor.

Thanks.
-Yuri

