Date:      Wed, 28 Mar 2001 00:22:48 +0200 (CEST)
From:      "Hartmann, O." <ohartman@klima.physik.uni-mainz.de>
To:        "Dan H." <danh@nofx.eagle.ca>
Cc:        <freebsd-hardware@FreeBSD.ORG>, <freebsd-questions@FreeBSD.ORG>
Subject:   Re: overheated PIII/KATMAI in SMP system
Message-ID:  <Pine.BSF.4.33.0103280018070.591-100000@klima.physik.uni-mainz.de>
In-Reply-To: <Pine.BSF.4.21.0103271707250.4231-100000@nofx.eagle.ca>

On Tue, 27 Mar 2001, Dan H. wrote:

Hello.

The BIOS is the newest one I could get (1012B); there is no newer version.
I know that the temperature sensors on mainboards are very "sloppy",
but for me they are simply an indicator of why our server constantly gets
SIG 11 errors when compiling a world (FreeBSD world). Assuming that the
error of each sensor is linear, the readings show that one CPU produces much
more heat in SMP mode than the other. This is the current output, with a
UP kernel, while compiling world:
healthd:

************************
* Hardware Information *
************************
Asus: AS97127F
************************

Temp.= 21.0, 32.5, 34.0; Rot.=    0, 4963, 4753
Vcore = 2.06, 2.05; Volt. = 3.20, 4.95, 11.98, -11.66, -5.09
Temp.= 21.0, 32.5, 34.0; Rot.=    0, 5192, 4963
Vcore = 2.08, 2.05; Volt. = 3.22, 4.95, 12.04, -11.66, -5.09


heat:

System Temperature       69F (21.0C)
CPU1 Temperature         91F (33.0C)
CPU2 Temperature         94F (34.5C)
FAN2                    310 RPMs (functional)
FAN3                    310 RPMs (functional)

This problem first occurred several days ago ... it came suddenly ...
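
To compare the two kernels over a longer run, the difference between the two CPU sensors could be logged instead of being read off by eye. This is only a minimal sketch: it assumes that healthd keeps printing "Temp.=" lines in the form shown above, and that the second and third values belong to the two CPUs (which column is which CPU is my assumption). The names parse_temps and cpu_spread are made up for this example.

```python
import re

# Matches lines such as: "Temp.= 21.0, 52.0, 35.5; Rot.=    0, 5113, 4821"
# Assumption: three comma-separated readings (board, CPU sensor A, CPU sensor B).
TEMP_LINE = re.compile(r"Temp\.=\s*([\d.]+),\s*([\d.]+),\s*([\d.]+);")


def parse_temps(line):
    """Return (board, cpu_a, cpu_b) in degrees Celsius, or None if no match."""
    m = TEMP_LINE.search(line)
    if m is None:
        return None
    return tuple(float(x) for x in m.groups())


def cpu_spread(lines):
    """Return cpu_a - cpu_b for every Temp. line found in the input."""
    spreads = []
    for line in lines:
        temps = parse_temps(line)
        if temps is not None:
            _, cpu_a, cpu_b = temps
            spreads.append(cpu_a - cpu_b)
    return spreads


if __name__ == "__main__":
    # Sample lines copied from the healthd output in this thread.
    smp = ["Temp.= 21.0, 52.0, 35.5; Rot.=    0, 5113, 4821"]
    up = ["Temp.= 21.0, 32.5, 34.0; Rot.=    0, 4963, 4753"]
    print("SMP spread:", cpu_spread(smp))
    print("UP spread: ", cpu_spread(up))
```

A spread that stays large under the SMP kernel and collapses under the UP kernel would match what I see here, independent of the absolute error of the sensors.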



:>Greetings,
:>
:>Have you ever flashed the BIOS on the Asus motherboard?
:>
:>I know there was a bug in some Asus motherboards that gave incorrect (or at
:>least very inaccurate) CPU temperatures due to sloppy
:>code in the BIOS and bad placement of the temperature sensor.
:>
:>So, are you using the BIOS temperature readings? If so, check the Asus
:>site for BIOS updates, that may fix the above.
:>
:>I doubt it's the CPU, especially if it's the motherboard that you are
:>getting the temperature readings from.
:>
:>Good luck!
:>
:>
:>--Dan
:>
:>
:>
:>
:>
:>On Wed, 28 Mar 2001, Hartmann, O. wrote:
:>
:>>
:>> Dear Sirs.
:>>
:>> I need a little help and some technical hints.
:>>
:>> One of our SMP servers, a FreeBSD 4.-RC box, is not willing
:>> to compile a world. It faults with SIG 11.
:>>
:>> The problem is caused by an overheated Intel Pentium III/600MHz
:>> with KATMAI core.
:>>
:>> The mainboard is an ASUS P2B-D. The rear CPU slot is CPU 0; the inner
:>> slot is for CPU 1. Today I swapped both CPUs and equipped both with
:>> expensive fans and coolers. The case has very, very good air
:>> circulation; the rear is ventilated by two additional 80mm coolers.
:>>
:>> Scenario: CPU 1, in the latest configuration the inner one, has a
:>> temperature of about 32-38 degrees Celsius. The outer one, CPU 0,
:>> has 50 degrees and up!! Yesterday I took measurements with the CPUs
:>> swapped and found that CPU 1, the inner one, had 50 degrees and
:>> the outer one, CPU 0, had only 30 to 40 degrees.
:>>
:>> I switched the kernel from SMP to UP and ran the system while compiling
:>> a world. Neither CPU went above 35 degrees Celsius!
:>>
:>> Then I compiled an SMP kernel again and tried to start make world.
:>> CPU 0, the outer one, now stays constantly at about 50 degrees Celsius.
:>>
:>> Well, because I replaced the fans and coolers with better ones,
:>> one fan is very close to the neighbouring SECC-2 case of CPU 1 (the inner one).
:>> This obstructed fan belongs to CPU 0, the overheated one. But yesterday
:>> exactly this CPU sat in the place of the "cooler" CPU, the inner one,
:>> so I do not think that this is really a heatsink problem.
:>>
:>> I think the problem lies either with the mainboard (maybe
:>> some kind of weakness in voltage regulation? But why only in SMP
:>> mode of the kernel and not in UP mode?) or with the CPU itself.
:>>
:>> I switched to the UP kernel again to see whether the temperature decreases
:>> or not.
:>>
:>> This is the current output of "healthd -d -I" and "heat", reflecting the
:>> configuration at this moment:
:>>
:>>
:>> HEAT:
:>> System Temperature       69F (21.0C)
:>> CPU1 Temperature        125F (52.0C)
:>> CPU2 Temperature         95F (35.5C)
:>> FAN2                    314 RPMs (functional)
:>> FAN3                    312 RPMs (functional)
:>>
:>>
:>> HEALTHD:
:>>
:>> ************************
:>> * Hardware Information *
:>> ************************
:>> Asus: AS97127F
:>> ************************
:>>
:>> Temp.= 21.0, 52.0, 35.5; Rot.=    0, 5113, 4821
:>> Vcore = 2.08, 2.05; Volt. = 3.22, 4.92, 12.04, -11.77, -5.11
:>> Temp.= 21.0, 52.0, 35.5; Rot.=    0, 5357, 4963
:>> Vcore = 2.06, 2.05; Volt. = 3.22, 4.89, 12.04, -11.77, -5.11
:>> Temp.= 21.0, 52.0, 35.5; Rot.=    0, 5113, 4963
:>> Vcore = 2.06, 2.05; Volt. = 3.22, 4.89, 11.98, -11.77, -5.11
:>>
:>>
:>> Yesterday these values were the other way around, with the CPUs exchanged ...
:>>
:>>
:>> Please tell me your opinion: should I exchange mainboard first or the suspected CPU?
:>>
:>> Thanks,
:>>
:>> oliver
:>>
:>> --
:>> MfG
:>> O. Hartmann
:>>
:>> ohartman@klima.physik.uni-mainz.de
:>> ----------------------------------------------------------------
:>> IT-Administration des Institut fuer Physik der Atmosphaere (IPA)
:>> ----------------------------------------------------------------
:>> Johannes Gutenberg Universitaet Mainz
:>> Becherweg 21
:>> 55099 Mainz
:>>
:>> Tel: +496131/3924662 (Maschinensaal)
:>> Tel: +496131/3924144
:>> FAX: +496131/3923532
:>>
:>>
:>> To Unsubscribe: send mail to majordomo@FreeBSD.org
:>> with "unsubscribe freebsd-questions" in the body of the message
:>>
:>
:>

--
MfG
O. Hartmann

ohartman@klima.physik.uni-mainz.de
----------------------------------------------------------------
IT-Administration des Institut fuer Physik der Atmosphaere (IPA)
----------------------------------------------------------------
Johannes Gutenberg Universitaet Mainz
Becherweg 21
55099 Mainz

Tel: +496131/3924662 (Maschinensaal)
Tel: +496131/3924144
FAX: +496131/3923532


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hardware" in the body of the message



