From: "Hartmann, O."
Date: Wed, 28 Mar 2001 00:02:04 +0200 (CEST)
Delivered-To: freebsd-stable@freebsd.org
Subject: Overheated PIII in SMP system

Dear Sirs,

I need a little help and some technical hints. One of our SMP servers, a FreeBSD 4.-RC box, fails to build world: the build dies with signal 11. The cause is an overheating Intel Pentium III/600MHz with the Katmai core. The mainboard is an ASUS P2B-D. The rear (outer) CPU slot is CPU 0; the inner slot is CPU 1. Today I swapped the two CPUs and fitted both with expensive fans and coolers. The case has very good air circulation, and the rear is ventilated by two additional 80mm fans.

Scenario: CPU 1, in the latest configuration the inner one, runs at about 32 to 38 degrees Celsius. The outer one, CPU 0, runs at 50 degrees and up! Yesterday I took measurements with the CPUs swapped and found that CPU 1, the inner one, was at 50 degrees and the outer one, CPU 0, at only 30 to 40 degrees.

I switched the kernel from SMP to UP and ran the system while building world. Neither CPU went above 35 degrees Celsius! Then I built an SMP kernel again and tried to start make world. CPU 0, the outer one, now sits constantly at about 50 degrees Celsius.

Because I replaced the fans and coolers with better ones, one fan now sits very close to the SECC-2 case of the neighbouring CPU 1 (the inner one). This obstructed fan belongs to CPU 0, the overheated one. But yesterday this very CPU sat in the slot of the "cooler" CPU, the inner one, so I do not think this is really a heatsink problem. I think the problem lies either in the mainboard (maybe some weakness in voltage regulation? But then why only with the SMP kernel and not with the UP kernel?) or in the CPU itself. I have switched to the UP kernel again to see whether the temperature decreases.

This is the current output of "healthd -d -I" and "heat", reflecting the configuration at this moment:

HEAT:

    System Temperature   69F (21.0C)
    CPU1 Temperature    125F (52.0C)
    CPU2 Temperature     95F (35.5C)
    FAN2                314 RPMs (functional)
    FAN3                312 RPMs (functional)

HEALTHD:

    ************************
    * Hardware Information *
    ************************
    Asus: AS97127F
    ************************

    Temp.= 21.0, 52.0, 35.5; Rot.= 0, 5113, 4821
     Vcore = 2.08, 2.05; Volt. = 3.22, 4.92, 12.04, -11.77, -5.11
    Temp.= 21.0, 52.0, 35.5; Rot.= 0, 5357, 4963
     Vcore = 2.06, 2.05; Volt. = 3.22, 4.89, 12.04, -11.77, -5.11
    Temp.= 21.0, 52.0, 35.5; Rot.= 0, 5113, 4963
     Vcore = 2.06, 2.05; Volt. = 3.22, 4.89, 11.98, -11.77, -5.11

Yesterday these values were the other way round, with the CPUs exchanged ...

Please tell me your opinion: should I replace the mainboard first, or the suspect CPU?

Thanks,
oliver

--
Regards,
O. Hartmann
ohartman@klima.physik.uni-mainz.de
----------------------------------------------------------------
IT-Administration des Institut fuer Physik der Atmosphaere (IPA)
----------------------------------------------------------------
Johannes Gutenberg Universitaet Mainz
Becherweg 21
55099 Mainz

Tel: +496131/3924662 (Maschinensaal)
Tel: +496131/3924144
FAX: +496131/3923532
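
For comparing the UP and SMP runs it may help to log the healthd readings and reduce them to a couple of numbers per run. The following is only a rough sketch in Python: the sample text is copied from the healthd output quoted above, the function names are made up for illustration, and it assumes that the second and third "Temp." values are the two CPU sensors and that the two "Vcore" values belong to the same two CPUs in the same order. That mapping is a guess and is not confirmed by the healthd output itself.

```python
import re

# Sample lines copied from the healthd output quoted in the message.
SAMPLE = """\
Temp.= 21.0, 52.0, 35.5; Rot.= 0, 5113, 4821
Vcore = 2.08, 2.05; Volt. = 3.22, 4.92, 12.04, -11.77, -5.11
Temp.= 21.0, 52.0, 35.5; Rot.= 0, 5357, 4963
Vcore = 2.06, 2.05; Volt. = 3.22, 4.89, 12.04, -11.77, -5.11
Temp.= 21.0, 52.0, 35.5; Rot.= 0, 5113, 4963
Vcore = 2.06, 2.05; Volt. = 3.22, 4.89, 11.98, -11.77, -5.11
"""

TEMP_RE = re.compile(r"Temp\.=\s*([-\d.]+),\s*([-\d.]+),\s*([-\d.]+)")
VCORE_RE = re.compile(r"Vcore\s*=\s*([-\d.]+),\s*([-\d.]+)")


def parse_healthd(text):
    """Pair each Temp line with the Vcore line that follows it."""
    samples = []
    current = None
    for line in text.splitlines():
        t = TEMP_RE.search(line)
        if t:
            board, cpu_a, cpu_b = (float(x) for x in t.groups())
            # Assumption: value 1 is the board sensor, values 2 and 3 the CPUs.
            current = {"board": board, "cpu_a": cpu_a, "cpu_b": cpu_b}
            continue
        v = VCORE_RE.search(line)
        if v and current is not None:
            # Assumption: the Vcore values are in the same CPU order as Temp.
            current["vcore_a"], current["vcore_b"] = (float(x) for x in v.groups())
            samples.append(current)
            current = None
    return samples


def summarize(samples):
    """Mean CPU temperature gap and largest Vcore gap over all samples."""
    n = len(samples)
    return {
        "samples": n,
        "mean_temp_delta": sum(s["cpu_a"] - s["cpu_b"] for s in samples) / n,
        "max_vcore_delta": max(abs(s["vcore_a"] - s["vcore_b"]) for s in samples),
    }


if __name__ == "__main__":
    print(summarize(parse_healthd(SAMPLE)))
```

For the three samples quoted above, the gap between the two CPU sensors is 16.5 degrees Celsius, while the two Vcore readings differ by at most about 0.03 V. Running the same reduction on readings taken under the UP kernel would show whether the gap really closes there.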