Date:      Tue, 15 Jan 2008 16:10:57 -0500 (EST)
From:      Daniel Eischen <deischen@freebsd.org>
To:        Kevin Oberman <oberman@es.net>
Cc:        acpi@freebsd.org
Subject:   Re: How to disable acpi thermal?
Message-ID:  <Pine.GSO.4.64.0801151606550.29868@sea.ntplx.net>
In-Reply-To: <20080115210206.849E24500E@ptavv.es.net>
References:  <20080115210206.849E24500E@ptavv.es.net>

On Tue, 15 Jan 2008, Kevin Oberman wrote:

>> Date: Tue, 15 Jan 2008 15:34:41 -0500 (EST)
>> From: Daniel Eischen <deischen@freebsd.org>
>> Sender: owner-freebsd-acpi@freebsd.org
>>
>> [ Redirected from -current ]
>>
>> On Mon, 14 Jan 2008, Alexandre "Sunny" Kovalenko wrote:
>>
>>>
>>> On Mon, 2008-01-14 at 21:56 -0500, Daniel Eischen wrote:
>>>>
>>>> Thermal zone 0 skyrockets past 110C in a couple of minutes
>>>> when trying to build a kernel.  All the other zones stay
>>>> relatively static.  I suspect something is wrong somewhere
>>>> because this machine is very lightly loaded and has never
>>>> had a problem until now.  I just upgraded it from 4.x to
>>>> 7.0.
>>>
>>> It need not be bogus -- if I turn off the fan on my ThinkPad it will
>>> overheat and shut itself down within a couple of minutes of buildworld,
>>> starting from a relatively cool state. From the look of the stuff below
>>> your fan should kick in no later than 10 seconds after tz0 reaches 77C.
>>> Do you hear it running before shutdown? If yes, maybe lowering threshold
>>> in AC0 down from 77C will help. If not -- you will need to figure out
>>> who is supposed to turn on the fan. You can dump your ASL (instructions
>>> in the handbook) and post it someplace accessible -- I will take a look
>>> and maybe spot something interesting, but, being far from an expert in
>>> the field, I do not promise too much.
>>
>> I posted the acpidump here:
>>
>>    http://people.freebsd.org/~deischen/stl2.iasl
>>
>> The problem is that acpi_thermal keeps shutting down the system
>> after 2 minutes into a buildkernel.  The system has no load other
>> than the buildkernel at the time it shuts down.
>>
>> The system is an Intel STL2 Tupelo motherboard with 1 CPU, the
>> other CPU socket being occupied by a CPU terminator thingy.
>> I uncovered the rackmount system and watched it while building
>> a kernel.  With the cover off the acpi monitored temperature
>> went to 107C and stayed there.  It only took a minute or two
>> to get there.  I felt around inside the chassis and nothing
>> was even close to being warm or hot.  With the cover on, the
>> temperature goes to 111/112C before being shut down by acpi_thermal
>> (the limit being 110C).  There is no way anything in that
>> chassis is anywhere near 100C.  I've disabled acpi_thermal
>> for now, but it'd be nice to get a better fix.
>>
>> Any ideas?
>
> Bad CPU or bad support chip? The temperature on modern CPUs is measured
> on the silicon. There is usually a junction that is simply brought out
> to a pair of pins and an external device "reads" the temperature.
>
> It's possible that the chip has a bad junction or support chip that is
> providing bogus information. On most processors it looks like there is
> a thermal "crowbar" that will kill power if the temperature reaches about
> 135C or something near to that. (I have not looked at a spec sheet for
> any CPUs in about three years, so things might have changed.) That is
> outside the control of acpi_thermal, so turning it off may remove alarms
> and prevent a shutdown at _CRT, but that won't prevent a shutdown at the
> higher "meltdown" temperature. That one is intended for loose/missing
> heat sinks or other major thermal failures.

We'll see.  I'm doing a buildworld with acpi_thermal disabled,
but with it disabled I can no longer see what the monitored
temperature is.

> It is also possible that there is a BIOS bug that is reporting the
> temperature incorrectly. That seems less likely as it would probably be
> noticed by a lot of folks.
>
> Is there any chance that the heat sink is loose or improperly attached?
> (It happened to me a few years ago.)

Nope, I looked at it, felt it, etc.  The CPU isn't hot at all.
110C is well past the boiling point of water, so I should be able
to feel at least some heat from and around the CPU if it were
really running hot.
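
For reference, here is a rough sketch of how a raw ACPI reading maps to the
Celsius figures quoted in this thread.  ACPI expresses thermal zone values
such as _TMP and _CRT in tenths of a kelvin.  The function names, the sample
raw values, and the choice to treat "at or above _CRT" as critical are
assumptions made for illustration; none of this is taken from the
acpi_thermal driver source.

```python
# ACPI thermal objects such as _TMP and _CRT are expressed in tenths of a kelvin.
# Assumption: 0C is taken as 273.2 K, i.e. 2732 tenths of a kelvin.
DECIKELVIN_AT_0C = 2732


def decikelvin_to_celsius(raw: int) -> float:
    """Convert a raw ACPI temperature (tenths of a kelvin) to degrees Celsius."""
    return (raw - DECIKELVIN_AT_0C) / 10.0


def exceeds_critical(tmp_raw: int, crt_raw: int) -> bool:
    """Return True when the reading is at or above the critical trip point.

    Assumption: "at or above" is used here; a real driver may compare differently.
    """
    return tmp_raw >= crt_raw


if __name__ == "__main__":
    crt = 3832  # hypothetical raw _CRT value, equal to 110.0C
    # Hypothetical raw readings matching the 107C and 111C mentioned above.
    for raw in (3802, 3842):
        celsius = decikelvin_to_celsius(raw)
        print(f"{celsius:.1f}C critical={exceeds_critical(raw, crt)}")
```

With the cover off (107C) the reading stays under the 110C limit, which
matches the system staying up; with the cover on (111C) it crosses the
limit, which matches the shutdown.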

-- 
DE


