Date: Mon, 21 Jan 2008 22:35:18 -0500
From: "Alexandre \"Sunny\" Kovalenko" <alex.kovalenko@verizon.net>
To: Daniel Eischen <deischen@freebsd.org>
Cc: acpi@freebsd.org
Subject: Re: How to disable acpi thermal?
Message-ID: <1200972918.897.47.camel@RabbitsDen>
In-Reply-To: <Pine.GSO.4.64.0801211732040.805@sea.ntplx.net>
References: <Pine.GSO.4.64.0801142156360.24324@sea.ntplx.net>
 <1200369199.2054.38.camel@RabbitsDen>
 <Pine.GSO.4.64.0801151525160.29868@sea.ntplx.net>
 <1200844521.33164.18.camel@RabbitsDen>
 <Pine.GSO.4.64.0801211510290.805@sea.ntplx.net>
 <Pine.GSO.4.64.0801211732040.805@sea.ntplx.net>

On Mon, 2008-01-21 at 17:39 -0500, Daniel Eischen wrote:
> On Mon, 21 Jan 2008, Daniel Eischen wrote:
>
> > On Sun, 20 Jan 2008, Alexandre "Sunny" Kovalenko wrote:
> >
> >>
> >> On Tue, 2008-01-15 at 15:34 -0500, Daniel Eischen wrote:
> >>> [ Redirected from -current ]
> >>>
> >>>
> >>> I posted the acpidump here:
> >>>
> >>> http://people.freebsd.org/~deischen/stl2.iasl
> >>>
> >>> The problem is that acpi_thermal keeps shutting down the system
> >>> after 2 minutes into a buildkernel. The system has no load other
> >>> than the buildkernel at the time it shuts down.
> >>>
> >>> The system is a Intel STL2 Tupelo motherboard with 1 CPU, the
> >>> other CPU socket being occupied by a CPU terminator thingy.
> >>> I uncovered the rackmount system and watched it while building
> >>> a kernel. With the cover off the acpi monitored temperature
> >>> went to 107C and stayed there. It only took a minute or two
> >>> to get there. I felt around inside the chassis and nothing
> >>> was even near being to warm or hot. With the cover on, the
> >>> temperature goes to 111/112C before being shutdown by acpi_thermal
> >>> (the limit being 110C). There is no way anything in that
> >>> chassis is anywhere near 100C. I've disabled acpi_thermal
> >>> for now, but it'd be nice to get a better fix.
> >>>
> >>> Any ideas?
> >>>
> >> Firstly, sorry for the delay in answer -- daytime job decided to kick in
> >> with the vengeance.
> >>
> >> I took a look at the ASL and it does seem that this thing has embedded
> >> controller and that is where _TMP method gets its temperature reading
> >> from (this being conditional on the CPU present in the socket --
> >> otherwise you get 5 degrees Celsius, hardcoded in the ASL).
> >>
> >> So the questions are:
> >>
> >> -- does temperature in TZ2 grow over time as well? (TZ1 should stay at
> >> 5C all the time).
> >
> > No, it stays around the same.
> > I saw it go to 38 from 35 in
> > the same time that TZ0 went to over 110C. I didn't see it
> > get any higher than that.

That is what bothers me more than slightly -- the _TMP methods for tz0
and tz2 (see more on tz1 below) call the same function (EGTV) with a
different first parameter. As far as I can tell (and I did mention
before that I am not an expert in the area) this value is, in turn,
written to one of the EC registers, and then values are read from other
EC registers and given back to the caller as the temperature, AC0 value
and CRT value respectively. Since the call path is identical in both
cases, it is quite possible that the erratic reading is coming from the
actual sensor, as someone in this thread suggested.

I was hoping that we would be able to follow the call trace in the debug
ACPI output, but apparently I do not remember it that well (I was
playing these games about two years ago). I will see if I can cobble
together the necessary combination of level and layer settings here
before asking you to do anything else -- I do apologize for not doing
my homework properly.

>
> One additional note, this is a dual CPU system with only one
> CPU in it, and I am not running an SMP kernel. I was looking
> at the iasl, and noticed this for TZ0:
>
>     ThermalZone (TZC0)
>     {
>         Method (_TMP, 0, NotSerialized)
>         {
> -->         If (LNotEqual (And (\_SB.NCPU, 0x01), 0x01))
>             {
>                 Return (\_SB.PCI0.ISA0.EC0.TC2K (0x05))
>             }
>             Else
>             {
>                 Store (\_SB.PCI0.ISA0.EC0.EGTV (0x21, 0x00), TZT0)
>                 If (LEqual (TZT0, CTC0))
>                 {
>                     Add (TZT0, 0x0A, TZT0)
>                 }
>
>                 Return (TZT0)
>             }
>         }
>
> Is it possible that my configuration with only one CPU
> is confusing things?
>
As far as I can judge from the similar code in TZC1 and the stable 5C
temperature in the corresponding thermal zone (tz1), this merely checks
for the presence of the chip in the socket and returns a stable (and
bogus) temperature when there is none.
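
For readers who do not read ASL, the branch in the quoted _TMP method
can be modelled roughly as below. This is only a sketch under stated
assumptions, not the firmware's behaviour: the function names are
hypothetical, and TC2K is assumed to convert Celsius to the tenths of a
kelvin that ACPI thermal zones report (its body is not shown in this
thread). The EC reading and CTC0 are passed in as plain numbers.

```python
# Hypothetical model of the quoted TZC0 _TMP method; not the real firmware.

def celsius_to_decikelvin(celsius):
    """Assumed role of TC2K: Celsius to tenths of a kelvin."""
    return celsius * 10 + 2732


def tzc0_tmp(ncpu_mask, ec_reading, ctc0):
    """Return what _TMP would hand back for thermal zone 0.

    ncpu_mask  -- stand-in for NCPU (bit 0 set when CPU 0 is present)
    ec_reading -- stand-in for the value returned by EGTV (0x21, 0x00)
    ctc0       -- stand-in for the CTC0 object
    """
    if (ncpu_mask & 0x01) != 0x01:
        # Socket 0 is empty: report a fixed 5 degrees Celsius.
        return celsius_to_decikelvin(5)
    tzt0 = ec_reading
    if tzt0 == ctc0:
        # Mirrors Add (TZT0, 0x0A, TZT0) in the ASL.
        tzt0 += 0x0A
    return tzt0
```

With bit 0 of the mask clear, the model returns the fixed 5C value
whatever the EC reports, which matches the stable reading seen in tz1.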
If your system is capable of running with the CPU in socket 1 and the
placeholder in socket 0, I would suspect that your tz0 will be stuck at
5C and your tz1 will demonstrate some dynamics.

On a slightly different note -- if you don't mind exploring another
potential dead end, I have attached a patch for your ASL which fixes the
situation where the _OFF method of one of the fans grabs a mutex and
never releases it. You can recompile your ASL and override it on boot.
No promises though ;)

--
Alexandre "Sunny" Kovalenko

Attachment: stl2.iasl.patch

--- stl2.iasl.orig	2008-01-21 22:19:56.000000000 -0500
+++ stl2.iasl	2008-01-21 22:24:49.000000000 -0500
@@ -105,7 +105,7 @@
  * Creator ID "MSFT"
  * Creator Revision 0x0100000A (16777226)
  */
-DefinitionBlock ("/tmp/acpidump.aml", "DSDT", 1, "INTEL ", "024B ", 0x00000001)
+DefinitionBlock ("stl2.aml", "DSDT", 1, "INTEL ", "024B ", 0x00000001)
 {
     Name (TBUF, Buffer (0x04)
     {
@@ -5152,6 +5152,7 @@
                     Store (0x01, Local0)
                 }
 
+                Release (ECX0)
                 Return (Local0)
             }
             Else
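
The kind of bug the second hunk addresses, a return path that leaves a
mutex held, can be illustrated with the sketch below. It is an analogy
only: the function names and the ec_ok parameter are hypothetical, the
lock stands in for the ECX0 mutex, and the control flow is not copied
from the fan's _OFF method, most of which is not shown in this thread.

```python
import threading

# Analogy for a method that returns while still holding a mutex.
# Names and structure are hypothetical, not taken from the DSDT.


def fan_off_leaky(lock, ec_ok):
    """Acquire the lock, but forget to release it on the early return."""
    lock.acquire()
    if ec_ok:
        result = 1
        return result  # bug: the lock is still held here
    lock.release()
    return 0


def fan_off_fixed(lock, ec_ok):
    """Same logic, with the release added before the early return."""
    lock.acquire()
    if ec_ok:
        result = 1
        lock.release()  # counterpart of the added Release (ECX0)
        return result
    lock.release()
    return 0
```

After fan_off_leaky(lock, True) the lock stays held, so the next caller
that tries to acquire it would block; fan_off_fixed leaves it free.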