From owner-freebsd-stable@FreeBSD.ORG Sat Mar 16 18:24:24 2013
Message-Id: <201303161824.r2GIO53r006067@gw.catspoiler.org>
Date: Sat, 16 Mar 2013 11:24:05 -0700 (PDT)
From: Don Lewis
Subject: Re: amdtemp does not find my CPU.
To: jim@ohlste.in
In-Reply-To: <5144A423.2060007@ohlste.in>
Cc: jdc@koitsu.org, rpaulo@FreeBSD.org, zkolic@sbb.rs, nork@FreeBSD.org,
 freebsd-stable@FreeBSD.org, jkim@FreeBSD.org

On 16 Mar, Jim Ohlstein wrote:
> On 3/16/13 2:20 AM, Jeremy Chadwick wrote:
>> On Fri, Mar 15, 2013 at 03:16:19PM -0400, Jim Ohlstein wrote:
>>> On 3/15/13 12:15 PM, Zoran Kolic wrote:
>>>> After I installed 9.1 amd64 on node with amd 8120,
>>>> I was not able to read temperatures out of the box.
>>>> I fetched source for head module and compiled. And
>>>> loaded module. Still nothing. I assume my cpu is
>>>> a bit different.
>>>> Best regards
>>>
>>> The module from head "works" for me with an 8120 on 9.1 stable (r247893)
>>> though the results are inconsistent. I am not certain of how useful they
>>> are.
>>>
>>> # sysctl hw.model
>>> hw.model: AMD FX(tm)-8120 Eight-Core Processor
>>>
>>> # kldstat | grep amd
>>> 5 1 0xffffffff8183e000 1043 amdtemp.ko
>>>
>>> # sysctl -a | grep dev.amdtemp
>>> dev.amdtemp.0.%desc: AMD CPU On-Die Thermal Sensors
>>> dev.amdtemp.0.%driver: amdtemp
>>> dev.amdtemp.0.%parent: hostb4
>>> dev.amdtemp.0.sensor_offset: 0
>>> dev.amdtemp.0.core0.sensor0: 47.7C
>>>
>>> Here are results taken at 0.1 second intervals using a shell script:
>>>
>>> dev.amdtemp.0.core0.sensor0: 42.1C
>>> dev.amdtemp.0.core0.sensor0: 42.2C
>>> dev.amdtemp.0.core0.sensor0: 42.0C
>>> dev.amdtemp.0.core0.sensor0: 42.1C
>>> dev.amdtemp.0.core0.sensor0: 41.8C
>>> dev.amdtemp.0.core0.sensor0: 41.7C
>>> dev.amdtemp.0.core0.sensor0: 51.1C
>>> dev.amdtemp.0.core0.sensor0: 51.0C
>>> dev.amdtemp.0.core0.sensor0: 50.7C
>>> dev.amdtemp.0.core0.sensor0: 50.5C
>>> dev.amdtemp.0.core0.sensor0: 50.1C
>>> dev.amdtemp.0.core0.sensor0: 49.8C
>>> dev.amdtemp.0.core0.sensor0: 49.5C
>>> dev.amdtemp.0.core0.sensor0: 49.2C
>>> dev.amdtemp.0.core0.sensor0: 49.2C
>>>
>>>
>>> and again:
>>>
>>> dev.amdtemp.0.core0.sensor0: 41.5C
>>> dev.amdtemp.0.core0.sensor0: 41.2C
>>> dev.amdtemp.0.core0.sensor0: 40.8C
>>> dev.amdtemp.0.core0.sensor0: 40.8C
>>> dev.amdtemp.0.core0.sensor0: 41.0C
>>> dev.amdtemp.0.core0.sensor0: 41.3C
>>> dev.amdtemp.0.core0.sensor0: 41.6C
>>> dev.amdtemp.0.core0.sensor0: 41.3C
>>> dev.amdtemp.0.core0.sensor0: 54.0C
>>> dev.amdtemp.0.core0.sensor0: 53.7C
>>> dev.amdtemp.0.core0.sensor0: 53.3C
>>> dev.amdtemp.0.core0.sensor0: 53.1C
>>> dev.amdtemp.0.core0.sensor0: 52.7C
>>> dev.amdtemp.0.core0.sensor0: 52.3C
>>> dev.amdtemp.0.core0.sensor0: 52.1C
>>> dev.amdtemp.0.core0.sensor0: 51.7C
>>> dev.amdtemp.0.core0.sensor0: 51.5C
>>>
>>> You can see during each series there are sudden increases of over 9C and
>>> almost 13C respectively.
>>>
>>> The same effect is seen if I track any of the individual cores with
>>> "dev.cpu.[0-7].temperature".
>>> Here's an example with a 9C jump in 0.1 second.
>>>
>>> dev.cpu.3.temperature: 41.5C
>>> dev.cpu.3.temperature: 41.5C
>>> dev.cpu.3.temperature: 41.7C
>>> dev.cpu.3.temperature: 41.7C
>>> dev.cpu.3.temperature: 41.3C
>>> dev.cpu.3.temperature: 41.0C
>>> dev.cpu.3.temperature: 40.7C
>>> dev.cpu.3.temperature: 49.8C
>>> dev.cpu.3.temperature: 49.5C
>>> dev.cpu.3.temperature: 49.2C
>>> dev.cpu.3.temperature: 48.8C
>>> dev.cpu.3.temperature: 48.6C
>>> dev.cpu.3.temperature: 48.2C
>>> dev.cpu.3.temperature: 48.0C
>>>
>>> I don't have hands on access to this box as it's in a datacenter 1000
>>> miles from me, but the techs there had a look and all "seems to be OK".
>>
>> 1. While it's certainly possible the DTS reading routines and/or the
>> calculation formulas may be wrong in amdtemp(4), possibly for your model
>> of CPU, it is also certainly possible that what you're seeing is normal
>> and fully justified. This is especially the case for the
>> dev.cpu.X.temperature nodes on the K8 family.
>>
>> Respectfully, not combatively nor dismissively: you've not provided a
>> comparison base to prove there's an issue. You would need to provide
>> data from Linux (I forget what daemon/tool they have to get this) or
>> Windows (Core Temp).
>
> Respectfully, not combatively nor dismissively: I hadn't attempted to
> "prove" anything. I said: "I am not certain of how useful they [the
> readings] are.". I had merely provided some observational data as an
> aside to the fact that yes, indeed, the module provides readings for me
> on the 8120. This was in direct response to Zoran's issue with this
> module and that processor model.
>
> This started, for me, when I looked at a graph of average core
> temperatures taken at 30 second intervals on two different machines
> using Zabbix. The fluctuations were visibly (I know that's not
> scientific "proof") more wild on this server than on another using
> the amdtemp module from 9 stable.
>
> I don't have access to another server with this model CPU on any other
> OS, or even on this OS, so I cannot provide the data to "prove" this is
> an issue according to your criteria. However, I will provide comparative
> data from the other machine with the module from stable and with the
> module from head.
>
>
> Full data taken now:
>
> # sysctl hw.model
> hw.model: AMD FX(tm)-8120 Eight-Core Processor
>
> Using the module from head:
>
> http://pastebin.com/wqQ0FLq3
>
> Note the big change between lines 34 and 35.

My FX-4100 behaves the same way. I noticed it because on an idle system
	sysctl -a | grep amdtemp
would read consistently higher than
	sysctl dev.amdtemp

I think the thermal sensor in this AMD CPU family has a much faster
response time so that it is more sensitive to temperature changes caused
by CPU load over the short term. Going from idle to 100% CPU load for
even 0.1 seconds and then back to idle is likely to change the die
temperature a lot, but will probably only have a negligible effect on
the heat sink temperature. The faster response time may be needed to
support AMD Turbo CORE.
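
For anyone who wants to check their own samples for the same kind of step,
here is a rough sketch in Python that finds the largest rise between
consecutive readings. The sample data is Jim's first 0.1-second series as
quoted above; the names parse_readings and largest_jump are made up for this
illustration and are not part of amdtemp(4) or sysctl(8). It assumes one
"name: NN.NC" reading per line and at least two readings.

```python
import re

# Jim's first series of readings, copied from the quoted message above.
SAMPLE = """\
dev.amdtemp.0.core0.sensor0: 42.1C
dev.amdtemp.0.core0.sensor0: 42.2C
dev.amdtemp.0.core0.sensor0: 42.0C
dev.amdtemp.0.core0.sensor0: 42.1C
dev.amdtemp.0.core0.sensor0: 41.8C
dev.amdtemp.0.core0.sensor0: 41.7C
dev.amdtemp.0.core0.sensor0: 51.1C
dev.amdtemp.0.core0.sensor0: 51.0C
dev.amdtemp.0.core0.sensor0: 50.7C
dev.amdtemp.0.core0.sensor0: 50.5C
dev.amdtemp.0.core0.sensor0: 50.1C
dev.amdtemp.0.core0.sensor0: 49.8C
dev.amdtemp.0.core0.sensor0: 49.5C
dev.amdtemp.0.core0.sensor0: 49.2C
dev.amdtemp.0.core0.sensor0: 49.2C
"""

# Matches the trailing "NN.NC" value of a sysctl temperature line.
READING = re.compile(r":\s*(-?\d+(?:\.\d+)?)C\s*$")


def parse_readings(text):
    """Return the temperatures (in C) found in sysctl-style output lines."""
    temps = []
    for line in text.splitlines():
        match = READING.search(line)
        if match:
            temps.append(float(match.group(1)))
    return temps


def largest_jump(temps):
    """Return (index, rise) for the biggest rise between adjacent samples.

    index is the 0-based position of the later sample of the pair.
    Requires at least two samples.
    """
    rises = [
        (i, round(later - earlier, 1))
        for i, (earlier, later) in enumerate(zip(temps, temps[1:]), start=1)
    ]
    return max(rises, key=lambda pair: pair[1])


temps = parse_readings(SAMPLE)
index, rise = largest_jump(temps)
print(f"largest rise: {rise:.1f}C at sample {index}")
```

On the series above this reports a 9.4C rise at sample 6 (counting from 0),
which matches the "over 9C" step Jim described. It only measures the step
between adjacent samples; it says nothing about whether the sensor or the
driver is responsible for it.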