Date:      Mon, 22 Oct 2012 08:38:11 -0700
From:      Derek Kulinski <takeda@takeda.tk>
To:        Jeremy Chadwick <jdc@koitsu.org>
Cc:        freebsd-stable@FreeBSD.org, avg@FreeBSD.org
Subject:   Re: Problem reading vitals from Gigabyte H77-DH3H
Message-ID:  <35578786.20121022083811@takeda.tk>
In-Reply-To: <20121022130348.GA28302@icarus.home.lan>
References:  <20121022130348.GA28302@icarus.home.lan>

Hello Jeremy,

Monday, October 22, 2012, 6:03:49 AM, you wrote:

> I'm not subscribed to the FreeBSD lists any longer, but I did come
> across this thread via the web:

> http://lists.freebsd.org/pipermail/freebsd-stable/2012-October/070169.html

> Either (or both) of you are free to bounce a copy of my Email here to
> the list if you feel it'd benefit others.

> I have a lot of familiarity with hardware monitoring chips and
> interfacing with them (as the author of ports/sysutils/bsdhwmon).

> The H/W monitoring chip on that Gigabyte motherboard is either **not**
> the same chip, or is wired with resistors/pullups that differ from what
> the OpenBSD sensors framework code expects.  That is quite evident from
> the below.  Some of the labels are also very likely wrong.  I'll get to
> explaining how to fix that properly further down.

> Let me explain in detail one section at a time:

>> hw.sensors.it0.volt0: 1,42 VDC (VCORE_A)
>> hw.sensors.it0.volt1: 2,72 VDC (VCORE_B)

> The term "Vcore" refers to the CPU core voltage.  It is measured on a
> per-physical-CPU basis.  This software is assuming there are 2 physical
> CPUs (not cores, I'm talking about physical processors).

> VCORE_A may be correct (meaning 1.42V), however it depends on the CPU
> model.  Derek did not disclose this so I cannot tell you if 1.42V is
> considered "correct" or not.  Some models run at 1.2V, others 1.5V,
> others vary.

It is an i5-3470 3.2GHz quad core (the entire component list I used for
the build is here: http://pcpartpicker.com/p/koz3).
The CPU is not overclocked; I set "auto" for all settings of this kind
in the BIOS.

> VCORE_B is probably not VCORE_B at all.  However, worse: 2.72V does not
> look to be a correct/valid voltage no matter what (even if for an MCH or
> a southbridge).  So probably a calculation error or it's reading the
> wrong bits from the chip.

>> hw.sensors.it0.volt2: 2,70 VDC (+3.3V)

> This is also wrong -- either the voltage or the label.  There is no way
> your system would be stable if a +3.3V line was at +2.7V.  So another
> calculation error or reading wrong bits from the chip.

>> hw.sensors.it0.volt3: 4,60 VDC (+5V)

> This is probably also wrong, but it's hard to say.  +5V is relied on
> heavily throughout the entire system, so a 0.4V drop is pretty damn
> major.  So probably another calculation error or reading wrong bits from
> the chip.

>> hw.sensors.it0.volt4: 0,06 VDC (+12V)

> This is flat out completely wrong on numerous levels.

>> hw.sensors.it0.volt5: -5,08 VDC (Unused)

> No idea.  This could be -5V monitoring, but it depends.  Only Gigabyte
> would know.

>> hw.sensors.it0.volt6: -6,53 VDC (-12V)

> Also totally wrong (voltage and label).  So another calculation error or
> reading wrong bits from the chip.

>> hw.sensors.it0.volt7: 3,74 VDC (+5VSB)

> Also totally wrong (voltage and/or label).  "+5Vsb" stands for "+5V
> standby"; it's the +5V line that comes off the PSU and is *always on*,
> even when the motherboard is off.  It's what allows systems to power
> back up from sleep state.  So another calculation error or reading wrong
> bits from the chip.

>> hw.sensors.it0.volt8: 2,14 VDC (VBAT)

> Also totally wrong (voltage and/or label).  "VBAT" refers to the voltage
> of the CMOS battery, which should be +3.3V.  So another calculation
> error or reading wrong bits from the chip.
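
To illustrate what a "calculation error" can mean here: a monitoring chip
typically reports a raw ADC count, and software has to scale it by the ADC
step and by whatever resistor divider the board designer put in front of
the pin. The sketch below is only an illustration of that idea. The ADC
step, the raw count, and both sets of resistor values are hypothetical and
are not taken from any datasheet or from this Gigabyte board.

```python
# Hypothetical numbers throughout: the ADC step and resistor values are
# illustrative only, not taken from a datasheet or from this board.
ADC_STEP_V = 0.016  # assumed volts per ADC count


def pin_voltage(raw_count, adc_step=ADC_STEP_V):
    """Voltage actually present at the chip's input pin."""
    return raw_count * adc_step


def rail_voltage(raw_count, r_top, r_bottom, adc_step=ADC_STEP_V):
    """Undo a resistor divider wired rail -> r_top -> pin -> r_bottom -> ground."""
    return pin_voltage(raw_count, adc_step) * (r_top + r_bottom) / r_bottom


raw = 190  # one and the same raw register value

# Scaling that matches how the (imaginary) board is wired: 30k over 10k.
as_wired = rail_voltage(raw, r_top=30.0, r_bottom=10.0)

# Scaling a generic driver might assume instead: 6.8k over 10k.
as_assumed = rail_voltage(raw, r_top=6.8, r_bottom=10.0)

print(f"board wiring: {as_wired:.2f} V, driver assumption: {as_assumed:.2f} V")
```

The same register value yields two very different voltages depending on
which divider the software believes is on the board, which is why
per-board knowledge matters.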

> Here is what proper labels and a proper system should show, as an
> example:

> # bsdhwmon
> CPU1 Temperature           31 C
> System Temperature         35 C
> FAN1                        0 RPM
> FAN2                        0 RPM
> FAN3                        0 RPM
> FAN4                     2042 RPM
> FAN5                        0 RPM
> FAN6                     1875 RPM
> VcoreA                  1.106 V
> MCH Core                1.522 V
> -12V                  -12.288 V
> V_DIMM                  1.712 V
> +3.3V                   3.392 V
> +12V                   12.096 V
> 5Vsb                    5.070 V
> 5VDD                    5.118 V
> P_VTT                   1.142 V
> Vbat                    3.328 V

> The bottom line here is this: the problem with the sensors framework is
> that it has no concept of per-motherboard engineering (to my knowledge).
> Again, that is why I designed bsdhwmon the way I did -- I key off of
> SMBIOS string data because it's the only way to do things as reliably as
> possible.  Each motherboard model requires unique support.  Without
> that, voltage calculations are either wrong, or labels are completely
> wrong, or both.
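
A minimal sketch of the "key off SMBIOS strings" idea, assuming the board
maker and product strings have already been obtained (on FreeBSD they are
usually visible through kenv as smbios.planar.maker and
smbios.planar.product). This is not bsdhwmon's actual code; the table
contents, register numbers, scale factors, and the names Sensor, BOARDS,
and lookup_board are all hypothetical.

```python
from typing import List, NamedTuple, Optional


class Sensor(NamedTuple):
    label: str      # what the board vendor calls this input
    register: int   # hypothetical register offset on the monitoring chip
    scale: float    # hypothetical multiplier from pin voltage to rail voltage


# Hypothetical per-board table: every entry must come from vendor data,
# never from guessing based on the chip model alone.
BOARDS = {
    ("Example Vendor", "EXAMPLE-BOARD-1"): [
        Sensor("VcoreA", 0x20, 1.0),
        Sensor("+12V", 0x24, 4.0),
    ],
}


def lookup_board(maker: str, product: str) -> Optional[List[Sensor]]:
    """Return the sensor layout for an exactly matching board, else None."""
    return BOARDS.get((maker.strip(), product.strip()))


layout = lookup_board("Example Vendor", "EXAMPLE-BOARD-1")
unknown = lookup_board("Some Vendor", "Some Other Board")
print(layout[0].label if layout else "unsupported board")
print("unsupported board" if unknown is None else unknown)
```

The design choice is that an unknown board returns nothing at all rather
than falling back to generic labels and formulas, since a wrong reading is
worse than no reading.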


> If I could get within the bowels of Gigabyte and actually talk to a
> **real engineer** and not tech support, I could find out if their
> GA-H77-DS3H motherboard has SMBus tie-ins for their H/W monitoring chip.
> If it does, I **absolutely** could add PROPER support for it to
> bsdhwmon.

> However, regardless of that, it also requires the owner of the
> motherboard to be able to run the monitoring software provided by the
> vendor for the board (usually Windows software) as a "baseline"
> comparison -- or -- take a screenshot of the hardware monitoring details
> in the BIOS (or UEFI system) for comparison.  Sometimes a VERY HIGH
> RESOLUTION photo of the motherboard is helpful -- though sometimes this
> isn't useful because motherboard vendors actually use "emulation modes"
> of their Super I/O chips (e.g. Chip Z is installed on the board, but
> it's configured to emulate Chip X which the Chip company made 2 years
> ago).  I've found this on many Supermicro boards actually -- what's
> silkscreened on the chips says one thing but how the chip *behaves* is
> another.

Not exactly a screenshot, but I wrote down the values given by the BIOS:
CPU Vcore    1.044V
DRAM Voltage 1.524V
+3.3V        3.363V
+12V         12.168V
CPU Temp     33C
System Temp  30C

Please let me know if this is enough.
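
For anyone who wants to repeat the comparison, here is a rough sketch
that parses the sysctl lines quoted earlier (note the comma decimal
separator) and flags readings that differ from the BIOS values above by
more than a tolerance. The 10% tolerance is arbitrary, and pairing
VCORE_A with the BIOS "CPU Vcore" value is an assumption. The function
names are made up for this example.

```python
import re

# Three of the readings quoted earlier in this message.
SYSCTL_OUTPUT = """\
hw.sensors.it0.volt0: 1,42 VDC (VCORE_A)
hw.sensors.it0.volt2: 2,70 VDC (+3.3V)
hw.sensors.it0.volt4: 0,06 VDC (+12V)
"""

# Values written down from the BIOS screen. Mapping "CPU Vcore" to the
# VCORE_A label is an assumption.
BIOS_BASELINE = {"VCORE_A": 1.044, "+3.3V": 3.363, "+12V": 12.168}

LINE_RE = re.compile(r"^(\S+):\s*(-?\d+[.,]\d+)\s+VDC\s+\((.+)\)$")


def parse_voltages(text):
    """Map sensor label -> voltage, accepting ',' or '.' as decimal mark."""
    readings = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            readings[match.group(3)] = float(match.group(2).replace(",", "."))
    return readings


def deviations(readings, baseline, tolerance=0.10):
    """Return label -> relative error for readings outside the tolerance."""
    flagged = {}
    for label, expected in baseline.items():
        if label in readings:
            rel = (readings[label] - expected) / expected
            if abs(rel) > tolerance:
                flagged[label] = rel
    return flagged


for label, rel in deviations(parse_voltages(SYSCTL_OUTPUT), BIOS_BASELINE).items():
    print(f"{label}: {rel:+.0%} relative to BIOS")
```

With these inputs all three readings fall outside the tolerance, which
matches the conclusion that the driver's numbers cannot be trusted on
this board.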

As for the picture of the motherboard, this one
(http://www.nix.ru/autocatalog/motherboards_gigabyte/135869_2245_draft.jpg)
looks way better than any of my pictures.

It is revision 1.0. Gigabyte also seems to have a rev 1.1, but 1.0 is
the one I use.

> But sometimes even WITH proper documentation from the vendor there are
> unexplained issues.  Two examples taken from bsdhwmon's doc/BUGS:

> Winbond W83792D: +5V Vcc is incorrect
> =======================================
> Currently, boards which use the Winbond W83792D H/W monitoring IC will
> have their +5V voltage shown incorrectly.

> I've mailed Supermicro to try and find out why the calculation formula
> is wrong (since what I'm following comes from Winbond), but as of this
> writing have received no response.

> I have also looked at the Linux lm-sensors project, but the code is
> quite "spaghetti" -- it's hard to discern what the calculation values
> are, and if they're the same for all W83792D systems.


> Winbond W83792D: FAN3 RPMs may be inaccurate/high
> ===================================================
> I've received a single (isolated) report involving a Supermicro P8SCi
> board reporting absurdly high values for FAN3.  Example:

> FAN1                        0 RPM
> FAN2                     2909 RPM
> FAN3                    84375 RPM
> FAN4                        0 RPM
> FAN5                        0 RPM

> Further executions of bsdhwmon did not exhibit this problem.  However,
> I take the report seriously, as it could indicate a strange bug in
> bsdhwmon, or possibly a bug in the Winbond W83792D chipset.  At this
> time I have not been able to determine the root cause, however the
> user had his fan RPM configuration in the system BIOS set to
> "3-pin Server" rather than "Disabled" (which runs the fans at full
> speed).  This could be a bug in the Winbond chipset, but I simply
> don't know.
> ------------
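
As a side note on the FAN3 figure: chips in this family conventionally
derive RPM from a tachometer count as 1,350,000 / (count x divisor). The
sketch below assumes that conventional formula applies. It is not taken
from bsdhwmon's source, and the count/divisor pairs are hypothetical
values chosen to reproduce the numbers in the report.

```python
# Conventional tachometer formula for this family of monitoring chips.
# Assumed here for illustration; not taken from bsdhwmon's source.
FAN_CLOCK = 1_350_000


def fan_rpm(count, divisor):
    """Convert an 8-bit tachometer count and a clock divisor to RPM."""
    if count in (0, 0xFF):
        # 0 would divide by zero; 0xFF conventionally means "no pulses seen".
        return 0
    return FAN_CLOCK // (count * divisor)


# Hypothetical register values that reproduce the two reported numbers.
print(fan_rpm(58, 8))  # a plausible slow-fan reading
print(fan_rpm(16, 1))  # count x divisor of only 16
```

Under that assumption, 84375 RPM is exactly what a count-times-divisor
product of 16 produces, so one possible explanation is a momentarily tiny
count or a mismatched divisor. That does not establish the root cause; it
only shows the reported number is consistent with the formula.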

> I refuse to interface with Super I/O or H/W monitoring chips that use
> the classic ISA interface (/dev/io) because it's an extremely risky
> interface.  You can crash and lock up a system very very easily with
> this model.  The wrong I/O port or wrong bit set in the wrong sub-reg
> and pow, the system is in a weird state.  It's a lot more difficult with
> SMBus given the unique assignment of a slave device address per-device.

> Don't get me started on what Linux lm-sensors looks like either.  Good
> god what a mess.  Does it work?  Yeah, it works.  But it's just such a
> garbled mess of code and "configs" and some abstract strangeness.  It
> really doesn't read well, and is not commented well to boot.

> I wish I could help solve this in some way for you guys (without using
> sensors).  I've spent way too many years doing H/W monitoring "stuff",
> and concluded long ago that on FreeBSD H/W monitoring is absolutely
> doable but we need support from vendors on a per-motherboard basis.
> Supermicro happens to be one of the few vendors who is quite good about
> this, barring the Winbond W83792D +5V Vcc problem.

> The biggest problem: this kind of support/effort is quite literally a
> full-time job.  Finding/getting in contact with engineers deep within
> the bowels of companies is the hardest part.

> P.S. -- Question for Andriy: I thought it was established long ago that
> none of this monitoring should be done in the kernel?  Were you around
> when someone took the time to port the OpenBSD sensors framework to
> FreeBSD, and it resulted in a *massive* discussion and backlash from
> FreeBSD kernel committers stating "this should not go in the kernel?"
> Now I see this, and mention of an it(4) driver...?  What exactly is
> going on?  To put it in California-style: "dude.  This REALLY pisses me
> off.  WTF is going on over there?"




-- 
Best regards,
 Derek                            mailto:takeda@takeda.tk

Always remember you're unique, just like everyone else.



