Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 02 Jul 2012 14:17:52 -0700
From:      Sean Bruno <seanbru@yahoo-inc.com>
To:        Andriy Gapon <avg@freebsd.org>
Cc:        "freebsd-acpi@freebsd.org" <freebsd-acpi@freebsd.org>
Subject:   [rethinking] Re: [CFT] Sparse Cstate Support -- Its possible, that I don't know what I'm doing.
Message-ID:  <1341263872.3342.39.camel@powernoodle.corp.yahoo.com>
In-Reply-To: <4FF1D7A7.6000906@FreeBSD.org>
References:  <1340756603.2898.12.camel@powernoodle.corp.yahoo.com> <4FF1D7A7.6000906@FreeBSD.org>

next in thread | previous in thread | raw e-mail | index | archive | help

> > 
> > http://people.freebsd.org/~sbruno/acpi_cpu_cstate_sparse.txt
> 
> Some sort of a review...
> 
> First, going over what seems to be cosmetic changes in the patch.
> 
> The change to etc/rc.d/power_profile seems to be a NOP, is it needed?
> 
Ugh, well ... this was *supposed* to be a change to cx_lowest, not
dev.cpu.0.freq.  So, let me change that.  +1 pointyhat to me.

> New comment in the first hunk of acpi_cpu.c seems to be redundant and its
> placement violates style(9).

Right.  *sigh* I'll adjust that.

> 
> What do we achieve by renaming cpu_cx_lowest static file-scope variable to
> global_lowest_cstate?  I am not necessarily against the change, but it  needs to
> be justified and it should likely be made in a separate commit.  Besides it breaks
> an apparent (and perhaps useless) naming convention of prefixing everything with
> "cpu_".
> 
Nothing gained other than my sanity.  I suspect what I really want to
do, is change the naming of all the global override values for ease of
reading, but for now this is trivial and I'll revert it.

> Other comments that you seem to have added as notes to self while working on the
> code - they are good, but better be done in a separate comments-only commit.
> 
> Now to the essence of the patch.
> 
> acpi_cpu_notify hunk - I am not sure if I completely follow the logic of the
> current code in that function.  For some reason that I can not grasp it doesn't
> update per-CPU cpu_cx_lowest if it is (was) equal or greater than the global
> cpu_cx_lowest limit.  Otherwise the per-CPU limit is set to the shallower of the
> global limit and the deepest available state.

If I'm reading the code correctly, I'm assuming that each CPU will be
notified via the ACPI handler here (denoted by the ACPI_SERIAL_BEGIN)
and update its own cpu_cx_lowest via acpi_cpu_cx_cst/list()

If the *new* lowest is greater than the global value, then we assume
that the *old* global lowest is still valid and move on.  I don't
*think* that this is a problem currently.  However if the per cpu list
of cstates no longer has a value that corresponds to cpu_cx_lowest, I
think that this *could* be a problem.

> The patch seems to do something that make less sense in the latter case, it tries
> to set per-CPU cpu_cx_lowest based on a new C-state that resides at old per-CPU
> cpu_cx_lowest index.  Essentially this means that (1) cpu_cx_lowest maybe point to
> a junk portion of cpu_cx_states after C-states change; (2) whatever the change, if
> we do not run into (1), then cpu_cx_lowest value won't change and the lowest
> C-state would be whichever happens to be at that index.

Ok, so the current method of using an index will always work to do
something valid as opposed to my proposal which has the potential to go
into a garbage part of the array and use values that are meaningless.  I
suspect that if I went from AC adapter to battery I would see some odd
behavior on my laptop if I was to put the proper debug flags into place
as the notify handler messed with the cpu_cx_states[] array per cpu.
*sigh* ... back to the drawing board. 


> 
> About change in acpi_cpu_set_cx_lowest - I am not sure that it should fail if the
> exact match can not be found.  Maybe cx_lowest should be set to the best match in
> that case?  Where best match is the deepest state that is not deeper than the
> requested state.

I would prefer it to error in that case.  I personally wouldn't want the
code to do something "smart" in this case on my servers.  I'd rather
know that the Cstate isn't supported and look to see why.  Most users
aren't tweaking this directly, but it is being adjusted by
devd/power_profile, so it should *know* the correct Cx state.

> 
> Also, the ACPI spec permits multiple _CST entries with the same C-state type.  Not
> sure if this ever happens in practice but wouldn't rule it out.
> Previously they would be disambiguated by their indexes (perhaps incorrectly from
> ACPI point of view).  I think that with the proposed change we need to have some
> policy on which of e.g. four C3 states to select as _the_ C3 state.

This never occurred to me to be honest.  I had to go back and read chap
8.4.2 of the ACPI spec to see what they were up to.  I see no good
reason for this to exist, but it is a possible configuration according
to 8.4.2.1 ... This pains me a bit. :-)

So, with that being said.  This pretty much kills my entire
implementation.  :-)  I didn't expect there to be multiple _CST entries
for a single Cx state.  Amusing.

This also means that the control of the cx_lowest value is a bit
trickier.  If the user asks for C3, which one do we give them?  :-)

Also, I'm not clear that the code as it exists can handle this case
gracefully.  I'll ponder this one a bit.  It would be good to find a box
that does this to see how it behaves.

> 
> acpi_cpu_global_cx_lowest_sysctl change - unlike the current code where
> acpi_cpu_set_cx_lowest never fails, the new code may change cx_lowest on some CPUs
> and leave other CPUs un-updated in the case of an error.

I thought of this when writing the patch.  I'm not clear why one cpu
would have a different cx_lowest than another.  If you have an idea of a
use case, can you elaborate a bit?

> 
> Also, it seems that with this patch cpu_cx_count becomes a write-only variable.
> Maybe it should be removed or maybe it could be re-purposed/replaced with a
> deepest Cx type available.  So that the sysctl handlers could check against this
> value like they did before (instead of MAX_CX_STATES).

I can think through this a bit more.  I put the check against
MAX_CS_STATES as a place holder for the removal of the "val >
cpu_cx_count -1" check in acpi_cpu_global_cx_lowest_sysctl() as I wanted
to validate that I could indeed set cpu_cx_lowest = val before asserting
that it had been done.

As far as it being write-only, its still used in other places in the
code for comparison and decisions.  

> 
> I think that the patch is not ready yet to be committed.
> Perhaps we need to discuss some things before the next review request.
> E.g. how the following three should interact: manually set global Cx limit,
> manually set Cx level for an individual CPU (thus overriding global limit) and
> ACPI platform change of available Cx levels.
> 
Agreed.  We shall have to discuss further and see where this leads us.
Its starting to look like some amount of further overhaul may be
required.

> P.S.
> A diff produced with svn diff -x -p is much more convenient to review than plain
> svn diff.
> 

right, duh.  I'll do this in the future once I figure out how to
proceed.

Sean




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?1341263872.3342.39.camel>