Message-ID: <5861970.DvuYhMxLoT@rjwysocki.net>
Date: Fri, 29 Nov 2024 16:55:12 +0100
From: "Rafael J. Wysocki" <rjw@...ysocki.net>
To: Linux PM <linux-pm@...r.kernel.org>
Cc: LKML <linux-kernel@...r.kernel.org>, Lukasz Luba <lukasz.luba@....com>,
Peter Zijlstra <peterz@...radead.org>,
Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Morten Rasmussen <morten.rasmussen@....com>,
Vincent Guittot <vincent.guittot@...aro.org>,
Ricardo Neri <ricardo.neri-calderon@...ux.intel.com>,
Pierre Gondois <pierre.gondois@....com>
Subject: [RFC][PATCH v021 0/9] cpufreq: intel_pstate: Enable EAS on hybrid platforms without SMT
Hi Everyone,
This is a new iteration of the "EAS for intel_pstate" work:
https://lore.kernel.org/linux-pm/3607404.iIbC2pHGDl@rjwysocki.net/
It contains a few new patches and almost all of the patches sent previously
have been updated.
The following paragraph from the original cover letter still applies:
"The underlying observation is that on the platforms targeted by these changes,
Lunar Lake at the time of this writing, the "small" CPUs (E-cores), when run at
the same performance level, are always more energy-efficient than the "big" or
"performance" CPUs (P-cores). This means that, regardless of the scale-
invariant utilization of a task, as long as there is enough spare capacity on
E-cores, the relative cost of running it there is always lower."
Thus the idea is still to register a perf domain per CPU type, but this time
there may be more than just two of them because of the first patch.
The states table in each of these perf domains is still one-element and that
element only contains the cost value, but this time the costs are computed
and not prescribed (see the last patch). Nevertheless, the expected effect
is still that the perf domains (or CPU types) with lower cost values will
be preferred so long as there is enough spare capacity in them.
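As an illustration only (this is not code from the series, and it is not how the scheduler's EAS path actually computes energy), the expected effect described above can be sketched in C. The struct and function names below are hypothetical stand-ins, not the kernel's EM data structures: each domain carries a single cost value, and the lowest-cost domain with enough spare capacity for the task wins.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical, simplified stand-in for a perf domain with a
 * one-element states table. Not the kernel's struct em_perf_domain.
 */
struct sketch_perf_domain {
	unsigned long cost;     /* the single cost value of the domain */
	unsigned long capacity; /* assumed total capacity of its CPUs */
	unsigned long util;     /* assumed current utilization */
};

/*
 * Return the index of the lowest-cost domain that has enough spare
 * capacity for task_util, or -1 if the task fits nowhere.
 */
static int sketch_pick_domain(const struct sketch_perf_domain *pd,
			      size_t nr, unsigned long task_util)
{
	int best = -1;
	size_t i;

	assert(pd != NULL || nr == 0);

	for (i = 0; i < nr; i++) {
		unsigned long spare = 0;

		if (pd[i].capacity > pd[i].util)
			spare = pd[i].capacity - pd[i].util;

		/* Skip domains without enough spare capacity. */
		if (spare < task_util)
			continue;

		/* Among the remaining ones, prefer the lower cost. */
		if (best < 0 || pd[i].cost < pd[best].cost)
			best = (int)i;
	}

	return best;
}
```

With two such domains, a cheap one that is nearly full and a more expensive one that is mostly idle, a small task would land in the cheap domain and a larger one would spill over to the expensive domain.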
The first two patches are not really RFC, but they are included here because
patches [8-9/9] depend on patch [1/9]. They will be resent next week as
non-RFC 6.14-candidate material.
The difference made by them is significant because it is now not known in
advance how many CPU types will be there and the cost values for each of
them cannot be prescribed.
Patch [3/9] is also a change that I'd like to make regardless of what
happens to the rest of the series because it effectively moves EM code
from the schedutil governor to EM where it belongs. Of course, it is also
depended on by patch [9/9].
Patch [4/9] differs from its previous version,
https://lore.kernel.org/linux-pm/1889415.atdPhlSkOF@rjwysocki.net/
because gov is NULL not only when it is not used at all, but also during the
cpufreq policy init and exit, so the check in the patch had to be adjusted
to match the former case only. [As a side note, I don't think that the code
modified by patch [4/9] belongs to sched/topology as it messes around with the
cpufreq internals. At least, it should be moved to cpufreq and called by
sched_is_eas_possible(), but I'm also not convinced that it is necessary
at all. This is not directly related to the $subject series, though.]
Patch [5/9] adds a new function needed by patch [9/9] and it is the same as
its previous version:
https://lore.kernel.org/linux-pm/2223963.Mh6RI2rZIc@rjwysocki.net/
Patch [6/9] is almost the same as its previous version:
https://lore.kernel.org/linux-pm/1821040.VLH7GnMWUR@rjwysocki.net/
but its changelog has been expanded a bit as suggested by Dietmar. It
simply rearranges the EM code without changing its functionality, so the
next patch looks more straightforward.
Patch [7/9] is a somewhat updated counterpart of
https://lore.kernel.org/linux-pm/2017201.usQuhbGJ8B@rjwysocki.net/
It still changes the EM code to allow a perf domain with a one-element states
table to be registered without providing the .active_power() callback (which
is then done in the last patch), but it is somewhat simpler. It also
contains some discussion regarding the requirement that the capacity of
all CPUs in a perf domain must be the same. In a short summary, I'm not
convinced that it is actually valid.
Patches [8-9/9] modify intel_pstate. The first one is preparatory, but it
is useful for explaining the basic concept, which is "hybrid domains" that
each contain CPUs of the same type.
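To make the "hybrid domains" concept a bit more concrete, here is a minimal sketch of grouping CPUs by type when the number of types is not known in advance. It is an illustration under stated assumptions, not the intel_pstate implementation: the integer CPU type values, the SKETCH_MAX_DOMAINS limit and the function name are all hypothetical, and how the driver actually identifies a CPU's type is not covered here.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_DOMAINS 8	/* hypothetical upper bound */

/*
 * For each CPU, store in domain_of[cpu] the index of the domain
 * holding all CPUs of the same (hypothetical) type value.
 * Returns the number of domains created, or -1 if there are more
 * distinct types than SKETCH_MAX_DOMAINS.
 */
static int sketch_build_hybrid_domains(const int *cpu_type, size_t nr_cpus,
				       int *domain_of)
{
	int domain_type[SKETCH_MAX_DOMAINS];
	int nr_domains = 0;
	size_t cpu;

	assert(nr_cpus == 0 || (cpu_type != NULL && domain_of != NULL));

	for (cpu = 0; cpu < nr_cpus; cpu++) {
		int d;

		/* Look for an existing domain of this CPU's type. */
		for (d = 0; d < nr_domains; d++) {
			if (domain_type[d] == cpu_type[cpu])
				break;
		}

		/* None found: start a new domain for this type. */
		if (d == nr_domains) {
			if (nr_domains == SKETCH_MAX_DOMAINS)
				return -1;
			domain_type[nr_domains++] = cpu_type[cpu];
		}

		domain_of[cpu] = d;
	}

	return nr_domains;
}
```

In this sketch, domains are numbered in the order in which their types are first encountered, so no assumption is made about there being exactly two CPU types.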
The last patch is just the registration of EM perf domains (one for each hybrid
domain), expanding them when needed and rebuilding sched domains in some corner
cases. It also contains some discussion that doesn't technically belong to the
changelog, but is useful for explaining the background for some decisions.
Please refer to the individual patch changelogs for details.
For easier access, the series is available on the experimental/intel_pstate
branch in linux-pm.git:
https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git/log/?h=experimental/intel_pstate
Thanks!