Message-ID: <85664630200925cd75fba523adf5e78e295a3945.camel@suse.cz>
Date: Wed, 13 Oct 2021 18:23:11 +0200
From: Giovanni Gherdovich <ggherdovich@...e.cz>
To: Huang Rui <ray.huang@....com>,
"Rafael J . Wysocki" <rafael.j.wysocki@...el.com>,
Viresh Kumar <viresh.kumar@...aro.org>,
Shuah Khan <skhan@...uxfoundation.org>,
Borislav Petkov <bp@...e.de>,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...nel.org>, linux-pm@...r.kernel.org
Cc: Deepak Sharma <deepak.sharma@....com>,
Alex Deucher <alexander.deucher@....com>,
Mario Limonciello <mario.limonciello@....com>,
Nathan Fontenot <nathan.fontenot@....com>,
Jinzhou Su <Jinzhou.Su@....com>,
Xiaojian Du <Xiaojian.Du@....com>,
linux-kernel@...r.kernel.org, x86@...nel.org
Subject: Re: [PATCH v2 21/21] Documentation: amd-pstate: add amd-pstate
driver introduction
On Sun, 2021-09-26 at 17:06 +0800, Huang Rui wrote:
> Introduce the amd-pstate driver design and implementation.
>
> Signed-off-by: Huang Rui <ray.huang@....com>
> ---
> Documentation/admin-guide/pm/amd_pstate.rst | 377 ++++++++++++++++++
>
[... snip ...]
> +
> +AMD CPPC Performance Capability
> +--------------------------------
> +
> +Highest Performance (RO)
> +.........................
> +
> +It is the absolute maximum performance an individual processor may reach,
> +assuming ideal conditions. This performance level may not be sustainable
> +for long durations and may only be achievable if other platform components
> +are in a specific state; for example, it may require other processors be in
> +an idle state. This would be equivalent to the highest frequencies
> +supported by the processor.
> +
> +Nominal (Guaranteed) Performance (RO)
> +......................................
> +
> +It is the maximum sustained performance level of the processor, assuming
> +ideal operating conditions. In absence of an external constraint (power,
> +thermal, etc.) this is the performance level the processor is expected to
> +be able to maintain continuously. All cores/processors are expected to be
> +able to sustain their nominal performance state simultaneously.
> +
> +Lowest non-linear Performance (RO)
> +...................................
> +
> +It is the lowest performance level at which nonlinear power savings are
> +achieved, for example, due to the combined effects of voltage and frequency
> +scaling. Above this threshold, lower performance levels should be generally
> +more energy efficient than higher performance levels. This register
> +effectively conveys the most efficient performance level to ``amd-pstate``.
> +
> +Lowest Performance (RO)
> +........................
> +
> +It is the absolute lowest performance level of the processor. Selecting a
> +performance level lower than the lowest nonlinear performance level may
> +cause an efficiency penalty but should reduce the instantaneous power
> +consumption of the processor.
> +
The sections above describe the CPPC capabilities. All good so far. They're
Read Only, and each capability has a file in sysfs, so it makes sense to
describe them in this Documentation folder ("admin-guide"). But the following
section...
> +AMD CPPC Performance Control
> +------------------------------
> +
> +``amd-pstate`` passes performance goals through these registers. The
> +register drives the behavior of the desired performance target.
> +
> +Minimum requested performance (RW)
> +...................................
> +
> +``amd-pstate`` specifies the minimum allowed performance level.
> +
> +Maximum requested performance (RW)
> +...................................
> +
> +``amd-pstate`` specifies a limit the maximum performance that is expected
> +to be supplied by the hardware.
> +
> +Desired performance target (RW)
> +...................................
> +
> +``amd-pstate`` specifies a desired target in the CPPC performance scale as
> +a relative number. This can be expressed as percentage of nominal
> +performance (infrastructure max). Below the nominal sustained performance
> +level, desired performance expresses the average performance level of the
> +processor subject to hardware. Above the nominal performance level,
> +processor must provide at least nominal performance requested and go higher
> +if current operating conditions allow.
> +
> +Energy Performance Preference (EPP) (RW)
> +.........................................
> +
> +Provides a hint to the hardware if software wants to bias toward performance
> +(0x0) or energy efficiency (0xff).
The section above describes the CPPC "performance controls". They're marked
"Read/Write", but you don't expose them to the user via sysfs, am I right?

Do I understand correctly that with this driver, the AMD System Management
Unit (SMU -- is it the right name?) is *not* working in autonomous mode, but
is almost entirely under OS control?
By "autonomous mode" I mean: you run a workload, the driver doesn't select any
desired frequency, and the SMU picks the CPU clock frequency on its own.
That's not what's happening here, AFAIU. I tried amd-pstate with the
"userspace" governor (very useful for testing ;), and set frequencies like

    echo 1200000 > /sys/devices/system/cpu/cpufreq/policy11/scaling_setspeed

and then, whatever the load on CPU#11, "cpupower monitor" would show me a
constant clock of ~1.2GHz.
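
To picture why the clock stays pinned: with autonomous selection off, a
requested frequency has to be turned into one desired-performance value on the
abstract CPPC scale. Here is a minimal sketch, assuming a simple linear mapping
with rounding and clamping. The helper name khz_to_perf() and all the numbers
in the usage note are hypothetical and not taken from the patch; the real
driver may do this differently.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): map a target frequency in kHz
 * onto an abstract CPPC performance value, assuming perf scales linearly
 * with frequency and that highest_perf corresponds to max_khz.
 */
static uint8_t khz_to_perf(uint32_t target_khz, uint32_t max_khz,
                           uint8_t lowest_perf, uint8_t highest_perf)
{
    uint64_t perf;

    assert(max_khz > 0);
    assert(lowest_perf <= highest_perf);

    /* Linear scaling, rounded to the nearest integer. */
    perf = ((uint64_t)target_khz * highest_perf + max_khz / 2) / max_khz;

    /* Keep the result inside the range the hardware advertises. */
    if (perf < lowest_perf)
        perf = lowest_perf;
    if (perf > highest_perf)
        perf = highest_perf;

    return (uint8_t)perf;
}
```

With made-up capabilities (max_khz = 3600000, lowest_perf = 20,
highest_perf = 255), a 1200000 kHz request maps to a desired perf of 85, and
that value does not change with the load.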
Don't get me wrong, this is a very good driver! I'm super happy that the
kernel can finally see all the P-States, instead of just 3.
I'm just trying to clarify that we're using CPPC with autonomous selection
disabled, so I don't think the documentation in admin-guide should describe
features like the R/W "performance controls" that don't make sense in this
context. That goes especially for the "Energy Performance Preference (EPP)",
which you would use to tell the SMU "do what you want, just push a little on
the performance side".
I can see that the driver, internally, is sending "lowest nonlinear" as
minimum perf, 255 as maximum perf, and whatever the governor wants as desired
perf. It just isn't exposed in sysfs so there isn't much point in documenting
that.
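
A minimal sketch of the request described above: lowest nonlinear as minimum,
255 as maximum, and the governor's choice as desired perf. The one-byte field
layout (max in bits 7:0, min in bits 15:8, desired in bits 23:16) and the names
CPPC_*_PERF and build_cppc_request() are assumptions made for illustration;
they are not quoted from this message and should be checked against the patch
series.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field layout of the request value, one byte per field. */
#define CPPC_MAX_PERF(x) (((uint64_t)(x) & 0xff) << 0)
#define CPPC_MIN_PERF(x) (((uint64_t)(x) & 0xff) << 8)
#define CPPC_DES_PERF(x) (((uint64_t)(x) & 0xff) << 16)

/* Compile-time check that the assumed fields do not overlap. */
static_assert((CPPC_MAX_PERF(0xff) & CPPC_MIN_PERF(0xff)) == 0,
              "max and min fields overlap");
static_assert((CPPC_MIN_PERF(0xff) & CPPC_DES_PERF(0xff)) == 0,
              "min and desired fields overlap");

/*
 * Hypothetical helper: build a request with autonomous selection off.
 * min = lowest nonlinear perf, max = 255, desired = governor's choice.
 */
static uint64_t build_cppc_request(uint8_t lowest_nonlinear_perf,
                                   uint8_t desired_perf)
{
    return CPPC_MAX_PERF(255) |
           CPPC_MIN_PERF(lowest_nonlinear_perf) |
           CPPC_DES_PERF(desired_perf);
}
```

Only desired_perf changes from one governor decision to the next here; the
min and max fields are fixed by the driver rather than set through sysfs.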
> [...]
> Full MSR Support
> -----------------
>
> Some new Zen3 processors such as Cezanne provide the MSR registers directly
> while the :c:macro:`X86_FEATURE_AMD_CPPC_EXT` CPU feature flag is set.
> ``amd-pstate`` can handle the MSR register to implement the fast switch
> function in ``CPUFreq`` that can shrink latency of frequency control on the
> interrupt context.
A-ha! Cezanne. I have an EPYC Milan, so that's probably why I can't get the
"Full MSR Support". I'll test the "Shared Memory Support" then, and report my
data.
Thanks!
Giovanni