Message-ID: <ed9015a3-42b5-4c0e-af6f-2b4d65c34cd5@arm.com>
Date: Thu, 8 Jan 2026 17:46:40 +0100
From: Pierre Gondois <pierre.gondois@....com>
To: Sumit Gupta <sumitg@...dia.com>, rafael@...nel.org,
viresh.kumar@...aro.org, zhenglifeng1@...wei.com
Cc: linux-tegra@...r.kernel.org, linux-pm@...r.kernel.org, ray.huang@....com,
corbet@....net, robert.moore@...el.com, lenb@...nel.org,
acpica-devel@...ts.linux.dev, mario.limonciello@....com,
rdunlap@...radead.org, linux-kernel@...r.kernel.org, gautham.shenoy@....com,
zhanjie9@...ilicon.com, ionela.voinescu@....com, perry.yuan@....com,
linux-doc@...r.kernel.org, linux-acpi@...r.kernel.org, treding@...dia.com,
jonathanh@...dia.com, vsethi@...dia.com, ksitaraman@...dia.com,
sanjayc@...dia.com, nhartman@...dia.com, bbasu@...dia.com
Subject: Re: [PATCH v5 10/11] cpufreq: CPPC: make scaling_min/max_freq
read-only when auto_sel enabled
Hello Sumit, Lifeng,
On 12/23/25 13:13, Sumit Gupta wrote:
> When autonomous selection (auto_sel) is enabled, the hardware controls
> performance within min_perf/max_perf register bounds making the
> scaling_min/max_freq effectively read-only.
If auto_sel is set, the governor associated with the policy will have no
actual control.
E.g.:
If the schedutil governor is used, attempts to set the
frequency based on CPU utilization will be periodically
sent, but they will have no effect.
The same thing will happen for the ondemand, performance,
powersave, userspace, etc. governors. They can only work if
frequency requests are taken into account.
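To make this concrete, here is a minimal userspace sketch (not kernel code).
The struct and function names are hypothetical, and treating the governor
request as fully ignored under auto_sel is a simplification of what real
firmware may do:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one CPPC-controlled CPU; not a kernel structure. */
struct model_cpu {
	bool auto_sel;          /* autonomous selection enabled */
	unsigned int min_perf;  /* lower bound handed to the hardware */
	unsigned int max_perf;  /* upper bound handed to the hardware */
	unsigned int hw_choice; /* level the hardware would pick on its own */
};

static unsigned int clamp_perf(unsigned int v, unsigned int lo, unsigned int hi)
{
	assert(lo <= hi);
	if (v < lo)
		return lo;
	if (v > hi)
		return hi;
	return v;
}

/*
 * Return the performance level delivered for a governor request.
 * Assumption: with auto_sel set, the request is ignored and only the
 * min_perf/max_perf bounds constrain the hardware's own choice.
 */
static unsigned int delivered_perf(const struct model_cpu *cpu,
				   unsigned int desired_perf)
{
	if (cpu->auto_sel)
		return clamp_perf(cpu->hw_choice, cpu->min_perf, cpu->max_perf);

	return clamp_perf(desired_perf, cpu->min_perf, cpu->max_perf);
}
```

In this model, with auto_sel set, delivered_perf() returns the same value
whatever desired_perf the governor passes; only changes to min_perf/max_perf
can move the result.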
------------
This looks like the intel_pstate governor handling, where it is possible
not to have a .target() or .target_index() callback and the hardware is in
charge (IIUC).
For this case, only two governors seem to be available: performance and
powersave.
------------
In our case, I think it is desired to unload the scaling governor currently
in use if auto_sel is selected. Letting the rest of the system think it has
control over the frequency selection seems incorrect.
I am not sure what to replace it with:
- There are no specific performance/powersave modes for CPPC.
  There is a range of values from 0 to 255.
- A firmware auto-selection governor could be created just for this case.
  Being able to switch between OS-driven and firmware-driven frequency
  selection is not specific to CPPC (for the future).
  However, I cannot really assess the implications of doing that.
------------
I think it would be better to split your patchset in two:
1. adding APIs for the CPPC spec.
2. using the APIs, especially for auto_sel
Part 1 is likely to be straightforward, as the APIs will still be used
by the driver at some point.
Part 2 is likely to bring more discussion.
> Enforce this by setting policy limits to min/max_perf bounds in
> cppc_verify_policy(). Users must use min_perf/max_perf sysfs interfaces
> to change performance limits in autonomous mode.
>
> Signed-off-by: Sumit Gupta <sumitg@...dia.com>
> ---
> drivers/cpufreq/cppc_cpufreq.c | 32 +++++++++++++++++++++++++++++++-
> 1 file changed, 31 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
> index b1f570d6de34..b3da263c18b0 100644
> --- a/drivers/cpufreq/cppc_cpufreq.c
> +++ b/drivers/cpufreq/cppc_cpufreq.c
> @@ -305,7 +305,37 @@ static unsigned int cppc_cpufreq_fast_switch(struct cpufreq_policy *policy,
>
> static int cppc_verify_policy(struct cpufreq_policy_data *policy)
> {
> - cpufreq_verify_within_cpu_limits(policy);
> + unsigned int min_freq = policy->cpuinfo.min_freq;
> + unsigned int max_freq = policy->cpuinfo.max_freq;
> + struct cpufreq_policy *cpu_policy;
> + struct cppc_cpudata *cpu_data;
> + struct cppc_perf_caps *caps;
> +
> + cpu_policy = cpufreq_cpu_get(policy->cpu);
> + if (!cpu_policy)
> + return -ENODEV;
> +
> + cpu_data = cpu_policy->driver_data;
> + caps = &cpu_data->perf_caps;
> +
> + if (cpu_data->perf_ctrls.auto_sel) {
> + u32 min_perf, max_perf;
> +
> + /*
> + * Set policy limits to HW min/max_perf bounds. In autonomous
> + * mode, scaling_min/max_freq is effectively read-only.
> + */
> + min_perf = cpu_data->perf_ctrls.min_perf ?:
> + caps->lowest_nonlinear_perf;
> + max_perf = cpu_data->perf_ctrls.max_perf ?: caps->nominal_perf;
> +
> + policy->min = cppc_perf_to_khz(caps, min_perf);
> + policy->max = cppc_perf_to_khz(caps, max_perf);
The policy->min/max values are overwritten here, but I think the firmware
will ignore the governor that is supposed to use them to select the most
fitting frequency.
> + } else {
> + cpufreq_verify_within_limits(policy, min_freq, max_freq);
> + }
> +
> + cpufreq_cpu_put(cpu_policy);
> return 0;
> }
>
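For reference, the auto_sel branch quoted above can be modelled in userspace
as follows. This is only a sketch: the model_* types are hypothetical
stand-ins for the kernel structures, the perf-to-kHz mapping is assumed to be
linear through the nominal point (the real cppc_perf_to_khz() is more
involved), and the non-auto_sel clamp is simplified.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel structures; only the used fields. */
struct model_caps {
	unsigned int lowest_nonlinear_perf;
	unsigned int nominal_perf;
	unsigned int nominal_freq_khz;
};

struct model_ctrls {
	bool auto_sel;
	unsigned int min_perf; /* 0 is treated as "not set", as in the patch */
	unsigned int max_perf; /* 0 is treated as "not set", as in the patch */
};

struct model_policy {
	unsigned int min; /* kHz */
	unsigned int max; /* kHz */
};

/* Assumed linear mapping; not the kernel's conversion. */
static unsigned int model_perf_to_khz(const struct model_caps *caps,
				      unsigned int perf)
{
	assert(caps->nominal_perf != 0);
	return (unsigned int)((unsigned long long)perf *
			      caps->nominal_freq_khz / caps->nominal_perf);
}

static void model_verify(struct model_policy *policy,
			 const struct model_caps *caps,
			 const struct model_ctrls *ctrls,
			 unsigned int cpuinfo_min, unsigned int cpuinfo_max)
{
	if (ctrls->auto_sel) {
		/* Same fallbacks as the quoted patch. */
		unsigned int min_perf = ctrls->min_perf ?
			ctrls->min_perf : caps->lowest_nonlinear_perf;
		unsigned int max_perf = ctrls->max_perf ?
			ctrls->max_perf : caps->nominal_perf;

		/* Whatever the caller requested is overwritten. */
		policy->min = model_perf_to_khz(caps, min_perf);
		policy->max = model_perf_to_khz(caps, max_perf);
		return;
	}

	/* Simplified clamp of the requested range to the CPU limits. */
	if (policy->min < cpuinfo_min)
		policy->min = cpuinfo_min;
	if (policy->max > cpuinfo_max)
		policy->max = cpuinfo_max;
	if (policy->min > policy->max)
		policy->min = policy->max;
}
```

With auto_sel set, any policy->min/max passed in is replaced by the values
derived from min_perf/max_perf, which is why a scaling_min/max_freq write
would appear to succeed without changing anything.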