[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <1a2059cf-b12c-4e47-b52e-e5c797a0506c@huawei.com>
Date: Sat, 20 Dec 2025 18:29:01 +0800
From: "zhenglifeng (A)" <zhenglifeng1@...wei.com>
To: Pierre Gondois <pierre.gondois@....com>
CC: <linux-kernel@...r.kernel.org>, Christian Loehle
<christian.loehle@....com>, Ionela Voinescu <ionela.voinescu@....com>, Jie
Zhan <zhanjie9@...ilicon.com>, Huang Rui <ray.huang@....com>, "Gautham R.
Shenoy" <gautham.shenoy@....com>, Mario Limonciello
<mario.limonciello@....com>, Perry Yuan <perry.yuan@....com>, "Rafael J.
Wysocki" <rafael@...nel.org>, Viresh Kumar <viresh.kumar@...aro.org>,
<linux-pm@...r.kernel.org>
Subject: Re: [PATCH v2 3/3] cpufreq: Update set_boost callbacks to rely on
boost_freq_req
On 2025/12/18 0:22, Pierre Gondois wrote:
> Hello Lifeng,
>
> Thanks for the review.
> I wrote a bit of text, but IIUC you already agree with what I describe.
> This might be more to be sure of what I want to do.
>
> If you disagree with something, please let me know.
>
> On 12/10/25 10:26, zhenglifeng (A) wrote:
>> On 2025/12/8 18:59, Pierre Gondois wrote:
>>> In the existing set_boost() callbacks:
>>> - Don't update policy->max as this is done through the qos notifier
>>> cpufreq_notifier_max() which calls cpufreq_set_policy().
>>> - Remove freq_qos_update_request() calls as the qos request is now
>>> done in policy_set_boost() and updates the new boost_freq_req
>>>
>>> Signed-off-by: Pierre Gondois <pierre.gondois@....com>
>>> ---
>>> drivers/cpufreq/amd-pstate.c | 2 --
>>> drivers/cpufreq/cppc_cpufreq.c | 21 ++++-----------------
>>> drivers/cpufreq/cpufreq.c | 14 ++------------
>>> 3 files changed, 6 insertions(+), 31 deletions(-)
>>>
>>> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
>>> index b44f0f7a5ba1c..50416358a96ac 100644
>>> --- a/drivers/cpufreq/amd-pstate.c
>>> +++ b/drivers/cpufreq/amd-pstate.c
>>> @@ -754,8 +754,6 @@ static int amd_pstate_cpu_boost_update(struct cpufreq_policy *policy, bool on)
>>> else if (policy->cpuinfo.max_freq > nominal_freq)
>>> policy->cpuinfo.max_freq = nominal_freq;
>>> - policy->max = policy->cpuinfo.max_freq;
>>> -
>>> if (cppc_state == AMD_PSTATE_PASSIVE) {
>>> ret = freq_qos_update_request(&cpudata->req[1], policy->cpuinfo.max_freq);
>>> if (ret < 0)
>>> diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
>>> index e23d9abea1359..3baf7baaec371 100644
>>> --- a/drivers/cpufreq/cppc_cpufreq.c
>>> +++ b/drivers/cpufreq/cppc_cpufreq.c
>>> @@ -597,21 +597,14 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
>>> caps = &cpu_data->perf_caps;
>>> policy->driver_data = cpu_data;
>>> - /*
>>> - * Set min to lowest nonlinear perf to avoid any efficiency penalty (see
>>> - * Section 8.4.7.1.1.5 of ACPI 6.1 spec)
>>> - */
>>> - policy->min = cppc_perf_to_khz(caps, caps->lowest_nonlinear_perf);
>>> - policy->max = cppc_perf_to_khz(caps, policy->boost_enabled ?
>>> - caps->highest_perf : caps->nominal_perf);
>> Why remove this?
>
> This is partly a mistake.
> As you suggested below (I think), policy->max should not be set directly.
> It might be better to set the boost_freq_req for all cpufreq drivers, which should
> result in setting policy->max.
>
>>
>>> -
>>> /*
>>> * Set cpuinfo.min_freq to Lowest to make the full range of performance
>>> * available if userspace wants to use any perf between lowest & lowest
>>> * nonlinear perf
>>> */
>>> policy->cpuinfo.min_freq = cppc_perf_to_khz(caps, caps->lowest_perf);
>>> - policy->cpuinfo.max_freq = policy->max;
>>> + policy->cpuinfo.max_freq = cppc_perf_to_khz(caps, policy->boost_enabled ?
>>> + caps->highest_perf : caps->nominal_perf);
>>> policy->transition_delay_us = cppc_cpufreq_get_transition_delay_us(cpu);
>>> policy->shared_type = cpu_data->shared_type;
>>> @@ -776,17 +769,11 @@ static int cppc_cpufreq_set_boost(struct cpufreq_policy *policy, int state)
>>> {
>>> struct cppc_cpudata *cpu_data = policy->driver_data;
>>> struct cppc_perf_caps *caps = &cpu_data->perf_caps;
>>> - int ret;
>>> if (state)
>>> - policy->max = cppc_perf_to_khz(caps, caps->highest_perf);
>>> + policy->cpuinfo.max_freq = cppc_perf_to_khz(caps, caps->highest_perf);
>>> else
>>> - policy->max = cppc_perf_to_khz(caps, caps->nominal_perf);
>>> - policy->cpuinfo.max_freq = policy->max;
>>> -
>>> - ret = freq_qos_update_request(policy->max_freq_req, policy->max);
>>> - if (ret < 0)
>>> - return ret;
>>> + policy->cpuinfo.max_freq = cppc_perf_to_khz(caps, caps->nominal_perf);
>>> return 0;
>>> }
>>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
>>> index 65ef0fa70c388..ab2def9e4d188 100644
>>> --- a/drivers/cpufreq/cpufreq.c
>>> +++ b/drivers/cpufreq/cpufreq.c
>>> @@ -1514,10 +1514,6 @@ static int cpufreq_policy_online(struct cpufreq_policy *policy,
>>> blocking_notifier_call_chain(&cpufreq_policy_notifier_list,
>>> CPUFREQ_CREATE_POLICY, policy);
>>> - } else {
>>> - ret = freq_qos_update_request(policy->max_freq_req, policy->max);
>>> - if (ret < 0)
>>> - goto out_destroy_policy;
>> I think boost_freq_req should be updated here, to solve the problem that
>> this code originally intended to solve.
>
> Yes right indeed.
>
>>> }
>>> if (cpufreq_driver->get && has_target()) {
>>> @@ -2819,16 +2815,10 @@ int cpufreq_boost_set_sw(struct cpufreq_policy *policy, int state)
>>> return -ENXIO;
>>> ret = cpufreq_frequency_table_cpuinfo(policy);
>> cpufreq_frequency_table_cpuinfo() may change policy->max. I believe this
>> isn't what you want.
>
>
> cpufreq_frequency_table_cpuinfo() can effectively update
> policy->cpuinfo.max_freq, but directly setting policy->max should be wrong
> as it bypasses the other QoS constraints on the maximal frequency.
>
> Updates to policy->max should go through the following call chain
> to be sure all constraints/notifiers are respected/called.
> freq_qos_update_request()
> \-freq_qos_apply()
> \-pm_qos_update_target()
> \-blocking_notifier_call_chain()
> \-cpufreq_notifier_max()
> \-handle_update()
> \-refresh_frequency_limits()
> \-cpufreq_set_policy()
>
> FYIU, we should have:
> - max_freq_req: the maximal frequency constraint as set by the user.
> It is updated whenever the user write to scaling_max_freq.
> - boost_freq_req: the maximal frequency constraint as set by the
> driver. It is updated whenever boost is enabled/disabled.
> - policy->cpuinfo.max_freq: the maximal frequency reachable by the driver.
> This value is used in cpufreq at various places to check frequencies
> are within valid boundaries.
> - policy->max: the maximal frequency cpufreq can use. It is a resultant
> of all the QoS constraints received (from the user, boost, thermal).
> It should be updated whenever one of the QoS constraint is updated.
> It should never be set directly to avoid bypassing the QoS constraints.
>
> Whenever a cpufreq driver is initialized, policy->max is set, but the
> value is overridden whenever the user writes to scaling_max_freq.
> Thus we might think it should be replaced with a max_freq_req constraint.
>
> However if boost is enabled, the maximal frequency will be limited by
> max_freq_req. So at init, cpufreq drivers should set boost_freq_req
> instead (to policy->cpuinfo.max_freql).
> That way, if boost is enabled, the maximal frequency available is the
> boost frequency.
>
> ------
>
> Summary:
> -
> policy->max should never be set directly. It should only be set through
> cpufreq_set_policy(). cpufreq_set_policy() might be called indirectly
> after updating a QoS constraint using freq_qos_update_request().
>
> -
> boost_freq_req should be set for all cpufreq drivers, with a default value
> of policy->cpuinfo.max_freq. This represents the maximal frequency available
> with/without boost.
> Note: the name "boost_freq_req" might not be well chosen.
>
> -
> Any update to policy->cpuinfo.max_freq should be followed by a call to
> freq_qos_update_request(policy->boost_freq_req).
> This will allow to update "policy->max" with the new boost frequency.
Yes. I agree. So the source of the problem is that max_freq_req includes
both user-defined limits and driver-defined limits. And now you try to
separate them. That's nice. Looking forward to the next version!
At the same time, I'm curious whether a similar problem would occur with
min_freq_req if a driver existed that could change cpuinfo.min_freq in
runtime (not quite certain whether such a driver exists).
>
>
>>> - if (ret) {
>>> + if (ret)
>>> pr_err("%s: Policy frequency update failed\n", __func__);
>>> - return ret;
>>> - }
>>> - ret = freq_qos_update_request(policy->max_freq_req, policy->max);
>>> - if (ret < 0)
>>> - return ret;
>>> -
>>> - return 0;
>>> + return ret;
>>> }
>>> EXPORT_SYMBOL_GPL(cpufreq_boost_set_sw);
>>>
>
Powered by blists - more mailing lists