Message-ID: <0ce0acbc-99da-925e-145d-3c80558be761@hisilicon.com>
Date: Mon, 4 Aug 2025 14:31:16 +0800
From: Jie Zhan <zhanjie9@...ilicon.com>
To: Beata Michalska <beata.michalska@....com>
CC: Bowen Yu <yubowen8@...wei.com>, <rafael@...nel.org>,
<viresh.kumar@...aro.org>, <linux-pm@...r.kernel.org>,
<linux-kernel@...r.kernel.org>, <linuxarm@...wei.com>,
<jonathan.cameron@...wei.com>, <lihuisong@...wei.com>,
<zhenglifeng1@...wei.com>
Subject: Re: [PATCH 2/2] cpufreq: CPPC: Fix error handling in cppc_scale_freq_workfn()
On 31/07/2025 17:42, Beata Michalska wrote:
> On Thu, Jul 31, 2025 at 04:52:05PM +0800, Jie Zhan wrote:
>>
>>
>> On 31/07/2025 16:19, Beata Michalska wrote:
>>> Hi Bowen, Jie
>>> On Wed, Jul 30, 2025 at 11:23:12AM +0800, Bowen Yu wrote:
>>>> From: Jie Zhan <zhanjie9@...ilicon.com>
>>>>
>>>> Perf counters could be 0 if the cpu is in a low-power idle state. Just try
>>>> it again next time and update the frequency scale when the cpu is active
>>>> and perf counters successfully return.
>>>>
>>>> Also, remove the FIE source on an actual failure.
>>>>
>>>> Signed-off-by: Jie Zhan <zhanjie9@...ilicon.com>
>>>> ---
>>>> drivers/cpufreq/cppc_cpufreq.c | 13 ++++++++++++-
>>>> 1 file changed, 12 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
>>>> index 904006027df2..e95844d3d366 100644
>>>> --- a/drivers/cpufreq/cppc_cpufreq.c
>>>> +++ b/drivers/cpufreq/cppc_cpufreq.c
>>>> @@ -78,12 +78,23 @@ static void cppc_scale_freq_workfn(struct kthread_work *work)
>>>>  	struct cppc_cpudata *cpu_data;
>>>>  	unsigned long local_freq_scale;
>>>>  	u64 perf;
>>>> +	int ret;
>>>>
>>>>  	cppc_fi = container_of(work, struct cppc_freq_invariance, work);
>>>>  	cpu_data = cppc_fi->cpu_data;
>>>>
>>>> -	if (cppc_get_perf_ctrs(cppc_fi->cpu, &fb_ctrs)) {
>>>> +	ret = cppc_get_perf_ctrs(cppc_fi->cpu, &fb_ctrs);
>>>> +	/*
>>>> +	 * Perf counters could be 0 if the cpu is in a low-power idle state.
>>>> +	 * Just try it again next time.
>>>> +	 */
>>>> +	if (ret == -EFAULT)
>>>> +		return;
>>> Which counters are we actually talking about here ?
>>
>> Delivered performance counter and reference performance counter.
>> They are actually AMU CPU_CYCLES and CNT_CYCLES event counters.
> That does track then.
>>
>>>> +
>>>> +	if (ret) {
>>>>  		pr_warn("%s: failed to read perf counters\n", __func__);
>>>> +		topology_clear_scale_freq_source(SCALE_FREQ_SOURCE_CPPC,
>>>> +						 cpu_data->shared_cpu_map);
>>>>  		return;
>>>>  	}
>>> And the real error here would be ... ?
>>> That makes me wonder why this has been registered as the source of the freq
>>> scale in the first place if we are to hit some serious issue. Would you be able
>>> to give an example of any?
>> If it gets here, that would be -ENODEV or -EIO from cppc_get_perf_ctrs(),
>> which could possibly come from data corruption (no CPC descriptor) or a PCC
>> failure.
>>
>> I can't easily fake an error here, but the above -EFAULT path could
>> happen when it luckily passes the FIE init.
>>
> The change seems reasonable. Though I am wondering if some other errors might be
> rather transient as well ? Like -EIO ?
> Note, I'm not an expert here.
The -EIO from PCC covers many more error cases than this, so it can't
simply be treated as transient.
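
To make the split discussed above concrete, here is a minimal user-space
sketch of how the patch treats the return value of cppc_get_perf_ctrs().
The enum and the helper name are hypothetical and exist only for
illustration; the kernel code keeps this logic inline in
cppc_scale_freq_workfn().

```c
#include <errno.h>

/* Hypothetical names, not part of the patch or of any kernel API. */
enum fie_action {
	FIE_UPDATE_SCALE,	/* counters read fine, update the freq scale */
	FIE_RETRY_NEXT_TICK,	/* transient: CPU likely idle, try again later */
	FIE_REMOVE_SOURCE,	/* real failure: stop using CPPC as FIE source */
};

/* Map a cppc_get_perf_ctrs()-style return code to the action taken. */
static enum fie_action fie_classify_ctr_read(int ret)
{
	if (ret == 0)
		return FIE_UPDATE_SCALE;

	/* Counters may read as 0 in a low-power idle state (-EFAULT). */
	if (ret == -EFAULT)
		return FIE_RETRY_NEXT_TICK;

	/* Anything else, e.g. -ENODEV or -EIO, is handled as a failure. */
	return FIE_REMOVE_SOURCE;
}
```

With this mapping only -EFAULT is retried. -EIO falls into the failure
branch, which matches the patch as posted.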