Message-ID: <4c76cbfc-2e7e-4937-a838-09c49440d3ad@arm.com>
Date: Mon, 26 Jan 2026 12:23:01 +0100
From: Pierre Gondois <pierre.gondois@....com>
To: Sumit Gupta <sumitg@...dia.com>, rafael@...nel.org,
viresh.kumar@...aro.org, zhenglifeng1@...wei.com, ionela.voinescu@....com,
lenb@...nel.org, robert.moore@...el.com, corbet@....net,
rdunlap@...radead.org, ray.huang@....com, gautham.shenoy@....com,
mario.limonciello@....com, perry.yuan@....com, zhanjie9@...ilicon.com,
linux-pm@...r.kernel.org, linux-acpi@...r.kernel.org,
linux-doc@...r.kernel.org, acpica-devel@...ts.linux.dev,
linux-kernel@...r.kernel.org
Cc: linux-tegra@...r.kernel.org, treding@...dia.com, jonathanh@...dia.com,
vsethi@...dia.com, ksitaraman@...dia.com, sanjayc@...dia.com,
nhartman@...dia.com, bbasu@...dia.com
Subject: Re: [PATCH v6 7/9] ACPI: CPPC: add APIs and sysfs interface for
perf_limited
On 1/24/26 22:04, Sumit Gupta wrote:
>
> On 22/01/26 17:21, Pierre Gondois wrote:
>>
>> On 1/20/26 15:56, Sumit Gupta wrote:
>>> Add sysfs interface to read/write the Performance Limited register.
>>>
>>> The Performance Limited register indicates to the OS that an
>>> unpredictable event (like thermal throttling) has limited processor
>>> performance. It contains two sticky bits set by the platform:
>>> - Bit 0 (Desired_Excursion): Set when delivered performance is
>>> constrained below desired performance. Not used when Autonomous
>>> Selection is enabled.
>>> - Bit 1 (Minimum_Excursion): Set when delivered performance is
>>> constrained below minimum performance.
>>>
>>> These bits remain set until OSPM explicitly clears them. The write
>>> operation accepts a bitmask of bits to clear:
>>> - Write 0x1 to clear bit 0
>>> - Write 0x2 to clear bit 1
>>> - Write 0x3 to clear both bits
>>>
>>> This enables users to detect if platform throttling impacted a workload.
>>> Users clear the register before execution, run the workload, then check
>>> afterward - if set, hardware throttling occurred during that time window.
>>>
>>> The interface is exposed as:
>>> /sys/devices/system/cpu/cpuX/cpufreq/perf_limited
>>>
>>> Signed-off-by: Sumit Gupta <sumitg@...dia.com>
>>> ---
>>> drivers/acpi/cppc_acpi.c | 56 ++++++++++++++++++++++++++++++++++
>>> drivers/cpufreq/cppc_cpufreq.c | 5 +++
>>> include/acpi/cppc_acpi.h | 15 +++++++++
>>> 3 files changed, 76 insertions(+)
>>>
>>> diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
>>> index 46bf45f8b0f3..b46f22f58f56 100644
>>> --- a/drivers/acpi/cppc_acpi.c
>>> +++ b/drivers/acpi/cppc_acpi.c
>>> @@ -1787,6 +1787,62 @@ int cppc_set_max_perf(int cpu, u32 max_perf)
>>> }
>>> EXPORT_SYMBOL_GPL(cppc_set_max_perf);
>>>
>>> +/**
>>> + * cppc_get_perf_limited - Get the Performance Limited register value.
>>> + * @cpu: CPU from which to get Performance Limited register.
>>> + * @perf_limited: Pointer to store the Performance Limited value.
>>> + *
>>> + * The returned value contains sticky status bits indicating platform-imposed
>>> + * performance limitations.
>>> + *
>>> + * Return: 0 for success, -EIO on failure, -EOPNOTSUPP if not supported.
>>> + */
>>> +int cppc_get_perf_limited(int cpu, u64 *perf_limited)
>>> +{
>>> + return cppc_get_reg_val(cpu, PERF_LIMITED, perf_limited);
>>> +}
>>> +EXPORT_SYMBOL_GPL(cppc_get_perf_limited);
>>> +
>>> +/**
>>> + * cppc_set_perf_limited() - Clear bits in the Performance Limited register.
>>> + * @cpu: CPU on which to write register.
>>> + * @bits_to_clear: Bitmask of bits to clear in the perf_limited register.
>>> + *
>>> + * The Performance Limited register contains two sticky bits set by platform:
>>> + * - Bit 0 (Desired_Excursion): Set when delivered performance is constrained
>>> + * below desired performance. Not used when Autonomous Selection is enabled.
>>> + * - Bit 1 (Minimum_Excursion): Set when delivered performance is constrained
>>> + * below minimum performance.
>>> + *
>>> + * These bits are sticky and remain set until OSPM explicitly clears them.
>>> + * This function only allows clearing bits (the platform sets them).
>>> + *
>>> + * Return: 0 for success, -EINVAL for invalid bits, -EIO on register
>>> + * access failure, -EOPNOTSUPP if not supported.
>>> + */
>>> +int cppc_set_perf_limited(int cpu, u64 bits_to_clear)
>>> +{
>>> + u64 current_val, new_val;
>>> + int ret;
>>> +
>>> + /* Only bits 0 and 1 are valid */
>>> + if (bits_to_clear & ~CPPC_PERF_LIMITED_MASK)
>>> + return -EINVAL;
>>> +
>>> + if (!bits_to_clear)
>>> + return 0;
>>> +
>>> + ret = cppc_get_perf_limited(cpu, &current_val);
>>> + if (ret)
>>> + return ret;
>>> +
>>> + /* Clear the specified bits */
>>> + new_val = current_val & ~bits_to_clear;
>>> +
>>> + return cppc_set_reg_val(cpu, PERF_LIMITED, new_val);
>>> +}
>>> +EXPORT_SYMBOL_GPL(cppc_set_perf_limited);
>>> +
>>> /**
>>> * cppc_set_enable - Set to enable CPPC on the processor by writing the
>>> * Continuous Performance Control package EnableRegister field.
>>> diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
>>> index 66e183b45fb0..afb2cdb67a2f 100644
>>> --- a/drivers/cpufreq/cppc_cpufreq.c
>>> +++ b/drivers/cpufreq/cppc_cpufreq.c
>>> @@ -1071,12 +1071,16 @@ static ssize_t store_max_perf(struct cpufreq_policy *policy, const char *buf,
>>> return count;
>>> }
>>>
>>> +CPPC_CPUFREQ_ATTR_RW_U64(perf_limited, cppc_get_perf_limited,
>>> + cppc_set_perf_limited)
>>> +
>>> cpufreq_freq_attr_ro(freqdomain_cpus);
>>> cpufreq_freq_attr_rw(auto_select);
>>> cpufreq_freq_attr_rw(auto_act_window);
>>> cpufreq_freq_attr_rw(energy_performance_preference_val);
>>> cpufreq_freq_attr_rw(min_perf);
>>> cpufreq_freq_attr_rw(max_perf);
>>> +cpufreq_freq_attr_rw(perf_limited);
>>
>> If the OS wants regular feedback about whether the platform had to limit
>> the perf. level, it will likely probe the register frequently.
>> In order to see new events, the register must be cleared. So:
>> - is it a good idea to allow users to write this register?
>> - is it useful to expose this register if the OS frequently clears it?
>>
>> I think the functions are useful; it might just be questionable to expose
>> the register in sysfs.
>>
>
> Currently the kernel doesn't automatically poll or clear perf_limited,
> so sysfs exposure is for manual monitoring. I can make it read-only
> but then users can only observe throttling events and can't clear
> them (though bits stay sticky). So, better to expose as RW attribute.
>
Ok right
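
For reference, the clear-before/check-after usage described in the commit
message can be modelled in user space. This is only a rough sketch, not the
kernel code: the struct, the PERF_LIMITED_* macros and every function name
below are hypothetical, and the mask value 0x3 is assumed from the
"Only bits 0 and 1 are valid" comment in the patch. The clear helper mirrors
the validation order of cppc_set_perf_limited() as quoted above.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumed bit layout: the patch states only bits 0 and 1 are valid. */
#define PERF_LIMITED_DESIRED_EXCURSION	(1ULL << 0)
#define PERF_LIMITED_MINIMUM_EXCURSION	(1ULL << 1)
#define PERF_LIMITED_MASK \
	(PERF_LIMITED_DESIRED_EXCURSION | PERF_LIMITED_MINIMUM_EXCURSION)

/* Hypothetical stand-in for one CPU's Performance Limited register. */
struct fake_perf_limited {
	uint64_t val;
};

/* Platform side: bits are sticky, so they are only ever OR-ed in. */
static void platform_report(struct fake_perf_limited *reg, uint64_t bits)
{
	reg->val |= bits & PERF_LIMITED_MASK;
}

/* OSPM side: reject unknown bits, treat 0 as a no-op, then clear. */
static int perf_limited_clear(struct fake_perf_limited *reg,
			      uint64_t bits_to_clear)
{
	if (bits_to_clear & ~PERF_LIMITED_MASK)
		return -EINVAL;

	if (!bits_to_clear)
		return 0;

	reg->val &= ~bits_to_clear;
	return 0;
}

/*
 * Measurement window: clear both bits, let the (simulated) platform
 * throttle during the workload, then return what was observed.
 */
static uint64_t demo_workload_window(struct fake_perf_limited *reg)
{
	int ret = perf_limited_clear(reg, PERF_LIMITED_MASK);

	assert(ret == 0);
	/* The workload would run here; simulate a throttling event. */
	platform_report(reg, PERF_LIMITED_MINIMUM_EXCURSION);
	return reg->val;
}
```

A non-zero return from demo_workload_window() would mean the platform
limited performance at some point inside the window. A stale bit left over
from before the window cannot produce a false positive, since both bits are
cleared first.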