Message-ID: <019bbcd9-7bbc-45bb-9c05-f59a4c90c26e@nvidia.com>
Date: Tue, 9 Dec 2025 22:08:19 +0530
From: Sumit Gupta <sumitg@...dia.com>
To: Pierre Gondois <pierre.gondois@....com>
Cc: linux-kernel@...r.kernel.org, acpica-devel@...ts.linux.dev,
 linux-doc@...r.kernel.org, linux-acpi@...r.kernel.org,
 linux-pm@...r.kernel.org, zhanjie9@...ilicon.com, ionela.voinescu@....com,
 perry.yuan@....com, mario.limonciello@....com, gautham.shenoy@....com,
 rdunlap@...radead.org, zhenglifeng1@...wei.com, corbet@....net,
 robert.moore@...el.com, lenb@...nel.org, viresh.kumar@...aro.org,
 linux-tegra@...r.kernel.org, treding@...dia.com, jonathanh@...dia.com,
 vsethi@...dia.com, ksitaraman@...dia.com, sanjayc@...dia.com,
 nhartman@...dia.com, bbasu@...dia.com, rafael@...nel.org, ray.huang@....com,
 sumitg@...dia.com
Subject: Re: [PATCH v4 4/8] ACPI: CPPC: add APIs and sysfs interface for
 min/max_perf


On 27/11/25 20:24, Pierre Gondois wrote:
> On 11/5/25 12:38, Sumit Gupta wrote:
>> CPPC allows platforms to specify minimum and maximum performance
>> limits that constrain the operating range for CPU performance scaling
>> when Autonomous Selection is enabled. These limits can be dynamically
>> adjusted to implement power management policies or workload-specific
>> optimizations.
>>
>> Add cppc_get_min_perf() and cppc_set_min_perf() functions to read and
>> write the MIN_PERF register, allowing dynamic adjustment of the minimum
>> performance floor.
>>
>> Add cppc_get_max_perf() and cppc_set_max_perf() functions to read and
>> write the MAX_PERF register, enabling dynamic ceiling control for
>> maximum performance.
>>
>> Expose these capabilities through cpufreq sysfs attributes that accept
>> frequency values in kHz (which are converted to/from performance values
>> internally):
>> - /sys/.../cpufreq/policy*/min_perf: Read/write min perf as freq (kHz)
>> - /sys/.../cpufreq/policy*/max_perf: Read/write max perf as freq (kHz)
>>
>> The frequency-based interface provides a user-friendly abstraction which
>> is similar to other cpufreq sysfs interfaces, while the driver handles
>> conversion to hardware performance values.
>>
>> Also update EPP constants for better clarity:
>> - Rename CPPC_ENERGY_PERF_MAX to CPPC_EPP_ENERGY_EFFICIENCY_PREF
>> - Add CPPC_EPP_PERFORMANCE_PREF for the performance-oriented setting
>>
>> Signed-off-by: Sumit Gupta <sumitg@...dia.com>
>> ---
>>   drivers/acpi/cppc_acpi.c       |  55 ++++++++++-
>>   drivers/cpufreq/cppc_cpufreq.c | 166 +++++++++++++++++++++++++++++++++
>>   include/acpi/cppc_acpi.h       |  23 ++++-
>>   3 files changed, 242 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
>> index 757e8ce87e9b..ef53eb8a1feb 100644
>> --- a/drivers/acpi/cppc_acpi.c
>> +++ b/drivers/acpi/cppc_acpi.c
>> @@ -1634,7 +1634,7 @@ EXPORT_SYMBOL_GPL(cppc_set_epp_perf);
>>    */
>>   int cppc_set_epp(int cpu, u64 epp_val)
>>   {
>> -     if (epp_val > CPPC_ENERGY_PERF_MAX)
>> +     if (epp_val > CPPC_EPP_ENERGY_EFFICIENCY_PREF)
>>               return -EINVAL;
>>
>>       return cppc_set_reg_val(cpu, ENERGY_PERF, epp_val);
>> @@ -1757,6 +1757,59 @@ int cppc_set_enable(int cpu, bool enable)
>>       return cppc_set_reg_val(cpu, ENABLE, enable);
>>   }
>>   EXPORT_SYMBOL_GPL(cppc_set_enable);
>> +
>> +/**
>> + * cppc_get_min_perf - Get the min performance register value.
>> + * @cpu: CPU from which to get min performance.
>> + * @min_perf: Return address.
>> + *
>> + * Return: 0 for success, -EIO on register access failure, -EOPNOTSUPP if not supported.
>> + */
>> +int cppc_get_min_perf(int cpu, u64 *min_perf)
>> +{
>> +     return cppc_get_reg_val(cpu, MIN_PERF, min_perf);
>> +}
>> +EXPORT_SYMBOL_GPL(cppc_get_min_perf);
>> +
>> +/**
>> + * cppc_set_min_perf() - Write the min performance register.
>> + * @cpu: CPU on which to write register.
>> + * @min_perf: Value to write to the MIN_PERF register.
>> + *
>> + * Return: 0 for success, -EIO otherwise.
>> + */
>> +int cppc_set_min_perf(int cpu, u64 min_perf)
>> +{
>> +     return cppc_set_reg_val(cpu, MIN_PERF, min_perf);
>> +}
>> +EXPORT_SYMBOL_GPL(cppc_set_min_perf);
>> +
>> +/**
>> + * cppc_get_max_perf - Get the max performance register value.
>> + * @cpu: CPU from which to get max performance.
>> + * @max_perf: Return address.
>> + *
>> + * Return: 0 for success, -EIO on register access failure, -EOPNOTSUPP if not supported.
>> + */
>> +int cppc_get_max_perf(int cpu, u64 *max_perf)
>> +{
>> +     return cppc_get_reg_val(cpu, MAX_PERF, max_perf);
>> +}
>> +EXPORT_SYMBOL_GPL(cppc_get_max_perf);
>> +
>> +/**
>> + * cppc_set_max_perf() - Write the max performance register.
>> + * @cpu: CPU on which to write register.
>> + * @max_perf: Value to write to the MAX_PERF register.
>> + *
>> + * Return: 0 for success, -EIO otherwise.
>> + */
>> +int cppc_set_max_perf(int cpu, u64 max_perf)
>> +{
>> +     return cppc_set_reg_val(cpu, MAX_PERF, max_perf);
>> +}
>> +EXPORT_SYMBOL_GPL(cppc_set_max_perf);
>> +
>>   /**
>>    * cppc_get_perf - Get a CPU's performance controls.
>>    * @cpu: CPU for which to get performance controls.
>> diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
>> index cf3ed6489a4f..cde6202e9c51 100644
>> --- a/drivers/cpufreq/cppc_cpufreq.c
>> +++ b/drivers/cpufreq/cppc_cpufreq.c
>> @@ -23,10 +23,12 @@
>>   #include <uapi/linux/sched/types.h>
>>
>>   #include <linux/unaligned.h>
>> +#include <linux/cleanup.h>
>>
>>   #include <acpi/cppc_acpi.h>
>>
>>   static struct cpufreq_driver cppc_cpufreq_driver;
>> +static DEFINE_MUTEX(cppc_cpufreq_update_autosel_config_lock);
>>
>>   #ifdef CONFIG_ACPI_CPPC_CPUFREQ_FIE
>>   static enum {
>> @@ -582,6 +584,68 @@ static void cppc_cpufreq_put_cpu_data(struct cpufreq_policy *policy)
>>       policy->driver_data = NULL;
>>   }
>>
>> +/**
>> + * cppc_cpufreq_set_mperf_limit - Generic function to set min/max performance limit
>> + * @policy: cpufreq policy
>> + * @val: performance value to set
>> + * @update_reg: whether to update hardware register
>
> I'm not sure I see in which case we might not want to update the
> hardware register.
> Aren't the min/max_perf values relevant even when autonomous selection is
> disabled/absent ?
>

This is explained in my reply to patch 7/8; here is a brief summary as well.
When auto_sel is disabled, only the policy limits are reset; the
min/max_perf registers are preserved.
When auto_sel is re-enabled, the preserved values are restored to both
the hardware registers and the policy.
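
To illustrate the intent of the two flags, here is a small user-space model of the behaviour described above. It is only a sketch: all names (model_cpu, model_set_max_perf, ...) are hypothetical and not kernel APIs, and resetting the policy limit to highest_perf on disable is an assumption made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
struct model_cpu {
	uint32_t reg_max_perf;    /* stands in for the MAX_PERF register */
	uint32_t policy_max_perf; /* stands in for the policy (QoS) limit */
	uint32_t highest_perf;    /* hardware ceiling */
	bool auto_sel;
};

/* Mirrors the two flags: update the register, the policy limit, or both. */
static void model_set_max_perf(struct model_cpu *c, uint32_t perf,
			       bool update_reg, bool update_policy)
{
	if (update_reg)
		c->reg_max_perf = perf;
	if (update_policy)
		c->policy_max_perf = perf;
}

/*
 * Disabling auto_sel: only the policy limit is reset (assumed here to go
 * back to highest_perf); the register value is left untouched.
 */
static void model_disable_auto_sel(struct model_cpu *c)
{
	c->auto_sel = false;
	model_set_max_perf(c, c->highest_perf, false, true);
}

/* Re-enabling auto_sel: the preserved value is applied to both again. */
static void model_enable_auto_sel(struct model_cpu *c)
{
	c->auto_sel = true;
	model_set_max_perf(c, c->reg_max_perf, true, true);
}
```

With max_perf set to 1000 and highest_perf at 1200, disabling auto_sel leaves the register at 1000 while the policy limit goes back to 1200; re-enabling brings both back to 1000.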

>
>> + * @update_policy: whether to update policy constraints
>> + * @is_min: true for min_perf, false for max_perf
>> + */
>> +static int cppc_cpufreq_set_mperf_limit(struct cpufreq_policy *policy, u64 val,
>> +                                     bool update_reg, bool update_policy, bool is_min)
>> +{
>> +     struct cppc_cpudata *cpu_data = policy->driver_data;
>> +     struct cppc_perf_caps *caps = &cpu_data->perf_caps;
>> +     unsigned int cpu = policy->cpu;
>> +     struct freq_qos_request *req;
>> +     unsigned int freq;
>> +     u32 perf;
>> +     int ret;
>> +
>> +     perf = clamp(val, caps->lowest_perf, caps->highest_perf);
>> +     freq = cppc_perf_to_khz(caps, perf);
>> +
>> +     pr_debug("cpu%d, %s_perf:%llu, update_reg:%d, update_policy:%d\n", cpu,
>> +              is_min ? "min" : "max", (u64)perf, update_reg, update_policy);
>> +
>> +     guard(mutex)(&cppc_cpufreq_update_autosel_config_lock);
>> +
>> +     if (update_reg) {
>> +             ret = is_min ? cppc_set_min_perf(cpu, perf) : cppc_set_max_perf(cpu, perf);
>> +             if (ret) {
>> +                     if (ret != -EOPNOTSUPP)
>> +                             pr_warn("Failed to set %s_perf (%llu) on CPU%d (%d)\n",
>> +                                     is_min ? "min" : "max", (u64)perf, cpu, ret);
>> +                     return ret;
>> +             }
>> +
>> +             if (is_min)
>> +                     cpu_data->perf_ctrls.min_perf = perf;
>> +             else
>> +                     cpu_data->perf_ctrls.max_perf = perf;
>> +     }
>> +
>> +     if (update_policy) {
>> +             req = is_min ? policy->min_freq_req : policy->max_freq_req;
>> +
>> +             ret = freq_qos_update_request(req, freq);
>
> IIUC, we are adding a qos constraint to the min_freq_req or
> max_freq_req. However these constraints should match the
> scaling_min/max_freq sysfs interface. So doesn't it mean that if we set
> the 'max_perf', we are overwriting the max_freq_req constraint ?
>
Yes.

> If you have frequencies between 600000:1200000
> # Init state:
> max_perf:1200000 scaling_max_freq:1200000
> # echo 1000000 > max_perf
> max_perf:1000000 scaling_max_freq:1000000
> # echo 900000 > scaling_max_freq
> max_perf:1000000 scaling_max_freq:900000
> # echo 1200000 > scaling_max_freq
> max_perf:1000000 scaling_max_freq:1200000
>
> The 2 values are not in sync. Is it the desired behaviour ?
>
>

Making scaling_min/max_freq read-only in auto_sel mode would solve this.
We can do that by setting the policy limits to the min/max_perf bounds in
cppc_verify_policy() when auto_sel is enabled.
In autonomous mode the hardware selects the performance level within these
bounds, so scaling_min/max_freq is effectively read-only, and users must
change the limits through the min_perf/max_perf sysfs attributes.
Please let me know if you have other thoughts or a different approach in mind.

  cppc_verify_policy(struct cpufreq_policy_data *policy_data)
  {
     ...
     if (caps->auto_sel) {
       min_perf = cpu_data->perf_ctrls.min_perf ?: caps->lowest_nonlinear_perf;
       max_perf = cpu_data->perf_ctrls.max_perf ?: caps->nominal_perf;

       /* set min/max_perf bounds (read-only behavior) */
       policy_data->min = cppc_perf_to_khz(caps, min_perf);
       policy_data->max = cppc_perf_to_khz(caps, max_perf);
     } else {
       cpufreq_verify_within_limits(policy_data, min_freq, max_freq);
     }
     ...
  }
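
For illustration, the effect of that proposal on the sequence you listed can be modelled in user space as below. This is only a sketch under assumptions: the model_* names are hypothetical, the bounds are kept directly in kHz rather than converted from perf values, and the non-auto_sel branch only roughly imitates what cpufreq_verify_within_limits() does.

```c
#include <stdbool.h>

/* Hypothetical user-space model; not kernel code. */
struct model_limits {
	unsigned int min_khz;
	unsigned int max_khz;
};

struct model_state {
	bool auto_sel;
	unsigned int min_perf_khz;    /* bound set through min_perf */
	unsigned int max_perf_khz;    /* bound set through max_perf */
	unsigned int cpuinfo_min_khz; /* lowest supported frequency */
	unsigned int cpuinfo_max_khz; /* highest supported frequency */
};

static unsigned int model_clamp(unsigned int v, unsigned int lo, unsigned int hi)
{
	return v < lo ? lo : (v > hi ? hi : v);
}

/*
 * In auto_sel mode the requested limits are overridden by the min/max_perf
 * bounds, so a write to scaling_min/max_freq has no lasting effect.
 * Otherwise the request is clamped to the supported frequency range.
 */
static void model_verify_policy(const struct model_state *s,
				struct model_limits *req)
{
	if (s->auto_sel) {
		req->min_khz = s->min_perf_khz;
		req->max_khz = s->max_perf_khz;
	} else {
		req->max_khz = model_clamp(req->max_khz, s->cpuinfo_min_khz,
					   s->cpuinfo_max_khz);
		req->min_khz = model_clamp(req->min_khz, s->cpuinfo_min_khz,
					   req->max_khz);
	}
}
```

With a 600000:1200000 range and max_perf at 1000000, a request for scaling_max_freq = 900000 in auto_sel mode comes back as 1000000, so the two values can no longer diverge.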


>> +             if (ret < 0) {
>> +                     pr_warn("Failed to update %s_freq constraint for CPU%d: %d\n",
>> +                             is_min ? "min" : "max", cpu, ret);
>> +                     return ret;
>> +             }
>> +     }
>> +
>> +     return 0;
>> +}
>> +
>> +#define cppc_cpufreq_set_min_perf(policy, val, update_reg, update_policy) \
>> +     cppc_cpufreq_set_mperf_limit(policy, val, update_reg, update_policy, true)
>> +
>> +#define cppc_cpufreq_set_max_perf(policy, val, update_reg, update_policy) \
>> +     cppc_cpufreq_set_mperf_limit(policy, val, update_reg, update_policy, false)
>> +
>>   static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
>>   {
>>       unsigned int cpu = policy->cpu;
>> @@ -881,16 +945,118 @@ static ssize_t store_energy_performance_preference_val(struct cpufreq_policy *po
>>       return cppc_cpufreq_sysfs_store_u64(policy->cpu, cppc_set_epp, buf, count);
>>   }
>>
>> +/**
>> + * show_min_perf - Show minimum performance as frequency (kHz)
>> + *
>> + * Reads the MIN_PERF register and converts the performance value to
>> + * frequency (kHz) for user-space consumption.
>> + */
>> +static ssize_t show_min_perf(struct cpufreq_policy *policy, char *buf)
>> +{
>> +     struct cppc_cpudata *cpu_data = policy->driver_data;
>> +     u64 perf;
>> +     int ret;
>> +
>> +     ret = cppc_get_min_perf(policy->cpu, &perf);
>> +     if (ret == -EOPNOTSUPP)
>> +             return sysfs_emit(buf, "<unsupported>\n");
>> +     if (ret)
>> +             return ret;
>> +
>> +     /* Convert performance to frequency (kHz) for user */
>> +     return sysfs_emit(buf, "%u\n", cppc_perf_to_khz(&cpu_data->perf_caps, perf));
>> +}
>> +
>> +/**
>> + * store_min_perf - Set minimum performance from frequency (kHz)
>> + *
>> + * Converts the user-provided frequency (kHz) to a performance value
>> + * and writes it to the MIN_PERF register.
>> + */
>> +static ssize_t store_min_perf(struct cpufreq_policy *policy, const char *buf, size_t count)
>> +{
>> +     struct cppc_cpudata *cpu_data = policy->driver_data;
>> +     unsigned int freq_khz;
>> +     u64 perf;
>> +     int ret;
>> +
>> +     ret = kstrtouint(buf, 0, &freq_khz);
>> +     if (ret)
>> +             return ret;
>> +
>> +     /* Convert frequency (kHz) to performance value */
>> +     perf = cppc_khz_to_perf(&cpu_data->perf_caps, freq_khz);
>> +
>> +     ret = cppc_cpufreq_set_min_perf(policy, perf, true, cpu_data->perf_caps.auto_sel);
>> +     if (ret)
>> +             return ret;
>> +
>> +     return count;
>> +}
>> +
>> +/**
>> + * show_max_perf - Show maximum performance as frequency (kHz)
>> + *
>> + * Reads the MAX_PERF register and converts the performance value to
>> + * frequency (kHz) for user-space consumption.
>> + */
>> +static ssize_t show_max_perf(struct cpufreq_policy *policy, char *buf)
>
> I think it might collide with the scaling_min/max_freq.
> I saw that you answered this point at:
> https://lore.kernel.org/lkml/b2bd3258-51bd-462a-ae29-71f1d6f823f3@nvidia.com/ 
>
>
> But I'm not sure I understood why it is needed to have 2 interfaces.
> Would it be possible to explain it again ?

Separate min_perf/max_perf interfaces are kept because they write to the
distinct CPPC hardware registers of the same names (MIN_PERF and MAX_PERF).

>
> I don't see any case where we would like to make a distinction between:
> - scaling_max_freq, i.e. the maximal freq. the cpufreq driver is allowed
> to set
> - max_perf, i.e. the maximal perf. level the firmware will set
>
> ------------
>
> Another point is that the min/max_perf interface actually uses freq. values.

The min/max_perf interfaces were changed from perf values to frequencies so
that their scale matches the other cpufreq sysfs interfaces, following the
discussion in [1].

  [1] https://lore.kernel.org/lkml/80e16de0-63e4-4ead-9577-4ebba9b1a02d@nvidia.com/
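
As a rough illustration of the conversion behind the frequency-based interface, here is a simplified user-space sketch. It assumes a purely proportional mapping anchored at the nominal point; the real cppc_perf_to_khz()/cppc_khz_to_perf() helpers handle more cases, and the model_* names below are hypothetical.

```c
#include <stdint.h>

/* Hypothetical, simplified capabilities: only the nominal point is used. */
struct model_caps {
	uint32_t nominal_perf;     /* abstract CPPC performance units */
	uint32_t nominal_freq_khz; /* frequency at nominal_perf, in kHz */
};

/* perf -> kHz, assuming frequency scales proportionally with perf. */
static uint32_t model_perf_to_khz(const struct model_caps *caps, uint32_t perf)
{
	return (uint32_t)((uint64_t)perf * caps->nominal_freq_khz /
			  caps->nominal_perf);
}

/* kHz -> perf, the inverse of the mapping above (rounds down). */
static uint32_t model_khz_to_perf(const struct model_caps *caps, uint32_t khz)
{
	return (uint32_t)((uint64_t)khz * caps->nominal_perf /
			  caps->nominal_freq_khz);
}
```

With nominal_perf = 100 at 1000000 kHz, a write of 1200000 to max_perf maps to a perf value of 120, and reading it back converts 120 to 1200000 kHz again.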

Thank you,
Sumit Gupta


