lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <f3849033-e883-4296-abb7-eb04e8c2a03c@amd.com>
Date: Fri, 7 Mar 2025 22:30:25 -0600
From: Mario Limonciello <mario.limonciello@....com>
To: Mario Limonciello <superm1@...nel.org>,
 "Gautham R. Shenoy" <gautham.shenoy@....com>
Cc: Perry Yuan <perry.yuan@....com>,
 Dhananjay Ugwekar <Dhananjay.Ugwekar@....com>,
 "open list:X86 ARCHITECTURE (32-BIT AND 64-BIT)"
 <linux-kernel@...r.kernel.org>,
 "open list:CPU FREQUENCY SCALING FRAMEWORK" <linux-pm@...r.kernel.org>
Subject: Re: [PATCH v2 3/5] cpufreq/amd-pstate: Add support for platform
 profile class

On 3/7/2025 10:55, Mario Limonciello wrote:
> On 3/7/2025 10:22, Gautham R. Shenoy wrote:
>> On Tue, Mar 04, 2025 at 09:23:25AM -0600, Mario Limonciello wrote:
>>> From: Mario Limonciello <mario.limonciello@....com>
>>>
>>> The platform profile core allows multiple drivers and devices to
>>> register platform profile support.
>>>
>>> When the legacy platform profile interface is used all drivers will
>>> adjust the platform profile as well.
>>>
>>> Add support for registering every CPU with the platform profile handler
>>> when dynamic EPP is enabled.
>>>
>>> The end result will be that changing the platform profile will modify
>>> EPP accordingly.
>>>
>>> Signed-off-by: Mario Limonciello <mario.limonciello@....com>
>>> ---
>>>   Documentation/admin-guide/pm/amd-pstate.rst |   4 +-
>>>   drivers/cpufreq/Kconfig.x86                 |   1 +
>>>   drivers/cpufreq/amd-pstate.c                | 142 +++++++++++++++++---
>>>   drivers/cpufreq/amd-pstate.h                |  10 ++
>>>   4 files changed, 140 insertions(+), 17 deletions(-)
>>>
>>> diff --git a/Documentation/admin-guide/pm/amd-pstate.rst b/ 
>>> Documentation/admin-guide/pm/amd-pstate.rst
>>> index 8424e7119dd7e..36950fb6568c0 100644
>>> --- a/Documentation/admin-guide/pm/amd-pstate.rst
>>> +++ b/Documentation/admin-guide/pm/amd-pstate.rst
>>> @@ -321,7 +321,9 @@ Whether this behavior is enabled by default with 
>>> the kernel config option
>>>   at runtime by the sysfs file ``/sys/devices/system/cpu/cpufreq/ 
>>> policyX/dynamic_epp``.
>>>   When set to enabled, the driver will select a different energy 
>>> performance
>>> -profile when the machine is running on battery or AC power.
>>> +profile when the machine is running on battery or AC power. The 
>>> driver will
>>> +also register with the platform profile handler to receive 
>>> notifications of
>>> +user desired power state and react to those.
>>>   When set to disabled, the driver will not change the energy 
>>> performance profile
>>>   based on the power source and will not react to user desired power 
>>> state.
>>> diff --git a/drivers/cpufreq/Kconfig.x86 b/drivers/cpufreq/Kconfig.x86
>>> index c5ef92634ddf4..905eab998b836 100644
>>> --- a/drivers/cpufreq/Kconfig.x86
>>> +++ b/drivers/cpufreq/Kconfig.x86
>>> @@ -40,6 +40,7 @@ config X86_AMD_PSTATE
>>>       select ACPI_PROCESSOR
>>>       select ACPI_CPPC_LIB if X86_64
>>>       select CPU_FREQ_GOV_SCHEDUTIL if SMP
>>> +    select ACPI_PLATFORM_PROFILE
>>>       help
>>>         This driver adds a CPUFreq driver which utilizes a fine grain
>>>         processor performance frequency control range instead of legacy
>>> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
>>> index 9911808fe0bcf..28c02edf6e40b 100644
>>> --- a/drivers/cpufreq/amd-pstate.c
>>> +++ b/drivers/cpufreq/amd-pstate.c
>>> @@ -105,6 +105,7 @@ static struct quirk_entry *quirks;
>>>    *    2        balance_performance
>>>    *    3        balance_power
>>>    *    4        power
>>> + *    5        custom (for raw EPP values)
>>>    */
>>>   enum energy_perf_value_index {
>>>       EPP_INDEX_DEFAULT = 0,
>>> @@ -112,6 +113,7 @@ enum energy_perf_value_index {
>>>       EPP_INDEX_BALANCE_PERFORMANCE,
>>>       EPP_INDEX_BALANCE_POWERSAVE,
>>>       EPP_INDEX_POWERSAVE,
>>> +    EPP_INDEX_CUSTOM,
>>>   };
>>>   static const char * const energy_perf_strings[] = {
>>> @@ -120,6 +122,7 @@ static const char * const energy_perf_strings[] = {
>>>       [EPP_INDEX_BALANCE_PERFORMANCE] = "balance_performance",
>>>       [EPP_INDEX_BALANCE_POWERSAVE] = "balance_power",
>>>       [EPP_INDEX_POWERSAVE] = "power",
>>> +    [EPP_INDEX_CUSTOM] = "custom",
>>>       NULL
>>>   };
>>> @@ -1073,6 +1076,10 @@ static int 
>>> amd_pstate_power_supply_notifier(struct notifier_block *nb,
>>>       if (event != PSY_EVENT_PROP_CHANGED)
>>>           return NOTIFY_OK;
>>> +    /* dynamic actions are only applied while platform profile is in 
>>> balanced */
>>> +    if (cpudata->current_profile != PLATFORM_PROFILE_BALANCED)
>>> +        return 0;
>>> +
>>>       epp = amd_pstate_get_balanced_epp(policy);
>>>       ret = amd_pstate_set_epp(policy, epp);
>>> @@ -1081,14 +1088,84 @@ static int 
>>> amd_pstate_power_supply_notifier(struct notifier_block *nb,
>>>       return NOTIFY_OK;
>>>   }
>>> -static void amd_pstate_clear_dynamic_epp(struct cpufreq_policy *policy)
>>> +
>>> +static int amd_pstate_profile_probe(void *drvdata, unsigned long 
>>> *choices)
>>> +{
>>> +    set_bit(PLATFORM_PROFILE_LOW_POWER, choices);
>>> +    set_bit(PLATFORM_PROFILE_BALANCED, choices);
>>> +    set_bit(PLATFORM_PROFILE_PERFORMANCE, choices);
>>> +
>>> +    return 0;
>>> +}
>>> +
>>> +static int amd_pstate_profile_get(struct device *dev,
>>> +                  enum platform_profile_option *profile)
>>> +{
>>> +    struct amd_cpudata *cpudata = dev_get_drvdata(dev);
>>> +
>>> +    *profile = cpudata->current_profile;
>>> +
>>> +    return 0;
>>> +}
>>> +
>>> +static int amd_pstate_profile_set(struct device *dev,
>>> +                  enum platform_profile_option profile)
>>> +{
>>> +    struct amd_cpudata *cpudata = dev_get_drvdata(dev);
>>> +    struct cpufreq_policy *policy __free(put_cpufreq_policy) = 
>>> cpufreq_cpu_get(cpudata->cpu);
>>> +    int ret;
>>> +
>>> +    switch (profile) {
>>> +    case PLATFORM_PROFILE_LOW_POWER:
>>> +        if (cpudata->policy != CPUFREQ_POLICY_POWERSAVE)
>>> +            cpudata->policy = CPUFREQ_POLICY_POWERSAVE;
>>
>> So prior to the patch, cpudata->policy is supposed to mirror
>> policy->policy.  With this patch, this assumption is no longer
>> true. So it is possible for the user to again override the choice of
>> EPP set via platform profile by changing the cpufreq governor ?
>>
>> Is this the expected behaviour?
>>
>> The bigger concern is, if the governor was previously "performance"
>> and then the platform profile requested "low power", "cat
>> /sys/devices/system/cpu/cpuX/cpufreq/scaling_governor" would still
>> show "performance", which is inconsistent with the behaviour.
>>
>>
> 
> This ties back to the previous patches for dynamic EPP.  My expectation 
> was that when dynamic EPP is enabled that users can't manually set the 
> EPP anymore (it will return -EBUSY) and likewise turning on dynamic EPP
> should keep the governor as powersave.
> 
> I'll double check all those are properly enforced; but that's at least 
> the intent.

FWIW - I double checked and confirmed that this is working as intended.
* I couldn't change from powersave to performance when dynamic_epp was 
enabled (-EBUSY)
* I couldn't change energy_performance_preference when dynamic_epp was 
enabled (-EBUSY)

> 
> IMO this "should" all work because turning on Dynamic EPP sysfs file 
> forces the driver to go through a state transition that it will tear 
> everything down and back up.  The policy will come back up in 
> "powersave" even if it was previously in "performance" when the dynamic 
> EPP sysfs file was turned on.
> 
> Longer term; I also envision the scheduler influencing EPP values when 
> dynamic_epp is turned on.  The "platform profile" would be an "input" to 
> that decision making process (maybe giving a weighting?), not the only 
> lever.
> 
> I haven't given any serious look at how to do this with the scheduler, I 
> wanted to lay the foundation first being dynamic EPP and raw EPP.
> 
> So even if dynamic_epp isn't interesting "right now" for server because 
> the focus is around behavior for AC/DC, don't write it off just yet.


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ