lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <9f5f8181-7d0e-415d-b473-0e3c6601ccc3@amd.com>
Date: Thu, 23 Jan 2025 11:05:29 -0600
From: Mario Limonciello <mario.limonciello@....com>
To: Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>,
 "zhenglifeng (A)" <zhenglifeng1@...wei.com>,
 Pierre Gondois <pierre.gondois@....com>,
 Russell Haley <yumpusamongus@...il.com>, rafael@...nel.org, lenb@...nel.org,
 robert.moore@...el.com, viresh.kumar@...aro.org
Cc: acpica-devel@...ts.linux.dev, linux-acpi@...r.kernel.org,
 linux-kernel@...r.kernel.org, linux-pm@...r.kernel.org, linuxarm@...wei.com,
 jonathan.cameron@...wei.com, gautham.shenoy@....com, ray.huang@....com,
 zhanjie9@...ilicon.com, lihuisong@...wei.com, hepeng68@...wei.com,
 fanghao11@...wei.com
Subject: Re: [PATCH v4 6/6] cpufreq: CPPC: Support for autonomous selection in
 cppc_cpufreq

On 1/23/2025 10:46, Srinivas Pandruvada wrote:
> 
> On 1/20/25 18:42, zhenglifeng (A) wrote:
>> On 2025/1/21 1:44, Mario Limonciello wrote:
>>
>>> On 1/20/2025 08:49, Pierre Gondois wrote:
>>>>
>>>> On 1/20/25 04:15, zhenglifeng (A) wrote:
>>>>> On 2025/1/17 22:30, Mario Limonciello wrote:
>>>>>
>>>>>> On 1/16/2025 21:11, zhenglifeng (A) wrote:
>>>>>>> On 2025/1/16 19:39, Russell Haley wrote:
>>>>>>>
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> I noticed something here just as a user casually browsing the 
>>>>>>>> mailing list.
>>>>>>>>
>>>>>>>> On 1/13/25 6:21 AM, Lifeng Zheng wrote:
>>>>>>>>> Add sysfs interfaces for CPPC autonomous selection in the 
>>>>>>>>> cppc_cpufreq
>>>>>>>>> driver.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Lifeng Zheng <zhenglifeng1@...wei.com>
>>>>>>>>> ---
>>>>>>>>>     .../ABI/testing/sysfs-devices-system-cpu      |  54 +++++++++
>>>>>>>>>     drivers/cpufreq/cppc_cpufreq.c                | 109 +++++++ 
>>>>>>>>> ++++ +++++++
>>>>>>>>>     2 files changed, 163 insertions(+)
>>>>>>>>>
>>>>>>>>> diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu 
>>>>>>>>> b/ Documentation/ABI/testing/sysfs-devices-system-cpu
>>>>>>>>> index 206079d3bd5b..3d87c3bb3fe2 100644
>>>>>>>>> --- a/Documentation/ABI/testing/sysfs-devices-system-cpu
>>>>>>>>> +++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
>>>>>>>>> @@ -268,6 +268,60 @@ Description:    Discover CPUs in the same 
>>>>>>>>> CPU frequency coordination domain
>>>>>>>>>             This file is only present if the acpi-cpufreq or 
>>>>>>>>> the cppc-cpufreq
>>>>>>>>>             drivers are in use.
>>>>>>>> [...snip...]
>>>>>>>>
>>>>>>>>> +What:        /sys/devices/system/cpu/cpuX/cpufreq/energy_perf
>>>>>>>>> +Date:        October 2024
>>>>>>>>> +Contact:    linux-pm@...r.kernel.org
>>>>>>>>> +Description:    Energy performance preference
>>>>>>>>> +
>>>>>>>>> +        Read/write an 8-bit integer from/to this file. This file
>>>>>>>>> +        represents a range of values from 0 (performance 
>>>>>>>>> preference) to
>>>>>>>>> +        0xFF (energy efficiency preference) that influences 
>>>>>>>>> the rate of
>>>>>>>>> +        performance increase/decrease and the result of the 
>>>>>>>>> hardware's
>>>>>>>>> +        energy efficiency and performance optimization policies.
>>>>>>>>> +
>>>>>>>>> +        Writing to this file only has meaning when Autonomous 
>>>>>>>>> Selection is
>>>>>>>>> +        enabled.
>>>>>>>>> +
>>>>>>>>> +        This file only presents if the cppc-cpufreq driver is 
>>>>>>>>> in use.
>>>>>>>> In intel_pstate driver, there is file with near-identical 
>>>>>>>> semantics:
>>>>>>>>
>>>>>>>> /sys/devices/system/cpu/cpuX/cpufreq/energy_performance_preference
>>>>>>>>
>>>>>>>> It also accepts a few string arguments and converts them to 
>>>>>>>> integers.
>>>>>>>>
>>>>>>>> Perhaps the same name should be used, and the semantics made 
>>>>>>>> exactly
>>>>>>>> identical, and then it could be documented as present for either
>>>>>>>> cppc_cpufreq OR intel_pstate?
>>>>>>>>
>>>>>>>> I think would be more elegant if userspace tooling could Just 
>>>>>>>> Work with
>>>>>>>> either driver.
>>>>>>>>
>>>>>>>> One might object that the frequency selection behavior that 
>>>>>>>> results from
>>>>>>>> any particular value of the register itself might be different, 
>>>>>>>> but they
>>>>>>>> are *already* different between Intel's P and E-cores in the 
>>>>>>>> same CPU
>>>>>>>> package. (Ugh.)
>>>>>>> Yes, I should use the same name. Thanks.
>>>>>>>
>>>>>>> As for accepting string arguments and converting them to 
>>>>>>> integers, I don't
>>>>>>> think it is necessary. It'll be a litte confused if someone 
>>>>>>> writes a raw
>>>>>>> value and reads a string I think. I prefer to let users freely 
>>>>>>> set this
>>>>>>> value.
>>>>>>>
>>>>>>> In addition, there are many differences between the 
>>>>>>> implementations of
>>>>>>> energy_performance_preference in intel_pstate and cppc_cpufreq (and
>>>>>>> amd-pstate...). It is really difficult to explain all this 
>>>>>>> differences in
>>>>>>> this document. So I'll leave it to be documented as present for
>>>>>>> cppc_cpufreq only.
>>>>>> At least the interface to userspace I think we should do the best 
>>>>>> we can to be the same between all the drivers if possible.
>>>>>>
>>>>>> For example; I've got a patch that I may bring up in a future 
>>>>>> kernel cycle that adds raw integer writes to amd-pstates 
>>>>>> energy_performance_profile to behave the same way intel-pstate does.
>>>>> I agree that it's better to keep this interface consistent across 
>>>>> different
>>>>> drivers. But in my opinion, the implementation of intel_pstate
>>>>> energy_performance_preference is not really nice. Someone may write 
>>>>> a raw
>>>>> value but read a string, or read strings for some values and read raw
>>>>> values for some other values. It is inconsistent. It may be better 
>>>>> to use
>>>>> some other implementation, such as seperating the operations of r/w 
>>>>> strings
>>>>> and raw values into two files.
>>>> I agree it would be better to be sure of the type to expect when 
>>>> reading the
>>>> energy_performance_preference file. The epp values in the range 
>>>> 0-255 with 0
>>>> being the performance value for all interfaces.
>>>>
>>>> In the current epp strings, it seems there is a big gap between the 
>>>> PERFORMANCE
>>>> and the BALANCE_PERFORMANCE strings. Maybe it would be good to 
>>>> complete it:
>>>> EPP_PERFORMANCE        0x00
>>>> EPP_BALANCE_PERFORMANCE    0x40      // state value changed
>>>> EPP_BALANCE        0x80      // new state
>>>> EPP_BALANCE_POWERSAVE    0xC0
>>>> EPP_POWERSAVE        0xFF
>>>>
>>>> NIT: The mapping seems to be slightly different for intel_pstate and 
>>>> amd-pstate
>>>> currently:
>>>> drivers/cpufreq/amd-pstate.c
>>>> #define AMD_CPPC_EPP_PERFORMANCE        0x00
>>>> #define AMD_CPPC_EPP_BALANCE_PERFORMANCE    0x80
>>>> #define AMD_CPPC_EPP_BALANCE_POWERSAVE        0xBF
>>>> #define AMD_CPPC_EPP_POWERSAVE            0xFF
>>>>
>>>> arch/x86/include/asm/msr-index.h
>>>> #define HWP_EPP_PERFORMANCE        0x00
>>>> #define HWP_EPP_BALANCE_PERFORMANCE    0x80
>>>> #define HWP_EPP_BALANCE_POWERSAVE    0xC0   <------ Different from 
>>>> AMD_CPPC_EPP_BALANCE_POWERSAVE
>>>> #define HWP_EPP_POWERSAVE        0xFF
>>>>
>>>>> I think it's better to consult Rafael and Viresh about how this should
>>>>> evolve.
>>>> Yes indeed
>>> Maybe it's best to discuss what the goal of raw EPP number writes is 
>>> to decide what to do with it.
>>>
>>> IE in intel-pstate is it for userspace to be able to actually utilize 
>>> something besides the strings all the time?  Or is it just for 
>>> debugging to find better values for strings in the future?
>>>
>>> If the former maybe we're better off splitting to 
>>> 'energy_performance_preference' and 'energy_performance_preference_int'.
>>>
>>> If the latter maybe we're better off putting the integer writes and 
>>> reads into debugfs instead and making 'energy_performance_preference' 
>>> return -EINVAL while a non-predefined value is in use.
> 
> In Intel case EPP values can be different based on processor. In some 
> case they they end up sharing the same CPU model. So strings are not 
> suitable for all cases. Also there is different preference of EPP 
> between Chrome systems and non chrome distro. For example Chrome has 
> some resource manager which can change and same on Intel distros with LPMD.
> 

Thanks for confirming it is intentional and changing it would break 
existing userspace.

And FWIW even in Windows there are more than 4 situational values used 
like we have in Linux today.

As the status quo is there I personally feel that we should do the exact 
same for other implementation of energy_performance_preference.

> 
> Thanks,
> 
> Srinivas
> 
> 
>> I think it's the former.
>>
>> I added the author of the patch that allows raw energy performance
>> preference value in intel_pstate to ask about what the goal is and if 
>> this
>> would be ok to do the modification mentioned above.
>>
>> To see the patch from https://lore.kernel.org/ 
>> all/20200626183401.1495090-3-srinivas.pandruvada@...ux.intel.com/
>>
>> Anyway, the purpose of this patch is to allow users write and read raw 
>> EPP
>> number. So maybe I can just rename the file to
>> 'energy_performance_preference_int'?
>>


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ