[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <b150332e-c80a-492d-bef8-aeb2f28ec8a0@nalramli.com>
Date: Tue, 29 Oct 2024 01:56:53 -0400
From: "Nabil S. Alramli" <dev@...ramli.com>
To: Mario Limonciello <mario.limonciello@....com>
Cc: "nalramli@...tly.com" <nalramli@...tly.com>,
"jdamato@...tly.com" <jdamato@...tly.com>,
"khubert@...tly.com" <khubert@...tly.com>, "Yuan, Perry"
<Perry.Yuan@....com>, "Shenoy, Gautham Ranjal" <gautham.shenoy@....com>,
"Meng, Li (Jassmine)" <Li.Meng@....com>,
"stable@...r.kernel.org" <stable@...r.kernel.org>,
"Huang, Ray" <Ray.Huang@....com>, "rafael@...nel.org" <rafael@...nel.org>,
"viresh.kumar@...aro.org" <viresh.kumar@...aro.org>,
"linux-pm@...r.kernel.org" <linux-pm@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [RFC PATCH 6.1.y 0/1] cpufreq: amd-pstate: Enable CPU boost in
passive and guided modes
Mario,
Thank you so much for your valuable feedback. This all makes sense, and
I will incorporate it into another revision soon.
On 10/29/2024 12:09 AM, Mario Limonciello wrote:
> On 10/28/2024 16:33, Nabil S. Alramli wrote:
>> Hi Mario,
>>
>> Thank you for taking a look at my patch.
>>
>> What do you think about the following for the commit message in the next
>> revision of the PATCH, and omitting the cover letter since most of it is
>> incorporated here?
>>
>> ***********************************************************************
>>
>> cpufreq: amd-pstate: Enable CPU boost in passive and guided modes
>>
>> The CPU frequency cannot be boosted when using the amd_pstate driver in
>> passive or guided mode. This is fixed here.
>
> No need to say things like "I did this" or "this patch does that".
> Just drop last sentence.
>
My apologies. I was just trying to be clear as to what this patch does.
I will drop it.
>>
>> For example, on a host that has AMD EPYC 7662 64-Core processor without
>> this patch running at full CPU load:
> "On a host that has an AMD EPYC 7662 processor while running with
> amd-pstate configured for passive mode on full CPU load the processor
> only reaches 2.0 GHz."
>
>>
>> $ for i in $(cat
>> /sys/devices/system/cpu/cpu*/cpufreq/scaling_cur_freq); \
>> do ni=$(echo "scale=1; $i/1000000" | bc -l); echo "$ni GHz"; done | \
>> sort | uniq -c
>>
>> 128 2.0 GHz
>>
>> And with this patch:
>
> On later kernels the CPU can reach 3.3GHz.
>
>>
>> $ for i in $(cat
>> /sys/devices/system/cpu/cpu*/cpufreq/scaling_cur_freq); \
>> do ni=$(echo "scale=1; $i/1000000" | bc -l); echo "$ni GHz"; done | \
>> sort | uniq -c
>>
>> 128 3.3 GHz
>>
>> The CPU frequency is dependent on a setting called highest_perf which is
>> the multiplier used to compute it. The highest_perf value comes from
>> cppc_init_perf when the driver is built-in and from pstate_init_perf when
>> it is a loaded module. Both of these calls have the following condition:
>>
>> highest_perf = amd_get_highest_perf();
>> if (highest_perf > __cppc_highest_perf_)
>> highest_perf = __cppc_highest_perf;
>>
>> Where again __cppc_highest_perf is either the return from
>> cppc_get_perf_caps in the built-in case or AMD_CPPC_HIGHEST_PERF in the
>> module case. Both of these functions actually return the nominal value,
>> Whereas the call to amd_get_highest_perf returns the correct boost
>> value, so the condition tests true and highest_perf always ends up being
>> the nominal value, therefore never having the ability to boost CPU
>> frequency.
>>
>> Since amd_get_highest_perf already returns the boost value we should
>> just eliminate this check.
>>
>> The issue was introduced in v6.1 via commit bedadcfb011f ("cpufreq:
>> amd-pstate: Fix initial highest_perf value"), and exists in stable
>> kernels
>
> "In stable 6.1" kernels.
>
>>
>> In v6.6.51, a large change, commit 1ec40a175a48 ("cpufreq: amd-pstate:
>> Enable amd-pstate preferred core support"), was introduced which
>> significantly refactored the code. This commit cannot be ported back on
>> its own, and would require reviewing and cherry picking at least a few
>> dozen of commits in cpufreq, amd-pstate, ACPI, CPPC.
>>
> I'd just say "this has been fixed in 6.6.y and newer but due to
> refactoring that change isn't feasible to bring back to 6.1.y"
>
>> This means kernels v6.1 up until v6.6.51 are affected by this
>> significant performance issue, and cannot be easily remediated. This
>> patch simplifies the fix to a single commit.
>
> Again no need to say "this patch".
Understood. As I stated this was just for clarity as to why the patch
may be needed or useful.
>
>>
>> ***********************************************************************
>>
>> On 10/28/2024 4:07 PM, Mario Limonciello wrote:
>>> On 10/24/2024 22:23, Yuan, Perry wrote:
>>>> [AMD Official Use Only - AMD Internal Distribution Only]
>>>>
>>>>> -----Original Message-----
>>>>> From: Nabil S. Alramli <dev@...ramli.com>
>>>>> Sent: Friday, October 25, 2024 9:05 AM
>>>>> To: stable@...r.kernel.org
>>>>> Cc: nalramli@...tly.com; jdamato@...tly.com; khubert@...tly.com;
>>>>> Yuan, Perry
>>>>> <Perry.Yuan@....com>; Meng, Li (Jassmine) <Li.Meng@....com>; Huang,
>>>>> Ray
>>>>> <Ray.Huang@....com>; rafael@...nel.org; viresh.kumar@...aro.org;
>>>>> linux-
>>>>> pm@...r.kernel.org; linux-kernel@...r.kernel.org; Nabil S. Alramli
>>>>> <dev@...ramli.com>
>>>>> Subject: [RFC PATCH 6.1.y 0/1] cpufreq: amd-pstate: Enable CPU boost
>>>>> in passive
>>>>> and guided modes
>>>>>
>>>>> Greetings,
>>>>>
>>>>> This is a RFC for a maintenance patch to an issue in the amd_pstate
>>>>> driver where
>>>>> CPU frequency cannot be boosted in passive or guided modes. Without
>>>>> this patch,
>>>>> AMD machines using stable kernels are unable to get their CPU
>>>>> frequency boosted,
>>>>> which is a significant performance issue.
>>>>>
>>>>> For example, on a host that has AMD EPYC 7662 64-Core processor
>>>>> without this
>>>>> patch running at full CPU load:
>>>>>
>>>>> $ for i in $(cat
>>>>> /sys/devices/system/cpu/cpu*/cpufreq/scaling_cur_freq); \
>>>>> do ni=$(echo "scale=1; $i/1000000" | bc -l); echo "$ni GHz";
>>>>> done | \
>>>>> sort | uniq -c
>>>>>
>>>>> 128 2.0 GHz
>>>>>
>>>>> And with this patch:
>>>>>
>>>>> $ for i in $(cat
>>>>> /sys/devices/system/cpu/cpu*/cpufreq/scaling_cur_freq); \
>>>>> do ni=$(echo "scale=1; $i/1000000" | bc -l); echo "$ni GHz";
>>>>> done | \
>>>>> sort | uniq -c
>>>>>
>>>>> 128 3.3 GHz
>>>>>
>>>>> I am not sure what the correct process is for submitting patches
>>>>> which affect only
>>>>> stable trees but not the current code base, and do not apply to the
>>>>> current tree. As
>>>>> such, I am submitting this directly to stable@, but please let me
>>>>> know if I should be
>>>>> submitting this elsewhere.
>>>>>
>>>>> The issue was introduced in v6.1 via commit bedadcfb011f ("cpufreq:
>>>>> amd-pstate: Fix initial highest_perf value"), and exists in stable
>>>>> kernels up until
>>>>> v6.6.51.
>>>>>
>>>>> In v6.6.51, a large change, commit 1ec40a175a48 ("cpufreq: amd-pstate:
>>>>> Enable amd-pstate preferred core support"), was introduced which
>>>>> significantly
>>>>> refactored the code. This commit cannot be ported back on its own,
>>>>> and would
>>>>> require reviewing and cherry picking at least a few dozen of commits
>>>>> in cpufreq,
>>>>> amd-pstate, ACPI, CPPC.
>>>>>
>>>>> This means kernels v6.1 up until v6.6.51 are affected by this
>>>>> significant
>>>>> performance issue, and cannot be easily remediated.
>>>>>
>>>>> Thank you for your attention and I look forward to your response in
>>>>> regards to what
>>>>> the best way to proceed is for getting this important performance fix
>>>>> merged.
>>>>>
>>>>> Best Regards,
>>>>>
>>>>> Nabil S. Alramli (1):
>>>>> cpufreq: amd-pstate: Enable CPU boost in passive and guided modes
>>>>>
>>>>> drivers/cpufreq/amd-pstate.c | 8 ++------
>>>>> 1 file changed, 2 insertions(+), 6 deletions(-)
>>>>>
>>>>> --
>>>>> 2.35.1
>>>>
>>>> Add Mario and Gautham for any help.
>>>>
>>>> Perry.
>>>>
>>>
>>> If doing a patch that is only for 6.1.y then I think that some more of
>>> this information from the cover letter needs to push into the patch
>>> itself.
>>>
>>> But looking over the patch and considering how much we've changed this
>>> in the newer kernels I think it is a sensible localized change for
>>> 6.1.y.
>>>
>>> As this is fixed in 6.6.51 via a more complete backport patch please
>>> only tag 6.1 in your "Cc: stable@...r.kernel.org" from the patch.
>>>
>
Powered by blists - more mailing lists