Message-ID: <001b01da5ea7$86c7a070$9456e150$@telus.net>
Date: Tue, 13 Feb 2024 10:07:36 -0800
From: "Doug Smythies" <dsmythies@...us.net>
To: "'Vincent Guittot'" <vincent.guittot@...aro.org>
Cc: "'Ingo Molnar'" <mingo@...nel.org>,
"'Rafael J. Wysocki'" <rafael@...nel.org>,
<linux-pm@...r.kernel.org>,
<linux-kernel@...r.kernel.org>,
"Doug Smythies" <dsmythies@...us.net>
Subject: RE: sched/cpufreq: Rework schedutil governor performance estimation - Regression bisected
On 2024.02.13 03:27 Vincent wrote:
> On Sun, 11 Feb 2024 at 17:43, Doug Smythies <dsmythies@...us.net> wrote:
>> On 2024.02.11 05:36 Vincent wrote:
>>> On Sat, 10 Feb 2024 at 00:16, Doug Smythies <dsmythies@...us.net> wrote:
>>>> On 2024.02.09.14:11 Vincent wrote:
>>>>> On Fri, 9 Feb 2024 at 22:38, Doug Smythies <dsmythies@...us.net> wrote:
>>>>>>
>>>>>> I noticed a regression in the 6.8rc series kernels. Bisecting the kernel pointed to:
>>>>>>
>>>>>> # first bad commit: [9c0b4bb7f6303c9c4e2e34984c46f5a86478f84d]
>>>>>> sched/cpufreq: Rework schedutil governor performance estimation
>>>>>>
>>>>>> There was previous bisection and suggestion of reversion,
>>>>>> but I guess it wasn't done in the end. [1]
>>>>>
>>>>> This has been fixed with
>>>>> https://lore.kernel.org/all/170539970061.398.16662091173685476681.tip-bot2@tip-bot2/
>>>>
>>>> Okay, thanks. I didn't find that one.
>>>>
>>>>>> The regression: reduced maximum CPU frequency is ignored.
>>
>> Perhaps I should have said "sometimes ignored".
>> With a maximum CPU frequency for all CPUs set to 2.4 GHz and
>> a 100% load on CPU 5, its frequency was sampled 1000 times:
>> 28.6% of samples were 2.4 GHz.
>> 71.4% of samples were 4.8 GHz (the max turbo frequency)
>> The results are highly non-repeatable, for example another sample:
>> 32.8% of samples were 2.4 GHz.
>> 76.2% of samples were 4.8 GHz
>>
>> Another interesting side note: If load is added to the other CPUs,
>> the set maximum CPU frequency is enforced.
>
> Could you trace cpufreq and pstate ? I'd like to understand how
> policy->cur can be changed
> whereas there is this comment in intel_pstate_set_policy():
> /*
> * policy->cur is never updated with the intel_pstate driver, but it
> * is used as a stale frequency value. So, keep it within limits.
> */
>
> but cpufreq_driver_fast_switch() updates it with the freq returned by
> intel_cpufreq_fast_switch()

Perhaps I should submit a patch clarifying that comment.
The comment holds for the "intel_pstate" CPU frequency scaling driver but not for
the "intel_cpufreq" CPU frequency scaling driver, also known as the intel_pstate
driver in passive mode. Sorry for any confusion.

I ran intel_pstate_tracer.py during the test and observed many, but
not all, CPUs requesting pstate 48 when the max is set to 24.
The requests always seem to arrive via the "fast_switch" path.
The root issue here appears to be a limit clamping problem on that path.

I'll try to attach a couple of graphs and screenshots from the tracer data.
I do not know how to trace cpufreq at the same time.

.. Doug
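
For anyone wanting to reproduce the percentages quoted above, here is a minimal
sketch of one way to sample a CPU's frequency and tally the results. It is not
necessarily how the original numbers were gathered. It assumes the standard
cpufreq sysfs files scaling_cur_freq and scaling_max_freq (values in kHz); the
function names, the 100 MHz bucket size, and the 10 ms sampling interval are
hypothetical choices made for illustration.

```python
from collections import Counter
from pathlib import Path
import time

# Standard cpufreq sysfs directory for one CPU; values are reported in kHz.
CPUFREQ_DIR = "/sys/devices/system/cpu/cpu{cpu}/cpufreq"


def read_khz(cpu, name):
    """Read one integer kHz value, e.g. scaling_cur_freq or scaling_max_freq."""
    return int(Path(CPUFREQ_DIR.format(cpu=cpu), name).read_text())


def summarize(samples_khz, limit_khz, bucket_khz=100_000):
    """Return ({bucket_khz: percent_of_samples}, percent_above_limit).

    Samples are rounded to the nearest bucket (an assumed 100 MHz) so that
    small jitter in the reported frequency does not split the histogram.
    """
    if not samples_khz:
        raise ValueError("no samples")
    total = len(samples_khz)
    buckets = Counter(
        (s + bucket_khz // 2) // bucket_khz * bucket_khz for s in samples_khz
    )
    shares = {freq: 100.0 * n / total for freq, n in sorted(buckets.items())}
    above = 100.0 * sum(s > limit_khz for s in samples_khz) / total
    return shares, above


def report(cpu, count=1000, interval_s=0.01):
    """Sample scaling_cur_freq `count` times and print the distribution."""
    limit = read_khz(cpu, "scaling_max_freq")
    samples = []
    for _ in range(count):
        samples.append(read_khz(cpu, "scaling_cur_freq"))
        time.sleep(interval_s)
    shares, above = summarize(samples, limit)
    for freq, pct in shares.items():
        print(f"{pct:5.1f}% of samples were {freq / 1e6:.1f} GHz")
    print(f"{above:5.1f}% of samples exceeded scaling_max_freq")
```

Calling report(5) while CPU 5 is fully loaded prints the share of samples at
each frequency and the share above the configured maximum.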
Download attachment "all_cpu_pstates.png" of type "image/png" (23829 bytes)
Download attachment "cpu5-example.png" of type "image/png" (66432 bytes)