linux-kernel - Re: stable 6.6: commit "sched/cpufreq: Rework schedutil governor performance estimation' causes a regression

Message-ID: <CAKchOA31NGBWMdeSjky7MwOjU=dYmHVLbE7uUQHUXSZOzUHUeA@mail.gmail.com>
Date: Tue, 25 Nov 2025 21:01:30 +0800
From: Yu-Che Cheng <giver@...omium.org>
To: Lukasz Luba <lukasz.luba@....com>
Cc: Sergey Senozhatsky <senozhatsky@...omium.org>, Vincent Guittot <vincent.guittot@...aro.org>, 
	Christian Loehle <christian.loehle@....com>, "Rafael J. Wysocki" <rafael@...nel.org>, 
	Viresh Kumar <viresh.kumar@...aro.org>, Greg Kroah-Hartman <gregkh@...uxfoundation.org>, 
	Tomasz Figa <tfiga@...omium.org>, stable@...r.kernel.org, linux-pm@...r.kernel.org, 
	linux-kernel@...r.kernel.org, Dietmar Eggemann <dietmar.eggemann@....com>
Subject: Re: stable 6.6: commit "sched/cpufreq: Rework schedutil governor
 performance estimation' causes a regression

Hi Lukasz,

On Tue, Nov 25, 2025 at 5:45 PM Lukasz Luba <lukasz.luba@....com> wrote:
>
> Hi Sergey,
>
> On 11/21/25 03:55, Sergey Senozhatsky wrote:
> > Hi Christian,
> >
> > On (25/11/20 10:15), Christian Loehle wrote:
> >> On 11/20/25 04:45, Sergey Senozhatsky wrote:
> >>> Hi,
> >>>
> >>> We are observing a performance regression on one of our arm64 boards.
> >>> We tracked it down to the linux-6.6.y commit ada8d7fa0ad4 ("sched/cpufreq:
> >>> Rework schedutil governor performance estimation").
> >>>
> >>> UI speedometer benchmark:
> >>> w/commit:   395  +/-38
> >>> w/o commit: 439  +/-14
> >>>
> >>
> >> Hi Sergey,
> >> Would be nice to get some details. What board?
> >
> > It's an MT8196 chromebook.
> >
> >> What do the OPPs look like?
> >
> > How do I find that out?
> >
> >> Does this system use uclamp during the benchmark? How?
> >
> > How do I find that out?
> >
> >> Given how large the stddev given by speedometer (version 3?) itself is, can we get the
> >> stats of a few runs?
> >
> > v2.1
> >
> > w/o patch     w/ patch
> > 440 +/-30     406 +/-11
> > 440 +/-14     413 +/-16
> > 444 +/-12     403 +/-14
> > 442 +/-12     412 +/-15
> >
> >> Maybe traces of cpu_frequency for both w/ and w/o?
> >
> > trace-cmd record -e power:cpu_frequency attached.
> >
> > "base" is with ada8d7fa0ad4
> > "revert" is ada8d7fa0ad4 reverted.
>
>
> I did some analysis based on your trace files.
> I have been playing some time ago with speedometer performance
> issues so that's why I'm curious about your report here.
>
> I've filtered your trace purely based on cpu7 (the single biggest cpu).
> Then I have cut the data from the 'warm-up' phase in both traces, to
> have similar start point (I think).
>
> It looks like the 2 traces can show similar 'pattern' of that benchmark
> which is good for analysis. If you align the timestamp:
> 176.051s and 972.465s then both plots (frequency changes in time) look
> similar.
>
> There are some differences, though:
> 1. there are more dips in the freq in time, so more often you would
>     pay extra penalty for the ramp-up again
> 2. some of the ramp-up phases are a bit longer ~100ms instead of ~80ms
>     going from 2GHz to 3.6GHz

Agreed. In the Perfetto visualization of the frequency changes, it is
more obvious that the ramp-up from 2GHz to 3.6GHz becomes much slower
and somewhat unstable in v6.6.99, and that the frequency drops to a low
level more easily after a short idle.
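
For anyone who wants to quantify the ramp-up times from the raw trace
rather than reading them off the plot, here is a minimal sketch. It
assumes the text output of `trace-cmd report` with cpu_frequency lines
in the form "<timestamp>: cpu_frequency: state=<kHz> cpu_id=<n>". The
helper names, the thresholds and the sample lines (including the PID)
are made up for illustration and are not taken from the attached traces.

```python
import re

# Assumed shape of a `trace-cmd report` line for power:cpu_frequency, e.g.
#   <idle>-0  [007]  100.000000: cpu_frequency:  state=2000000 cpu_id=7
EVENT_RE = re.compile(
    r"\s(?P<ts>\d+\.\d+):\s+cpu_frequency:\s+"
    r"state=(?P<khz>\d+)\s+cpu_id=(?P<cpu>\d+)"
)


def parse_cpu_frequency(lines, cpu):
    """Return [(timestamp_s, freq_khz), ...] for one CPU, in trace order."""
    events = []
    for line in lines:
        m = EVENT_RE.search(line)
        if m and int(m.group("cpu")) == cpu:
            events.append((float(m.group("ts")), int(m.group("khz"))))
    return events


def ramp_up_durations(events, low_khz, high_khz):
    """Seconds from the last sample at or below low_khz to the first
    later sample at or above high_khz, for every such ramp-up."""
    durations = []
    start = None
    for ts, khz in events:
        if khz <= low_khz:
            start = ts  # restart the clock at every low sample
        elif khz >= high_khz and start is not None:
            durations.append(ts - start)
            start = None
    return durations


# Hypothetical sample lines, for illustration only.
sample = """\
 <idle>-0     [007]   100.000000: cpu_frequency:   state=2000000 cpu_id=7
 <idle>-0     [004]   100.010000: cpu_frequency:   state=1800000 cpu_id=4
 CrRendererMain-1234  [007]   100.040000: cpu_frequency:   state=2800000 cpu_id=7
 CrRendererMain-1234  [007]   100.080000: cpu_frequency:   state=3600000 cpu_id=7
""".splitlines()

events = parse_cpu_frequency(sample, cpu=7)
ramps = ramp_up_durations(events, low_khz=2_000_000, high_khz=3_600_000)
print(events)
print([round(d, 3) for d in ramps])
```

Running the same two helpers over the "base" and "revert" reports would
give a list of ramp-up durations per trace that can be compared
directly. Note that idle periods are not visible in these events.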

> 3.
>
>
> There are idle phases missing in the trace, so we have to be careful
> when e.g. comparing avg frequency, because that might not be the real
> indication of the delivered computation and not indicate the gap in the
> score.
>
> Here are the stats:
> 1. revert:
>            frequency
> count  1.318000e+03
> mean   2.932240e+06
> std    5.434045e+05
> min    2.000000e+06
> 50%    3.000000e+06
> 85%    3.600000e+06
> 90%    3.626000e+06
> 95%    3.626000e+06
> 99%    3.626000e+06
> max    3.626000e+06
>
> 2. base:
>            frequency
> count  1.551000e+03
> mean   2.809391e+06
> std    5.369750e+05
> min    2.000000e+06
> 50%    2.800000e+06
> 85%    3.500000e+06
> 90%    3.600000e+06
> 95%    3.626000e+06
> 99%    3.626000e+06
> max    3.626000e+06
>
>
> A better indication in this case would be comparison of the frequency
> residency in time, especially for the max freq:
> 1. revert: 11.92s
> 2. base: 9.11s
>
> So there is 2.8s longer residency for that fmax (while we even have
> longer period for finishing that Speedometer 2 test on 'base').
>
> Here is some detail about that run*:
> +---------------+---------------------+---------------+----------------+
> | Trace         | Total Trace         | Time at Max   | % of Total     |
> |               | Duration (s)        | Freq (s)      | Time           |
> +---------------+---------------------+---------------+----------------+
> | Base Trace    | 24.72               | 9.11          | 36.9%          |
> | Revert Trace  | 22.88               | 11.92         | 52.1%          |
> +---------------+---------------------+---------------+----------------+
>
> *We don't know the idle periods which might happen for those frequencies
>
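
The residency figures above can be reproduced along these lines. This is
only a sketch: it assumes a time-sorted list of (timestamp, frequency)
events for a single CPU, and that each frequency holds until the next
event, which (as noted in the footnote) ignores idle periods. The
function name and the toy event list are hypothetical. The last loop
only cross-checks the percentages from the table.

```python
def residency_at_max(events, end_ts):
    """events: [(timestamp_s, freq_khz), ...] sorted by time, one CPU.

    Assumes each frequency stays in effect until the next event, and the
    last one until end_ts. Returns (total_s, at_max_s, percent_at_max).
    """
    if not events:
        return 0.0, 0.0, 0.0
    fmax = max(khz for _, khz in events)
    total = end_ts - events[0][0]
    # End of each interval: the next event's timestamp, or end_ts.
    boundaries = [ts for ts, _ in events[1:]] + [end_ts]
    at_max = 0.0
    for (ts, khz), nxt in zip(events, boundaries):
        if khz == fmax:
            at_max += nxt - ts
    share = 100.0 * at_max / total if total > 0 else 0.0
    return total, at_max, share


# Toy event list (not from the real traces): 2 s out of 4 s at fmax.
toy = [(0.0, 2_000_000), (1.0, 3_626_000), (3.0, 2_000_000)]
toy_result = residency_at_max(toy, end_ts=4.0)
print(toy_result)

# Cross-check of the table: time at max freq / total trace duration.
table = {"base": (24.72, 9.11), "revert": (22.88, 11.92)}
shares = {name: round(100.0 * at_max / total, 1)
          for name, (total, at_max) in table.items()}
print(shares)
```

The cross-check gives 36.9% for base and 52.1% for revert, matching the
table, and 11.92s - 9.11s = 2.81s matches the "2.8s longer" figure.
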
>
> I wonder if you had a fix patch for the util_est in your kernel...
> That fix has been recently backported to 6.6 stable [1].
>
> You might want to try that patch as well, w/ or w/o this revert.
> IMHO it might be worth to have it on top. It might help
> the main Chrome task ('CrRendererMain') to stay longer on the biggest
> cpu, since the util_est would be higher. You can read the discussion
> that I had back then with PeterZ and VincentG [2].

No, the util_est fix isn't in our kernel yet.
After cherry-picking the fix without the revert, the Speedometer 2.0
score becomes even slightly higher than on v6.6.88 (450 ~ 460 vs
435 ~ 440).
With both the fix and the revert, the Speedometer score rises to about
475 ~ 480, which is almost the same as with the performance governor
(i.e. pinned at the maximum frequency).
It looks like tasks that originally ran on the little cores are now
migrated to the middle and big cores more often, which also makes CPU7
more likely to stay at a higher frequency across short idle periods in
the main thread.

I've also attached the Perfetto traces for both:

fix without revert:
https://ui.perfetto.dev/#!/?s=ff4d10bd58982555eada61648786adf6f7187ac3
fix with revert:
https://ui.perfetto.dev/#!/?s=05da3cedfb3851ad694f523ef59d3cd1092d74ae

>
> Regards,
> Lukasz
>
> [1]
> https://lore.kernel.org/stable/20251121130232.828187990@linuxfoundation.org/
> [2]
> https://lore.kernel.org/lkml/20230912142821.GA22166@noisy.programming.kicks-ass.net/

Best regards,
Yu-Che