Message-ID: <CAKchOA3Kxc6M+nih7sEinOZnMX3qO60hF+jWHCfW5PZh10F-hA@mail.gmail.com>
Date: Fri, 21 Nov 2025 23:53:43 +0800
From: Yu-Che Cheng <giver@...omium.org>
To: Vincent Guittot <vincent.guittot@...aro.org>
Cc: Sergey Senozhatsky <senozhatsky@...omium.org>, Christian Loehle <christian.loehle@....com>, 
	"Rafael J. Wysocki" <rafael@...nel.org>, Viresh Kumar <viresh.kumar@...aro.org>, 
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>, Tomasz Figa <tfiga@...omium.org>, stable@...r.kernel.org, 
	linux-pm@...r.kernel.org, linux-kernel@...r.kernel.org, 
	Lukasz Luba <lukasz.luba@....com>, Dietmar Eggemann <dietmar.eggemann@....com>
Subject: Re: stable 6.6: commit "sched/cpufreq: Rework schedutil governor
 performance estimation' causes a regression

Sorry, I accidentally sent the previous email in non-plain-text mode.
Resending it to the mailing list.

Hi Vincent,

On Fri, Nov 21, 2025 at 10:00 PM Vincent Guittot
<vincent.guittot@...aro.org> wrote:
>
> On Fri, 21 Nov 2025 at 04:55, Sergey Senozhatsky
> <senozhatsky@...omium.org> wrote:
> >
> > Hi Christian,
> >
> > On (25/11/20 10:15), Christian Loehle wrote:
> > > On 11/20/25 04:45, Sergey Senozhatsky wrote:
> > > > Hi,
> > > >
> > > > We are observing a performance regression on one of our arm64 boards.
> > > > We tracked it down to the linux-6.6.y commit ada8d7fa0ad4 ("sched/cpufreq:
>
> You mentioned that you tracked down to linux-6.6.y but which kernel
> are you using ?
>

We're using the ChromeOS 6.6 kernel, which is currently based on linux-v6.6.99.
We've also verified that the performance regression reproduces with
exactly the same scheduler code (`kernel/sched`) as upstream v6.6.99,
compared to the scheduler code from v6.6.88.

> > > > Rework schedutil governor performance estimation").
> > > >
> > > > UI speedometer benchmark:
> > > > w/commit:   395  +/-38
> > > > w/o commit: 439  +/-14
> > > >
> > >
> > > Hi Sergey,
> > > Would be nice to get some details. What board?
> >
> > It's an MT8196 chromebook.
> >
> > > What do the OPPs look like?
> >
> > How do I find that out?
>
> In /sys/kernel/debug/opp/cpu*/
> or
> /sys/devices/system/cpu/cpufreq/policy*/scaling_available_frequencies
> with related_cpus
>

The energy model on the device is:

CPU0-3:
+------------+------------+
| freq (khz) | power (uw) |
+============+============+
|     339000 |      34362 |
|     400000 |      42099 |
|     500000 |      52907 |
|     600000 |      63795 |
|     700000 |      74747 |
|     800000 |      88445 |
|     900000 |     101444 |
|    1000000 |     120377 |
|    1100000 |     136859 |
|    1200000 |     154162 |
|    1300000 |     174843 |
|    1400000 |     196833 |
|    1500000 |     217052 |
|    1600000 |     247844 |
|    1700000 |     281464 |
|    1800000 |     321764 |
|    1900000 |     352114 |
|    2000000 |     383791 |
|    2100000 |     421809 |
|    2200000 |     461767 |
|    2300000 |     503648 |
|    2400000 |     540731 |
+------------+------------+

CPU4-6:
+------------+------------+
| freq (khz) | power (uw) |
+============+============+
|     622000 |     131738 |
|     700000 |     147102 |
|     800000 |     172219 |
|     900000 |     205455 |
|    1000000 |     233632 |
|    1100000 |     254313 |
|    1200000 |     288843 |
|    1300000 |     330863 |
|    1400000 |     358947 |
|    1500000 |     400589 |
|    1600000 |     444247 |
|    1700000 |     497941 |
|    1800000 |     539959 |
|    1900000 |     584011 |
|    2000000 |     657172 |
|    2100000 |     746489 |
|    2200000 |     822854 |
|    2300000 |     904913 |
|    2400000 |    1006581 |
|    2500000 |    1115458 |
|    2600000 |    1205167 |
|    2700000 |    1330751 |
|    2800000 |    1450661 |
|    2900000 |    1596740 |
|    3000000 |    1736568 |
|    3100000 |    1887001 |
|    3200000 |    2048877 |
|    3300000 |    2201141 |
+------------+------------+

CPU7:
+------------+------------+
| freq (khz) | power (uw) |
+============+============+
|     798000 |     320028 |
|     900000 |     330714 |
|    1000000 |     358108 |
|    1100000 |     384730 |
|    1200000 |     410669 |
|    1300000 |     438355 |
|    1400000 |     469865 |
|    1500000 |     502740 |
|    1600000 |     531645 |
|    1700000 |     560380 |
|    1800000 |     588902 |
|    1900000 |     617278 |
|    2000000 |     645584 |
|    2100000 |     698653 |
|    2200000 |     744179 |
|    2300000 |     810471 |
|    2400000 |     895816 |
|    2500000 |     985234 |
|    2600000 |    1097802 |
|    2700000 |    1201162 |
|    2800000 |    1332076 |
|    2900000 |    1439847 |
|    3000000 |    1575917 |
|    3100000 |    1741987 |
|    3200000 |    1877346 |
|    3300000 |    2161512 |
|    3400000 |    2437879 |
|    3500000 |    2933742 |
|    3600000 |    3322959 |
|    3626000 |    3486345 |
+------------+------------+
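For anyone who wants to collect the same per-cluster frequency lists on
another board, here is a minimal Python sketch that reads the cpufreq sysfs
files Vincent mentioned (`scaling_available_frequencies` and `related_cpus`).
The function name `read_policies` and its `root` parameter are made up for
illustration, and policies whose driver does not expose a frequency table are
simply skipped. It does not read the power column, which came from the energy
model.

```python
from pathlib import Path


def read_policies(root="/sys/devices/system/cpu/cpufreq"):
    """Return {policy_name: {"cpus": [...], "freqs_khz": [...]}}.

    `root` is a parameter only so the function can be pointed at a copy
    of the sysfs tree; the default is the usual cpufreq location.
    """
    policies = {}
    for policy in sorted(Path(root).glob("policy*")):
        cpus_file = policy / "related_cpus"
        freqs_file = policy / "scaling_available_frequencies"
        # Not every cpufreq driver exposes a frequency table, so skip
        # policies where either file is missing.
        if not cpus_file.is_file() or not freqs_file.is_file():
            continue
        cpus = [int(c) for c in cpus_file.read_text().split()]
        freqs = sorted(int(f) for f in freqs_file.read_text().split())
        policies[policy.name] = {"cpus": cpus, "freqs_khz": freqs}
    return policies
```

Calling `read_policies()` on the device and printing the result should give
one entry per cluster (three on this board, if the policies match the tables
above).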

> >
> > > Does this system use uclamp during the benchmark? How?
> >
> > How do I find that out?
>
> it can be set per cgroup
> /sys/fs/cgroup/system.slice/<name>/cpu.uclamp.min|max
> or per task with sched_setattr()
>
> You most probably use it because it's the main reason for ada8d7fa0ad4
> to remove wrong overestimate of OPP
>

For the speedometer case, yes: we set uclamp.min to 20 for the
whole browser and UI (chrome).
There are no system-wide uclamp settings, though.
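To double-check which cgroups carry a uclamp setting, something like the
following Python sketch can walk the cgroup tree and list every group whose
`cpu.uclamp.min` or `cpu.uclamp.max` differs from "no clamp". The helper names
(`parse_uclamp`, `find_uclamp_groups`) are hypothetical, and it assumes the
files hold either a percentage or the string `max`, as in the cgroup
interface. It does not cover per-task values set with sched_setattr().

```python
import os


def parse_uclamp(text):
    """Convert a cpu.uclamp.{min,max} value to a percentage (float)."""
    text = text.strip()
    return 100.0 if text == "max" else float(text)


def find_uclamp_groups(root="/sys/fs/cgroup"):
    """Return {relative_cgroup_path: (min_pct, max_pct)} for clamped groups."""
    found = {}
    for dirpath, _dirs, files in os.walk(root):
        # Only look at groups that expose both uclamp files.
        if "cpu.uclamp.min" not in files or "cpu.uclamp.max" not in files:
            continue
        with open(os.path.join(dirpath, "cpu.uclamp.min")) as f:
            lo = parse_uclamp(f.read())
        with open(os.path.join(dirpath, "cpu.uclamp.max")) as f:
            hi = parse_uclamp(f.read())
        # Report only groups that actually raise the floor or lower the cap.
        if lo > 0.0 or hi < 100.0:
            found[os.path.relpath(dirpath, root)] = (lo, hi)
    return found
```

The `root` argument is there so the walk can start at a different mount
point if the cpu controller is not under /sys/fs/cgroup.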

We also found other performance regressions in an Android guest VM,
where no uclamp is applied to the VM and vCPU processes from the
host side.
In particular, RAR extraction throughput in the RAR app (from RARLAB)
drops by about 20%.
However, it's hard to tell whether this is a side effect of the UI
regression, since the UI is running at the same time.

> >
> > > Given how large the stddev given by speedometer (version 3?) itself is, can we get the
> > > stats of a few runs?

By the way, it's speedometer version 2.0 (or 2.1).

> >
> > v2.1
> >
> > w/o patch     w/ patch
> > 440 +/-30     406 +/-11
> > 440 +/-14     413 +/-16
> > 444 +/-12     403 +/-14
> > 442 +/-12     412 +/-15
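To put one number on the four runs quoted above, a quick Python sketch that
averages the reported scores and computes the relative drop (the per-run
+/- values are ignored here, and the variable and function names are just
illustrative):

```python
from statistics import mean

# Speedometer 2.1 scores copied from the four runs quoted above.
without_patch = [440, 440, 444, 442]
with_patch = [406, 413, 403, 412]


def relative_drop(baseline, candidate):
    """Percentage by which mean(candidate) is below mean(baseline)."""
    base = mean(baseline)
    return (base - mean(candidate)) / base * 100.0


print(f"w/o patch mean: {mean(without_patch):.1f}")
print(f"w/  patch mean: {mean(with_patch):.1f}")
print(f"relative drop:  {relative_drop(without_patch, with_patch):.1f}%")
```

That works out to a mean of 441.5 without the patch versus 408.5 with it,
i.e. a drop of roughly 7.5%.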
> >
> > > Maybe traces of cpu_frequency for both w/ and w/o?
> >
> > trace-cmd record -e power:cpu_frequency attached.
> >
> > "base" is with ada8d7fa0ad4
> > "revert" is ada8d7fa0ad4 reverted.
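
For comparing the "base" and "revert" traces, a rough Python sketch that
turns cpu_frequency events into time spent at each frequency per CPU. It
assumes the text output of `trace-cmd report` contains lines of the form
`<timestamp>: cpu_frequency: state=<kHz> cpu_id=<cpu>`; the regular
expression and the `residency` function name are assumptions for
illustration, not part of any tool. The time after the last event on each CPU
is not counted, since the trace end is unknown to the parser.

```python
import re
from collections import defaultdict

# Assumed line shape: "... 100.500000: cpu_frequency: state=2000000 cpu_id=0"
EVENT = re.compile(r"(\d+\.\d+):\s+cpu_frequency:\s+state=(\d+)\s+cpu_id=(\d+)")


def residency(lines):
    """Return {cpu: {freq_khz: seconds}} from cpu_frequency event lines."""
    last = {}  # cpu -> (timestamp, freq) of the most recent event
    out = defaultdict(lambda: defaultdict(float))
    for line in lines:
        m = EVENT.search(line)
        if not m:
            continue  # ignore headers and unrelated events
        ts, freq, cpu = float(m.group(1)), int(m.group(2)), int(m.group(3))
        if cpu in last:
            prev_ts, prev_freq = last[cpu]
            # The previous frequency was in effect until this event.
            out[cpu][prev_freq] += ts - prev_ts
        last[cpu] = (ts, freq)
    return {cpu: dict(freqs) for cpu, freqs in out.items()}
```

Feeding it the report text of each trace (for example `residency(open(path))`
on a saved report) and diffing the two results should show whether the
"base" kernel spends more time at lower frequencies.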
