linux-kernel - [PATCHv3 0/3] cpuidle: teo: Fixing utilization and intercept logic

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [thread-next>] [day] [month] [year] [list]

Message-Id: <20240628095955.34096-1-christian.loehle@arm.com>
Date: Fri, 28 Jun 2024 10:59:52 +0100
From: Christian Loehle <christian.loehle@....com>
To: linux-pm@...r.kernel.org,
	linux-kernel@...r.kernel.org,
	rafael@...nel.org
Cc: vincent.guittot@...aro.org,
	qyousef@...alina.io,
	peterz@...radead.org,
	daniel.lezcano@...aro.org,
	ulf.hansson@...aro.org,
	anna-maria@...utronix.de,
	dsmythies@...us.net,
	kajetan.puchalski@....com,
	lukasz.luba@....com,
	dietmar.eggemann@....com,
	Christian Loehle <christian.loehle@....com>
Subject: [PATCHv3 0/3] cpuidle: teo: Fixing utilization and intercept logic

Hi all,
so my investigation into teo lead to the following fixes.

1/3:
As discussed the utilization threshold is too high while
there are benefits in certain workloads, there are quite a few
regressions, too. Revert the Util-awareness patch.
This in itself leads to regressions, but part of it can be offset
by the later patches.
See
https://lore.kernel.org/lkml/CAKfTPtA6ZzRR-zMN7sodOW+N_P+GqwNv4tGR+aMB5VXRT2b5bg@mail.gmail.com/
2/3:
Remove the 'recent' intercept logic, see my findings in:
https://lore.kernel.org/lkml/0ce2d536-1125-4df8-9a5b-0d5e389cd8af@arm.com/
I haven't found a way to salvage this properly, so I removed it.
The regular intercept seems to decay fast enough to not need this, but
we could change it if that turns out that we need this to be faster in
ramp-up and decaying.
3/3:
The rest of the intercept logic had issues, too.
See the commit.

Happy for anyone to take a look and test as well.

Some numbers for context, comparing:
- IO workload (intercept heavy).
- Timer workload very low utilization (check for deepest state)
- hackbench (high utilization)
- Geekbench 5 on Pixel6 (high utilization)
Tests 1 to 3 are on RK3399 with CONFIG_HZ=100.
target_residencies: 1, 900, 2000

1. IO workload, 5 runs, results sorted, in read IOPS.
fio --minimal --time_based --name=fiotest --filename=/dev/nvme0n1 --runtime=30 --rw=randread --bs=4k --ioengine=psync --iodepth=1 --direct=1 | cut -d \; -f 8;

teo fixed v2:
/dev/nvme0n1
[4599, 4658, 4692, 4694, 4720]
/dev/mmcblk2
[5700, 5730, 5735, 5747, 5977]
/dev/mmcblk1
[2052, 2054, 2066, 2067, 2073]

teo mainline:
/dev/nvme0n1
[3793, 3825, 3846, 3865, 3964]
/dev/mmcblk2
[3831, 4110, 4154, 4203, 4228]
/dev/mmcblk1
[1559, 1564, 1596, 1611, 1618]

menu:
/dev/nvme0n1
[2571, 2630, 2804, 2813, 2917]
/dev/mmcblk2
[4181, 4260, 5062, 5260, 5329]
/dev/mmcblk1
[1567, 1581, 1585, 1603, 1769]

2. Timer workload (through IO for my convenience 😉 )
Results in read IOPS, fio same as above.
echo "0 2097152 zero" | dmsetup create dm-zeros
echo "0 2097152 delay /dev/mapper/dm-zeros 0 50" | dmsetup create dm-slow
(Each IO is delayed by timer of 50ms, should be mostly in state2, for 5s total)

teo fixed v2:
idle_state time
2.0 	4.807025
-1.0 	0.219766
0.0 	0.072007
1.0 	0.169570

3199 cpu_idle total
38 cpu_idle_miss
31 cpu_idle_miss above
7 cpu_idle_miss below

teo mainline:
idle_state time
1.0 	4.897942
-1.0 	0.095375
0.0 	0.253581

3221 cpu_idle total
1269 cpu_idle_miss
22 cpu_idle_miss above
1247 cpu_idle_miss below

menu:
idle_state time
2.0 	4.295546
-1.0 	0.234164
1.0 	0.356344
0.0 	0.401507

3421 cpu_idle total
129 cpu_idle_miss
52 cpu_idle_miss above
77 cpu_idle_miss below

Residencies:
teo mainline isn't in state2 at all, teo fixed is more in state2 than menu, but
both are in state2 the vast majority of the time as expected.

tldr: overall teo fixed spends more time in state2 while having
fewer idle_miss than menu.
teo mainline was just way too aggressive at selecting shallow states.

3. Hackbench, 5 runs
for i in $(seq 0 4); do hackbench -l 100 -g 100 ; sleep 1; done

teo fixed v2:
Time: 4.937
Time: 4.898
Time: 4.871
Time: 4.833
Time: 4.898

teo mainline:
Time: 4.945
Time: 5.021
Time: 4.927
Time: 4.923
Time: 5.137

menu:
Time: 4.964
Time: 4.847
Time: 4.914
Time: 4.841
Time: 4.800

tldr: all comparable, teo mainline slightly worse

4. Geekbench 5 (multi-core) on Pixel6
(Device is cooled for each iteration separately)
teo mainline:
3113, 3068, 3079
mean 3086.66

teo revert util-awareness:
2947, 2941, 2952
mean 2946.66 (-4.54%)

teo fixed v2:
3032, 3066, 3019
mean 3039 (-1.54%)

Changes since v2:
- Reworded commits according to Dietmar's comments
- Dropped the KTIME_MAX as hit part from 3/3 according to Dietmar's
remark.

Changes since v1:
- Removed all non-fixes.
- Do a full revert of Util-awareness instead of increasing thresholds.
- Address Dietmar's comments.
https://lore.kernel.org/linux-kernel/20240606090050.327614-2-christian.loehle@arm.com/T/

Kind Regards,
Christian

Christian Loehle (3):
  Revert: "cpuidle: teo: Introduce util-awareness"
  cpuidle: teo: Remove recent intercepts metric
  cpuidle: teo: Don't count non-existent intercepts

 drivers/cpuidle/governors/teo.c | 189 +++++---------------------------
 1 file changed, 27 insertions(+), 162 deletions(-)

--
2.34.1