Message-Id: <20240611112413.1241352-1-christian.loehle@arm.com>
Date: Tue, 11 Jun 2024 12:24:10 +0100
From: Christian Loehle <christian.loehle@....com>
To: linux-pm@...r.kernel.org,
linux-kernel@...r.kernel.org,
rafael@...nel.org
Cc: vincent.guittot@...aro.org,
qyousef@...alina.io,
peterz@...radead.org,
daniel.lezcano@...aro.org,
ulf.hansson@...aro.org,
anna-maria@...utronix.de,
kajetan.puchalski@....com,
lukasz.luba@....com,
dietmar.eggemann@....com,
Christian Loehle <christian.loehle@....com>
Subject: [PATCHv2 0/3] cpuidle: teo: Fixing utilization and intercept logic
Hi all,
so my investigation into teo led to the following fixes.
1/3:
As discussed, the utilization threshold is too high: while there are
benefits in certain workloads, there are quite a few regressions, too.
Revert the Util-awareness patch.
This in itself leads to regressions, but part of that can be offset
by the later patches.
See
https://lore.kernel.org/lkml/CAKfTPtA6ZzRR-zMN7sodOW+N_P+GqwNv4tGR+aMB5VXRT2b5bg@mail.gmail.com/
2/3:
Remove the 'recent' intercept logic, see my findings in:
https://lore.kernel.org/lkml/0ce2d536-1125-4df8-9a5b-0d5e389cd8af@arm.com/
I haven't found a way to salvage this properly, so I removed it.
The regular intercept seems to decay fast enough to not need this, but
we could change it if it turns out that we need faster ramp-up and
decay.
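For a rough feel of how quickly the regular intercept metric reacts, here is a minimal sketch of a per-wakeup bin update. It assumes the scheme I believe teo.c uses, where every bin loses 1/8 of its value on each update (a shift by 3) and the matching bin gains a fixed pulse of 1024. Both constants and all helper names are assumptions for illustration only, so check them against the tree you are testing.

```python
# Assumed constants, believed to mirror DECAY_SHIFT and PULSE in teo.c.
DECAY_SHIFT = 3
PULSE = 1024


def update(value, hit):
    """One wakeup: decay the metric, then add a pulse if this bin was hit."""
    value -= value >> DECAY_SHIFT
    if hit:
        value += PULSE
    return value


def saturate(wakeups):
    """Metric value after `wakeups` consecutive hits, starting from zero."""
    value = 0
    for _ in range(wakeups):
        value = update(value, hit=True)
    return value


def steps_to_halve(value):
    """Wakeups without a hit until the metric drops below half its value."""
    target = value // 2
    steps = 0
    while value > 0 and value >= target:
        value = update(value, hit=False)
        steps += 1
    return steps


if __name__ == "__main__":
    full = saturate(1000)
    print("saturated value:", full)
    print("wakeups to fall below half:", steps_to_halve(full))
```

Under these assumptions a saturated bin falls below half its value within a handful of wakeups, which is the sense in which the regular metric already decays quickly.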
3/3:
The rest of the intercept logic had issues, too.
See the commit.
Happy for anyone to take a look and test as well.
Some numbers for context, comparing:
- IO workload (intercept heavy)
- Timer workload, very low utilization (check for deepest state)
- hackbench (high utilization)
- Geekbench 5 on Pixel6 (high utilization)
Tests 1 to 3 are on RK3399 with CONFIG_HZ=100.
target_residencies: 1, 900, 2000
1. IO workload, 5 runs, results sorted, in read IOPS.
fio --minimal --time_based --name=fiotest --filename=/dev/nvme0n1 --runtime=30 --rw=randread --bs=4k --ioengine=psync --iodepth=1 --direct=1 | cut -d \; -f 8;
teo fixed v2:
/dev/nvme0n1
[4599, 4658, 4692, 4694, 4720]
/dev/mmcblk2
[5700, 5730, 5735, 5747, 5977]
/dev/mmcblk1
[2052, 2054, 2066, 2067, 2073]
teo mainline:
/dev/nvme0n1
[3793, 3825, 3846, 3865, 3964]
/dev/mmcblk2
[3831, 4110, 4154, 4203, 4228]
/dev/mmcblk1
[1559, 1564, 1596, 1611, 1618]
menu:
/dev/nvme0n1
[2571, 2630, 2804, 2813, 2917]
/dev/mmcblk2
[4181, 4260, 5062, 5260, 5329]
/dev/mmcblk1
[1567, 1581, 1585, 1603, 1769]
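To compare the governors per device, one option is to look at the median of the five runs. The sketch below does that with the numbers above; the dictionary layout and the helper name median_gain are made up for illustration.

```python
from statistics import median

# Read IOPS per governor and device, copied from the results above.
iops = {
    "teo fixed v2": {
        "/dev/nvme0n1": [4599, 4658, 4692, 4694, 4720],
        "/dev/mmcblk2": [5700, 5730, 5735, 5747, 5977],
        "/dev/mmcblk1": [2052, 2054, 2066, 2067, 2073],
    },
    "teo mainline": {
        "/dev/nvme0n1": [3793, 3825, 3846, 3865, 3964],
        "/dev/mmcblk2": [3831, 4110, 4154, 4203, 4228],
        "/dev/mmcblk1": [1559, 1564, 1596, 1611, 1618],
    },
    "menu": {
        "/dev/nvme0n1": [2571, 2630, 2804, 2813, 2917],
        "/dev/mmcblk2": [4181, 4260, 5062, 5260, 5329],
        "/dev/mmcblk1": [1567, 1581, 1585, 1603, 1769],
    },
}


def median_gain(new, base, device):
    """Relative change in median IOPS of `new` over `base`, in percent."""
    m_new = median(iops[new][device])
    m_base = median(iops[base][device])
    return 100.0 * (m_new - m_base) / m_base


if __name__ == "__main__":
    for device in iops["teo fixed v2"]:
        for base in ("teo mainline", "menu"):
            gain = median_gain("teo fixed v2", base, device)
            print(f"{device}: teo fixed v2 vs {base}: {gain:+.1f}%")
```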
2. Timer workload (through IO for my convenience 😉 )
Results in read IOPS, fio same as above.
echo "0 2097152 zero" | dmsetup create dm-zeros
echo "0 2097152 delay /dev/mapper/dm-zeros 0 50" | dmsetup create dm-slow
(Each IO is delayed by a timer of 50ms, so the CPU should mostly be in state2; 5s total)
teo fixed v2:
idle_state time
2.0 4.807025
-1.0 0.219766
0.0 0.072007
1.0 0.169570
3199 cpu_idle total
38 cpu_idle_miss
31 cpu_idle_miss above
7 cpu_idle_miss below
teo mainline:
idle_state time
1.0 4.897942
-1.0 0.095375
0.0 0.253581
3221 cpu_idle total
1269 cpu_idle_miss
22 cpu_idle_miss above
1247 cpu_idle_miss below
menu:
idle_state time
2.0 4.295546
-1.0 0.234164
1.0 0.356344
0.0 0.401507
3421 cpu_idle total
129 cpu_idle_miss
52 cpu_idle_miss above
77 cpu_idle_miss below
Residencies:
teo mainline isn't in state2 at all. teo fixed spends more time in state2
than menu, but both are in state2 the vast majority of the time, as expected.
tldr: overall teo fixed spends more time in state2 while having
fewer idle_miss than menu.
teo mainline was just way too aggressive at selecting shallow states.
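The comparison above can be put into percentages with a small script. This is only a sketch over the numbers in the tables; it treats the -1.0 row as time not spent in any idle state (an assumption about the trace output), and the names runs, miss_rate and state_share are made up for illustration.

```python
# Per-governor results of the timer workload, copied from the tables above.
# "time" maps idle state -> seconds; -1 is assumed to mean "not idle".
runs = {
    "teo fixed v2": {
        "time": {2: 4.807025, -1: 0.219766, 0: 0.072007, 1: 0.169570},
        "total": 3199,
        "miss": 38,
    },
    "teo mainline": {
        "time": {1: 4.897942, -1: 0.095375, 0: 0.253581},
        "total": 3221,
        "miss": 1269,
    },
    "menu": {
        "time": {2: 4.295546, -1: 0.234164, 1: 0.356344, 0: 0.401507},
        "total": 3421,
        "miss": 129,
    },
}


def miss_rate(run):
    """cpu_idle_miss events as a percentage of all cpu_idle events."""
    return 100.0 * run["miss"] / run["total"]


def state_share(run, state):
    """Percentage of the recorded time spent in the given idle state."""
    times = run["time"]
    return 100.0 * times.get(state, 0.0) / sum(times.values())


if __name__ == "__main__":
    for name, run in runs.items():
        print(f"{name}: state2 {state_share(run, 2):.1f}% of time, "
              f"idle_miss {miss_rate(run):.2f}%")
```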
3. Hackbench, 5 runs
for i in $(seq 0 4); do hackbench -l 100 -g 100 ; sleep 1; done
teo fixed v2:
Time: 4.937
Time: 4.898
Time: 4.871
Time: 4.833
Time: 4.898
teo mainline:
Time: 4.945
Time: 5.021
Time: 4.927
Time: 4.923
Time: 5.137
menu:
Time: 4.964
Time: 4.847
Time: 4.914
Time: 4.841
Time: 4.800
tldr: all comparable, teo mainline slightly worse
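To check how close the three really are, mean and spread of the five runs can be computed as below. This is a sketch using only the times listed above; the variable and function names are made up for illustration.

```python
from statistics import mean, stdev

# hackbench "Time:" values in seconds, copied from the runs above.
times = {
    "teo fixed v2": [4.937, 4.898, 4.871, 4.833, 4.898],
    "teo mainline": [4.945, 5.021, 4.927, 4.923, 5.137],
    "menu": [4.964, 4.847, 4.914, 4.841, 4.800],
}


def summary(name):
    """Return (mean, sample standard deviation) of the runs for a governor."""
    runs = times[name]
    return mean(runs), stdev(runs)


if __name__ == "__main__":
    for name in times:
        avg, spread = summary(name)
        print(f"{name}: mean {avg:.4f}s, stdev {spread:.4f}s")
```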
4. Geekbench 5 (multi-core) on Pixel6
(Device is cooled for each iteration separately)
teo mainline:
3113, 3068, 3079
mean 3086.66
teo revert util-awareness:
2947, 2941, 2952
mean 2946.66 (-4.54%)
teo fixed v2:
3032, 3066, 3019
mean 3039 (-1.54%)
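The means and the percentages relative to teo mainline can be recomputed from the raw scores with a few lines. A minimal sketch, with made-up names (scores, relative_to_baseline):

```python
from statistics import mean

# Geekbench 5 multi-core scores on Pixel6, copied from the results above.
scores = {
    "teo mainline": [3113, 3068, 3079],
    "teo revert util-awareness": [2947, 2941, 2952],
    "teo fixed v2": [3032, 3066, 3019],
}


def relative_to_baseline(name, baseline="teo mainline"):
    """Difference of the mean score to the baseline mean, in percent."""
    base = mean(scores[baseline])
    return 100.0 * (mean(scores[name]) - base) / base


if __name__ == "__main__":
    for name in scores:
        print(f"{name}: mean {mean(scores[name]):.2f} "
              f"({relative_to_baseline(name):+.2f}%)")
```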
Changes since v1:
- Removed all non-fixes.
- Do a full revert of Util-awareness instead of increasing thresholds.
- Address Dietmar's comments.
https://lore.kernel.org/linux-kernel/20240606090050.327614-2-christian.loehle@arm.com/T/
Kind Regards,
Christian
Christian Loehle (3):
Revert: "cpuidle: teo: Introduce util-awareness"
cpuidle: teo: Remove recent intercepts metric
cpuidle: teo: Don't count non-existent intercepts
drivers/cpuidle/governors/teo.c | 189 +++++---------------------------
1 file changed, 27 insertions(+), 162 deletions(-)
--
2.34.1