Message-Id: <20250327152752.3677034-1-pierre.gondois@arm.com>
Date: Thu, 27 Mar 2025 16:27:51 +0100
From: Pierre Gondois <pierre.gondois@....com>
To: linux-kernel@...r.kernel.org
Cc: Lukasz.Luba@....com,
Christian Loehle <christian.loehle@....com>,
Hongyan Xia <hongyan.xia2@....com>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Pierre Gondois <pierre.gondois@....com>,
Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Juri Lelli <juri.lelli@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>,
Steven Rostedt <rostedt@...dmis.org>,
Ben Segall <bsegall@...gle.com>,
Mel Gorman <mgorman@...e.de>,
Valentin Schneider <vschneid@...hat.com>
Subject: [PATCH] sched/fair: Check runnable signal to skip util_est updates
Commit 50181c0cff31 ("sched/pelt: Avoid underestimation of task
utilization") made it possible to skip decaying util_est, to handle the
case where the util_avg signal of a task is lowered by the presence of
co-scheduled tasks. In that case, a given task receives less running
time, which lowers its util_avg.
Checking whether the util_avg and runnable signals are within a certain
margin of each other is effectively a check for whether a task received
less CPU time than desired. The margin is 10 util (= 1% * 1024).
However, there are 2 different cases:
1. The task is always running.
   In that case, the util_avg value is capped by the relative load of the
   CPU. E.g.: three 100% duty_cycle tasks will only reach a peak util_avg
   of ~340.
2. The task is not always running.
   In that case, the util_avg value will grow slower and reach a lower
   value than if there was no co-scheduled task. However, the util_avg
   of the task is not capped.
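As a rough illustration of case 1, here is a minimal sketch of a
simplified floating-point PELT signal for one task sharing a CPU with
other always-running tasks. The 1ms round-robin schedule, the helper
name peak_util_shared() and the PELT_Y constant are assumptions made for
this example; this is not the kernel implementation.

```c
#include <assert.h>

/* Per-1ms PELT decay factor y, chosen so that y^32 = 0.5 (approximate). */
#define PELT_Y 0.978572062

/*
 * Peak util of one task among nr_tasks always-running tasks, assuming a
 * hypothetical 1ms round-robin and a simplified floating-point PELT.
 */
static double peak_util_shared(int nr_tasks, int periods)
{
	double util = 0.0, peak = 0.0;
	int t;

	assert(nr_tasks > 0);

	for (t = 0; t < periods; t++) {
		/* Decay the history, then add this period if the task ran. */
		util *= PELT_Y;
		if (t % nr_tasks == 0)
			util += 1024.0 * (1.0 - PELT_Y);
		if (util > peak)
			peak = util;
	}
	return peak;
}
```

Under these assumptions, peak_util_shared(3, 3000) settles around a
third of 1024, while peak_util_shared(1, 3000) approaches 1024.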
This patch aims to prevent util_est from decaying only in case 1.
Indeed, in the PELT computation, each of the last 4ms contributes the
following amount to the signals:
1ms: 22, 2ms: 21, 3ms: 21, 4ms: 20
I.e. a co-scheduled task will create a delta between the runnable and
util_avg signals of 84 (=22 + 21 + 21 + 20) after the task has not run
for 4ms.
Thus, a delta of 10 between the runnable and util_avg signals (i.e. the
margin):
- is easy to reach
- takes time to remove
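Both points can be sketched with a simplified floating-point model of
the delta between the runnable and util_avg signals. The helper names
and the PELT_Y constant are assumptions made for this example; the
kernel uses fixed-point arithmetic, so the exact values differ slightly.

```c
#include <assert.h>

#define PELT_Y 0.978572062	/* per-1ms decay factor, y^32 = 0.5 (approx.) */
#define UTIL_EST_MARGIN 10	/* 1% of 1024, as described above */

/* Delta built up while the task stays runnable without running. */
static double delta_after_waiting(int ms)
{
	double delta = 0.0;

	assert(ms >= 0);
	while (ms-- > 0) {
		/* runnable gains one period, util_avg gains nothing. */
		delta = delta * PELT_Y + 1024.0 * (1.0 - PELT_Y);
	}
	return delta;
}

/* Number of 1ms periods of running alone before delta < margin. */
static int ms_to_decay_below_margin(double delta)
{
	int ms = 0;

	while (delta >= UTIL_EST_MARGIN) {
		/* Both signals gain the same amount, so only decay remains. */
		delta *= PELT_Y;
		ms++;
	}
	return ms;
}
```

In this model, a single 1ms wait already exceeds the margin, a 4ms wait
gives a delta close to the 84 quoted above, and removing that delta
takes on the order of 100ms of running alone.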
With this patch, a task is considered as always running when its
runnable signal reaches ~80% * 1024. The threshold is debatable, but the
current condition is easily triggered and maintains an overestimation of
the size of tasks through util_est.
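The two conditions can be compared in a small userspace sketch. The
constants and fits_capacity() mirror the kernel definitions as I
understand them; skip_update_base() and skip_update_new() are
hypothetical helper names used only for this example.

```c
#include <assert.h>
#include <stdbool.h>

#define SCHED_CAPACITY_SCALE 1024UL
#define UTIL_EST_MARGIN (SCHED_CAPACITY_SCALE / 100)

/* Same shape as the kernel macro: true when cap is below ~80% of max. */
#define fits_capacity(cap, max) ((cap) * 1280 < (max) * 1024)

static_assert(UTIL_EST_MARGIN == 10, "margin is 1% of 1024");

/* Base condition: skip when runnable exceeds util_avg by the margin. */
static bool skip_update_base(unsigned long dequeued, unsigned long runnable)
{
	return (dequeued + UTIL_EST_MARGIN) < runnable;
}

/* New condition: skip only when runnable reaches ~80% of 1024. */
static bool skip_update_new(unsigned long runnable)
{
	return !fits_capacity(runnable, SCHED_CAPACITY_SCALE);
}
```

For example, a task dequeued with util_avg = 300 and runnable = 311
skips the update under the base condition but not under the new one,
which only triggers once runnable is at least 820.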
Running 5 iterations of speedometer 2.1 on a Pixel6, based on a 6.12
kernel:
Triggering the condition:
- Base condition: triggered ~47%
- New condition: triggered ~10%
Overutilized state:
- Base condition: OU state ~65% of the time
- New condition: OU state ~57% of the time
Energy (using energy counters):
- Base condition: 99884 +/- 936
- New condition: 98857 +/- 1325
Score:
- Base condition: 204 +/- 1.5
- New condition: 201.5 +/- 1.4
So the patch lowers the overutilized state residency and reduces the
score. However, the score drop is expected, since over-estimating tasks
can only improve the score.
This patch doesn't solve the initial issue reported by Lukasz Luba at
[1]; another way to detect that issue should ideally be used.
[1] https://lore.kernel.org/lkml/f1b1b663-3a12-9e5d-932b-b3ffb5f02e14@arm.com/
Signed-off-by: Pierre Gondois <pierre.gondois@....com>
---
kernel/sched/fair.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 6fab28c3360a..9f5509e3036f 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4919,10 +4919,12 @@ static inline void util_est_update(struct cfs_rq *cfs_rq,
goto done;
/*
- * To avoid underestimate of task utilization, skip updates of EWMA if
- * we cannot grant that thread got all CPU time it wanted.
+ * Prevent util_est from decaying when the task is considered as always
+ * running, i.e. its runnable reaches 80% of the max. capacity. In that
+ * case, co-scheduled tasks prevent util_avg from growing and reaching its
+ * peak, leading to a lower util_est.
*/
- if ((dequeued + UTIL_EST_MARGIN) < task_runnable(p))
+ if (!fits_capacity(task_runnable(p), SCHED_CAPACITY_SCALE))
goto done;
--
2.25.1