[<prev] [next>] [day] [month] [year] [list]
Message-ID: <20260109073518.2843689-1-dengjun.su@mediatek.com>
Date: Fri, 9 Jan 2026 15:35:16 +0800
From: Dengjun Su <dengjun.su@...iatek.com>
To: <linux-kernel@...r.kernel.org>, Peter Zijlstra <peterz@...radead.org>
CC: <dengjun.su@...iatek.com>, <mike.zhang@...iatek.com>,
<haiqiang.gong@...iatek.com>, <peijun.huang@...iatek.com>, Ingo Molnar
<mingo@...hat.com>, Juri Lelli <juri.lelli@...hat.com>, Vincent Guittot
<vincent.guittot@...aro.org>, Dietmar Eggemann <dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>, Ben Segall <bsegall@...gle.com>, Mel
Gorman <mgorman@...e.de>, Valentin Schneider <vschneid@...hat.com>, Matthias
Brugger <matthias.bgg@...il.com>, AngeloGioacchino Del Regno
<angelogioacchino.delregno@...labora.com>,
<linux-arm-kernel@...ts.infradead.org>, <linux-mediatek@...ts.infradead.org>
Subject: [PATCH v2] sched: fix incorrect schedstats for rt and dl thread
For RT and DL thread, only 'set_next_task_(rt/dl)' will call
'update_stats_wait_end_(rt/dl)' to update schedstats information.
However, during the migration process,
'update_stats_wait_start_(rt/dl)' will be called twice, which
will cause the values of wait_max and wait_sum to be incorrect.
The specific output as follows:
$ cat /proc/6046/task/6046/sched | grep wait
wait_start : 0.000000
wait_max : 496717.080029
wait_sum : 7921540.776553
A complete schedstats information update flow of migrate should be
__update_stats_wait_start() [enter queue A, stage 1] ->
__update_stats_wait_end() [leave queue A, stage 2] ->
__update_stats_wait_start() [enter queue B, stage 3] ->
__update_stats_wait_end() [start running on queue B, stage 4]
Stage 1: prev_wait_start is 0, and in the end, wait_start records the
time of entering the queue.
Stage 2: task_on_rq_migrating(p) is true, and wait_start is updated to
the waiting time on queue A.
Stage 3: prev_wait_start is the waiting time on queue A, wait_start is
the time of entering queue B, and wait_start is expected to be greater
than prev_wait_start. Under this condition, wait_start is updated to
(the moment of entering queue B) - (the waiting time on queue A).
Stage 4: the final wait time = (time when starting to run on queue B)
- (time of entering queue B) + (waiting time on queue A) = waiting
time on queue B + waiting time on queue A.
The current problem is that stage 2 does not call __update_stats_wait_end
to update wait_start, which causes the final computed wait time = waiting
time on queue B + the moment of entering queue A, leading to incorrect
wait_max and wait_sum.
Add 'update_stats_wait_end_(rt/dl)' in 'update_stats_dequeue_(rt/dl)' to
update schedstats information when dequeue_task.
Signed-off-by: Dengjun Su <dengjun.su@...iatek.com>
---
v2: changes according to Peter's suggestions
1. add fix for dl class;
2. add more commit message to explain the reasons for the
modifications.
---
kernel/sched/deadline.c | 4 ++++
kernel/sched/rt.c | 7 ++++++-
2 files changed, 10 insertions(+), 1 deletion(-)
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 319439fe1870..610243a56bf8 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -2115,10 +2115,14 @@ update_stats_dequeue_dl(struct dl_rq *dl_rq, struct sched_dl_entity *dl_se,
int flags)
{
struct task_struct *p = dl_task_of(dl_se);
+ struct rq *rq = rq_of_dl_rq(dl_rq);
if (!schedstat_enabled())
return;
+ if (p != rq->curr)
+ update_stats_wait_end_dl(dl_rq, dl_se);
+
if ((flags & DEQUEUE_SLEEP)) {
unsigned int state;
diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index f1867fe8e5c5..12f2efddca9f 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -1302,13 +1302,18 @@ update_stats_dequeue_rt(struct rt_rq *rt_rq, struct sched_rt_entity *rt_se,
int flags)
{
struct task_struct *p = NULL;
+ struct rq *rq = rq_of_rt_rq(rt_rq);
if (!schedstat_enabled())
return;
- if (rt_entity_is_task(rt_se))
+ if (rt_entity_is_task(rt_se)) {
p = rt_task_of(rt_se);
+ if (p != rq->curr)
+ update_stats_wait_end_rt(rt_rq, rt_se);
+ }
+
if ((flags & DEQUEUE_SLEEP) && p) {
unsigned int state;
--
2.43.0
Powered by blists - more mailing lists