[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20240727105029.315205425@infradead.org>
Date: Sat, 27 Jul 2024 12:27:42 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: mingo@...hat.com,
peterz@...radead.org,
juri.lelli@...hat.com,
vincent.guittot@...aro.org,
dietmar.eggemann@....com,
rostedt@...dmis.org,
bsegall@...gle.com,
mgorman@...e.de,
vschneid@...hat.com,
linux-kernel@...r.kernel.org
Cc: kprateek.nayak@....com,
wuyun.abel@...edance.com,
youssefesmat@...omium.org,
tglx@...utronix.de,
efault@....de,
Luis Machado <luis.machado@....com>,
Hongyan Xia <hongyan.xia2@....com>
Subject: [PATCH 10/24] sched/uclamg: Handle delayed dequeue
Delayed dequeue has tasks sit around on the runqueue that are not
actually runnable -- specifically, they will be dequeued the moment
they get picked.
One side-effect is that such a task can get migrated, which leads to a
'nested' dequeue_task() scenario that messes up uclamp if we don't
take care.
Notably, dequeue_task(DEQUEUE_SLEEP) can 'fail' and keep the task on
the runqueue. This however will have removed the task from uclamp --
per uclamp_rq_dec() in dequeue_task(). So far so good.
However, if at that point the task gets migrated -- or nice adjusted
or any of a myriad of operations that does a dequeue-enqueue cycle --
we'll pass through dequeue_task()/enqueue_task() again. Without
modification this will lead to a double decrement for uclamp, which is
wrong.
Reported-by: Luis Machado <luis.machado@....com>
Reported-by: Hongyan Xia <hongyan.xia2@....com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
---
kernel/sched/core.c | 16 +++++++++++++++-
1 file changed, 15 insertions(+), 1 deletion(-)
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1676,6 +1676,9 @@ static inline void uclamp_rq_inc(struct
if (unlikely(!p->sched_class->uclamp_enabled))
return;
+ if (p->se.sched_delayed)
+ return;
+
for_each_clamp_id(clamp_id)
uclamp_rq_inc_id(rq, p, clamp_id);
@@ -1700,6 +1703,9 @@ static inline void uclamp_rq_dec(struct
if (unlikely(!p->sched_class->uclamp_enabled))
return;
+ if (p->se.sched_delayed)
+ return;
+
for_each_clamp_id(clamp_id)
uclamp_rq_dec_id(rq, p, clamp_id);
}
@@ -1979,8 +1985,12 @@ void enqueue_task(struct rq *rq, struct
psi_enqueue(p, (flags & ENQUEUE_WAKEUP) && !(flags & ENQUEUE_MIGRATED));
}
- uclamp_rq_inc(rq, p);
p->sched_class->enqueue_task(rq, p, flags);
+ /*
+ * Must be after ->enqueue_task() because ENQUEUE_DELAYED can clear
+ * ->sched_delayed.
+ */
+ uclamp_rq_inc(rq, p);
if (sched_core_enabled(rq))
sched_core_enqueue(rq, p);
@@ -2002,6 +2012,10 @@ inline bool dequeue_task(struct rq *rq,
psi_dequeue(p, flags & DEQUEUE_SLEEP);
}
+ /*
+ * Must be before ->dequeue_task() because ->dequeue_task() can 'fail'
+ * and mark the task ->sched_delayed.
+ */
uclamp_rq_dec(rq, p);
return p->sched_class->dequeue_task(rq, p, flags);
}
Powered by blists - more mailing lists