Message-ID: <20251023040359.39021-1-kprateek.nayak@amd.com>
Date: Thu, 23 Oct 2025 04:03:59 +0000
From: K Prateek Nayak <kprateek.nayak@....com>
To: Greg Kroah-Hartman <gregkh@...uxfoundation.org>, Sasha Levin
<sashal@...nel.org>, <stable@...r.kernel.org>, Matt Fleming
<matt@...dmodwrite.com>, Ingo Molnar <mingo@...hat.com>, Peter Zijlstra
<peterz@...radead.org>, Juri Lelli <juri.lelli@...hat.com>, Vincent Guittot
<vincent.guittot@...aro.org>, <linux-kernel@...r.kernel.org>
CC: Dietmar Eggemann <dietmar.eggemann@....com>, Steven Rostedt
<rostedt@...dmis.org>, Ben Segall <bsegall@...gle.com>, Mel Gorman
<mgorman@...e.de>, Valentin Schneider <vschneid@...hat.com>,
<kernel-team@...udflare.com>, Matt Fleming <mfleming@...udflare.com>, "Oleg
Nesterov" <oleg@...hat.com>, John Stultz <jstultz@...gle.com>, Chris Arges
<carges@...udflare.com>, "Luis Claudio R. Goncalves" <lgoncalv@...hat.com>,
"K Prateek Nayak" <kprateek.nayak@....com>
Subject: [PATCH 6.17] sched/fair: Block delayed tasks on throttled hierarchy during dequeue
Dequeuing a fair task on a throttled hierarchy returns early upon
encountering a throttled cfs_rq, since the throttle path has already
dequeued the hierarchy above it and has adjusted the h_nr_* accounting
up to the root cfs_rq.
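The shape of that early return in dequeue_entities() is roughly the
following (a paraphrased sketch of kernel/sched/fair.c, not the verbatim
source; the actual check also appears in the diff below):

	for_each_sched_entity(se) {
		cfs_rq = cfs_rq_of(se);
		dequeue_entity(cfs_rq, se, flags);

		/*
		 * The throttle path already dequeued everything above
		 * this cfs_rq and adjusted h_nr_* up to the root, so
		 * stop the walk here.
		 */
		if (cfs_rq_throttled(cfs_rq))
			return 0;
		...
	}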
dequeue_entities() crucially misses calling __block_task() for delayed
tasks being dequeued on throttled hierarchies. This was mostly harmless
until commit b7ca5743a260 ("sched/core: Tweak wait_task_inactive() to
force dequeue sched_delayed tasks"), since all existing cases would
re-enqueue the task if task_on_rq_queued() returned true, and the task
would eventually be blocked at pick time once the hierarchy was
unthrottled.
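For reference, task_on_rq_queued() only checks p->on_rq, and it is
__block_task() that finally clears it (a simplified sketch of the
helpers, not the verbatim kernel source):

	static inline int task_on_rq_queued(struct task_struct *p)
	{
		return READ_ONCE(p->on_rq) == TASK_ON_RQ_QUEUED;
	}

	static inline void __block_task(struct rq *rq, struct task_struct *p)
	{
		/* ... accounting (nr_uninterruptible, iowait) elided ... */

		/* Publish the fully blocked state. */
		smp_store_release(&p->on_rq, 0);
	}

Skipping __block_task() on the throttled path therefore leaves p->on_rq
at TASK_ON_RQ_QUEUED even though the task is no longer on the hierarchy.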
wait_task_inactive() is special: it expects the delayed task on a
throttled hierarchy to reach the blocked state on dequeue, but since
__block_task() is never called, task_on_rq_queued() continues to return
true. Furthermore, since the task is now off the hierarchy, the pick
never reaches it to fully block the task, even after unthrottle, leading
to wait_task_inactive() looping endlessly.
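The endless loop can be seen from the rough shape of
wait_task_inactive() in kernel/sched/core.c (a paraphrased sketch, not
the verbatim source):

	for (;;) {
		rq = task_rq_lock(p, &rf);
		queued = task_on_rq_queued(p);
		task_rq_unlock(rq, p, &rf);

		if (!queued)
			break;	/* task is fully blocked; done */

		/*
		 * Still queued: sleep a tick and retry. With p->on_rq
		 * stuck at TASK_ON_RQ_QUEUED, this never terminates.
		 */
		set_current_state(TASK_UNINTERRUPTIBLE);
		schedule_hrtimeout(&to, HRTIMER_MODE_REL_HARD);
	}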
Remedy this by calling __block_task() if a delayed task is being
dequeued on a throttled hierarchy.
This fix is only required for stable kernels implementing delay dequeue
(>= v6.12) before v6.18, since upstream commit e1fad12dcb66 ("sched/fair:
Switch to task based throttle model") indirectly fixes this by removing
the early return conditions in dequeue_entities() as part of the per-task
throttle feature.
Cc: stable@...r.kernel.org
Reported-by: Matt Fleming <matt@...dmodwrite.com>
Closes: https://lore.kernel.org/all/20250925133310.1843863-1-matt@readmodwrite.com/
Fixes: b7ca5743a260 ("sched/core: Tweak wait_task_inactive() to force dequeue sched_delayed tasks")
Tested-by: Matt Fleming <mfleming@...udflare.com>
Signed-off-by: K Prateek Nayak <kprateek.nayak@....com>
---
Hello Greg, Sasha,
Please consider the same fix for the v6.17 stable kernel too, since there
is a report of a similar issue on a v6.17.1-based RT kernel at
https://lore.kernel.org/lkml/aPN7XBJbGhdWJDb2@uudg.org/ and Luis
confirmed that this fix solves the issue for him in
https://lore.kernel.org/lkml/aPgm6KvDx5Os2oJS@uudg.org/
---
kernel/sched/fair.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 8ce56a8d507f..f0a4d9d7424d 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6969,6 +6969,7 @@ static int dequeue_entities(struct rq *rq, struct sched_entity *se, int flags)
 	int h_nr_runnable = 0;
 	struct cfs_rq *cfs_rq;
 	u64 slice = 0;
+	int ret = 0;
 
 	if (entity_is_task(se)) {
 		p = task_of(se);
@@ -6998,7 +6999,7 @@ static int dequeue_entities(struct rq *rq, struct sched_entity *se, int flags)
 
 		/* end evaluation on encountering a throttled cfs_rq */
 		if (cfs_rq_throttled(cfs_rq))
-			return 0;
+			goto out;
 
 		/* Don't dequeue parent if it has other entities besides us */
 		if (cfs_rq->load.weight) {
@@ -7039,7 +7040,7 @@ static int dequeue_entities(struct rq *rq, struct sched_entity *se, int flags)
 
 		/* end evaluation on encountering a throttled cfs_rq */
 		if (cfs_rq_throttled(cfs_rq))
-			return 0;
+			goto out;
 	}
 
 	sub_nr_running(rq, h_nr_queued);
@@ -7048,6 +7049,8 @@ static int dequeue_entities(struct rq *rq, struct sched_entity *se, int flags)
 	if (unlikely(!was_sched_idle && sched_idle_rq(rq)))
 		rq->next_balance = jiffies;
 
+	ret = 1;
+out:
 	if (p && task_delayed) {
 		WARN_ON_ONCE(!task_sleep);
 		WARN_ON_ONCE(p->on_rq != 1);
@@ -7063,7 +7066,7 @@ static int dequeue_entities(struct rq *rq, struct sched_entity *se, int flags)
 		__block_task(rq, p);
 	}
 
-	return 1;
+	return ret;
 }
 
 /*
base-commit: 6c7871823908a4330e145d635371582f76ce1407
--
2.34.1