[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <Zq2GJMEl0nG0DMyX@slm.duckdns.org>
Date: Fri, 2 Aug 2024 15:21:40 -1000
From: Tejun Heo <tj@...nel.org>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Ingo Molnar <mingo@...hat.com>, linux-kernel@...r.kernel.org,
kernel-team@...a.com, David Vernet <void@...ifault.com>
Subject: [PATCH tip/sched/core] sched/fair: Make balance_fair() test
sched_fair_runnable() instead of rq->nr_running
balance_fair() skips newidle balancing if rq->nr_running - there are already
tasks on the rq, so no need to try to pull tasks. However, this doesn't seem
correct when bandwidth throttling is in use. When an entity gets throttled,
rq->nr_running is not decremented, so a CPU could end up in a situation
where rq->nr_running is not zero but there are no runnable tasks.
Theoretically, skipping newidle balance in this condition can lead to
increased latencies although I couldn't come up with a scenario where this
could be demonstrated reliably.
Update balance_fair() to use sched_fair_runnable() which tests
rq->cfs.nr_running which is updated by bandwidth throttling. Note that
pick_next_task_fair() already uses sched_fair_runnable() in its optimized
path for the same purpose.
This also makes put_prev_task_balance() avoid incorrectly skipping lower
priority classes' (such as sched_ext) balance(). When a CPU has only lower
priority class tasks, rq->nr_running would still be positive and
balance_fair() would return 1 even when fair doesn't have any tasks to run.
This makes put_prev_task_balance() skip lower priority classes' balance()
incorrectly which may lead to stalls.
Signed-off-by: Tejun Heo <tj@...nel.org>
Reported-by: Peter Zijlstra <peterz@...radead.org>
---
kernel/sched/fair.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8323,7 +8323,7 @@ static void set_cpus_allowed_fair(struct
static int
balance_fair(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
{
- if (rq->nr_running)
+ if (sched_fair_runnable(rq))
return 1;
return sched_balance_newidle(rq, rf) != 0;
Powered by blists - more mailing lists