Message-Id: <20211220114323.22811-3-vincent.donnefort@arm.com>
Date: Mon, 20 Dec 2021 12:43:22 +0100
From: Vincent Donnefort <vincent.donnefort@....com>
To: peterz@...radead.org, mingo@...hat.com, vincent.guittot@...aro.org
Cc: linux-kernel@...r.kernel.org, dietmar.eggemann@....com,
valentin.schneider@....com, morten.rasmussen@....com,
qperret@...gle.com, Vincent Donnefort <vincent.donnefort@....com>
Subject: [PATCH 2/3] sched/fair: Fix newidle_balance() for overutilized systems
On Energy-Aware Scheduling systems, load balancing is disabled in favor of
energy-based placement until one of the CPUs is identified as being
overutilized. Once the overutilization is resolved, two paths can mark the
system as non-overutilized again:
* load_balance() triggered from newidle_balance().
* load_balance() triggered from the scheduler tick.
However, each of those paths comes with a caveat: newidle_balance() only
runs load_balance() if rd->overload is set, while the load_balance()
triggered by the scheduler tick only proceeds on the first idle CPU of the
root domain (see should_we_balance()).
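To illustrate the second constraint, here is a minimal userspace model of
the first-idle-CPU selection performed by should_we_balance(); the array
representation, the helper name and the main() driver are invented for this
sketch and are not the kernel implementation (the newly-idle case, which
always balances, is omitted).

#include <stdbool.h>
#include <stdio.h>

#define NR_CPUS 4

/*
 * Simplified stand-in for should_we_balance(): the tick-driven
 * load_balance() is only allowed to proceed on the first idle CPU.
 */
static bool model_should_we_balance(const bool idle[NR_CPUS], int this_cpu)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (!idle[cpu])
			continue;
		/* Only the first idle CPU may balance. */
		return cpu == this_cpu;
	}
	/* No idle CPU at all: let this CPU balance. */
	return true;
}

int main(void)
{
	/* CPU0 busy, CPU1..CPU3 idle. */
	bool idle[NR_CPUS] = { false, true, true, true };

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		printf("CPU%d would balance: %s\n", cpu,
		       model_should_we_balance(idle, cpu) ? "yes" : "no");
	return 0;
}

In this model only CPU1 is allowed to balance, which mirrors why, with many
idle CPUs, the tick path rarely runs load_balance() on the one CPU that can
clear the flag.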
Overutilized can be set without overload being set (this can happen for a
CPU which had a misfit task but didn't have its util_avg updated yet). In
that case, only the scheduler tick could reset overutilized, but if most of
the CPUs are idle, it is very unlikely that load_balance() would run on the
only CPU which can reset the flag. This means the root domain can spuriously
stay marked overutilized for a long period of time.
We then need newidle_balance() to proceed with balancing if the system is
overutilized.
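As a rough illustration of the effect of the change, here is a small,
self-contained userspace model of the newidle_balance() gate before and
after this patch. The struct and function names are made up for the
example, and the model deliberately omits the avg_idle vs
max_newidle_lb_cost check visible in the diff below.

#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the relevant root-domain flags. */
struct root_domain_model {
	bool overload;
	bool overutilized;
};

/* Old gate: newidle balancing bails out when overload isn't set. */
static bool old_gate_allows_balance(const struct root_domain_model *rd)
{
	return rd->overload;
}

/* New gate: also balance when the system is marked overutilized. */
static bool new_gate_allows_balance(const struct root_domain_model *rd)
{
	return rd->overload || rd->overutilized;
}

int main(void)
{
	/* The problematic scenario: overutilized set, overload not set. */
	struct root_domain_model rd = { .overload = false, .overutilized = true };

	printf("old condition balances: %s\n",
	       old_gate_allows_balance(&rd) ? "yes" : "no");
	printf("new condition balances: %s\n",
	       new_gate_allows_balance(&rd) ? "yes" : "no");
	return 0;
}

With the old condition the newly idle CPU returns early and the flag can
only be cleared by a tick-driven load_balance(); with the new condition the
newly idle CPU proceeds with balancing and can reset overutilized.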
Fixes: 2802bf3cd936 ("sched/fair: Add over-utilization/tipping point indicator")
Signed-off-by: Vincent Donnefort <vincent.donnefort@....com>
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index e2f6fa14e5e7..51f6f55abb37 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -10849,7 +10849,8 @@ static int newidle_balance(struct rq *this_rq, struct rq_flags *rf)
 	rcu_read_lock();
 	sd = rcu_dereference_check_sched_domain(this_rq->sd);
 
-	if (!READ_ONCE(this_rq->rd->overload) ||
+	if ((!READ_ONCE(this_rq->rd->overload) &&
+	     !READ_ONCE(this_rq->rd->overutilized)) ||
 	    (sd && this_rq->avg_idle < sd->max_newidle_lb_cost)) {
 
 		if (sd)
--
2.25.1