Message-Id: <20200720083401.22164-1-vincent.guittot@linaro.org>
Date: Mon, 20 Jul 2020 10:34:01 +0200
From: Vincent Guittot <vincent.guittot@...aro.org>
To: mingo@...hat.com, peterz@...radead.org, juri.lelli@...hat.com,
dietmar.eggemann@....com, rostedt@...dmis.org, bsegall@...gle.com,
mgorman@...e.de, linux-kernel@...r.kernel.org,
stable@...r.kernel.org
Cc: valentin.schneider@....com, sashal@...nel.org,
Vincent Guittot <vincent.guittot@...aro.org>
Subject: [PATCH v4.19] sched/fair: handle case of task_h_load() returning 0
[ Upstream commit 01cfcde9c26d8555f0e6e9aea9d6049f87683998 ]
task_h_load() can return 0 in some situations, for example when running
stress-ng mmapfork, which forks thousands of threads, in a sched group on
a 224-core system. The load balancer doesn't handle this correctly:
env->imbalance never decreases, so it stops pulling tasks only after
reaching loop_max, which can be equal to the number of running tasks of
the cfs. Make sure that imbalance is decreased by at least 1.

We can't simply make task_h_load() return at least 1 because that would
require handling underflow in other places.
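
For illustration only (not part of the patch): a minimal user-space sketch
of the failure mode described above. The names sketch_task_h_load() and
sketch_detach_tasks() are hypothetical, the truncating division is an
assumption about the general shape of the hierarchical load computation,
and the loop omits everything detach_tasks() does besides the imbalance
accounting.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for task_h_load(): the task's load scaled by its
 * group's share of the hierarchy. Integer division truncates, so a small
 * task in a heavily loaded group can end up with a load of 0.
 */
static unsigned long sketch_task_h_load(unsigned long task_load,
					unsigned long group_h_load,
					unsigned long group_load_avg)
{
	return task_load * group_h_load / (group_load_avg + 1);
}

/*
 * Simplified detach loop: returns how many tasks are detached before
 * either the imbalance is consumed or loop_max is reached. Every task
 * is assumed to report the same h_load.
 */
static unsigned int sketch_detach_tasks(long imbalance, unsigned int loop_max,
					unsigned long h_load, int clamp)
{
	unsigned int detached = 0;

	assert(imbalance > 0);
	while (detached < loop_max) {
		unsigned long load = h_load;

		if (clamp && load < 1)
			load = 1;	/* mirrors max_t(unsigned long, load, 1) */

		detached++;
		imbalance -= (long)load;
		if (imbalance <= 0)
			break;		/* imbalance consumed: stop pulling */
	}
	return detached;
}
```

With an imbalance of 8, a loop_max of 1000 and every task reporting a load
of 0, the unclamped loop detaches all 1000 tasks, while the clamped loop
stops after 8.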
Signed-off-by: Vincent Guittot <vincent.guittot@...aro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
[removed misfit part which was not implemented yet]
Signed-off-by: Vincent Guittot <vincent.guittot@...aro.org>
Reviewed-by: Valentin Schneider <valentin.schneider@....com>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@....com>
Tested-by: Dietmar Eggemann <dietmar.eggemann@....com>
Cc: <stable@...r.kernel.org> # v4.19 v4.14 v4.9 v4.4
Cc: Sasha Levin <sashal@...nel.org>
Link: https://lkml.kernel.org/r/20200710152426.16981-1-vincent.guittot@linaro.org
---
This patch also applies to v4.14.188, v4.9.230 and v4.4.230.
kernel/sched/fair.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 92b1e71f13c8..d8c249e6dcb7 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7337,7 +7337,15 @@ static int detach_tasks(struct lb_env *env)
 		if (!can_migrate_task(p, env))
 			goto next;
 
-		load = task_h_load(p);
+		/*
+		 * Depending on the number of CPUs and tasks and the
+		 * cgroup hierarchy, task_h_load() can return a null
+		 * value. Make sure that env->imbalance decreases
+		 * otherwise detach_tasks() will stop only after
+		 * detaching up to loop_max tasks.
+		 */
+		load = max_t(unsigned long, task_h_load(p), 1);
+
 
 		if (sched_feat(LB_MIN) && load < 16 && !env->sd->nr_balance_failed)
 			goto next;
--
2.17.1