Message-Id: <1476881123-10159-1-git-send-email-vincent.guittot@linaro.org>
Date: Wed, 19 Oct 2016 14:45:23 +0200
From: Vincent Guittot <vincent.guittot@...aro.org>
To: peterz@...radead.org, mingo@...nel.org,
linux-kernel@...r.kernel.org, dietmar.eggemann@....com,
joseph.salisbury@...onical.com
Cc: joonwoop@...eaurora.org, stable@...r.kernel.org,
Vincent Guittot <vincent.guittot@...aro.org>
Subject: [PATCH] sched: fix wrong task group's load_avg
A regression has been reported with
commit 3d30544f0212 ("sched/fair: Apply more PELT fixes")
when several levels of task groups are involved
and cpu_possible_mask != cpu_present_mask.
The root cause is that a group entity's load (tg_child->se[i]->avg.load_avg)
is initialized to scale_load_down(se->load.weight). During the creation of
a child task group, its group entities on possible CPUs are attached to
the parent's cfs_rq (tg_parent) and their loads are added to the parent's
load (tg_parent->load_avg) with update_tg_load_avg().
But only the load on online CPUs is then updated to reflect the real load,
whereas the load on other CPUs stays at its initial value. The result is
a tg_parent->load_avg that is higher than the real load, so the weight
of group entities (tg_parent->se[i]->load.weight) on online CPUs is smaller
than it should be, and the task group gets less running time than
it could expect.
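To illustrate the effect, here is a minimal sketch in C of a simplified
share computation. It is not the kernel's calc_cfs_shares(); the helper
names group_entity_weight() and tg_load_with_stale() are hypothetical, and
the model only assumes that a group entity's weight on one CPU is the
group's shares scaled by that CPU's fraction of the group's total load.

```c
#include <assert.h>

/*
 * Hypothetical, simplified model (NOT the kernel's calc_cfs_shares()):
 * the weight of a group entity on one CPU is the group's shares scaled
 * by that CPU's part of the task group's total load, capped at shares.
 */
static long group_entity_weight(long tg_shares, long cfs_rq_load,
                                long tg_load_avg)
{
    long weight;

    assert(tg_shares > 0 && cfs_rq_load >= 0);

    /* No load accounted for the group yet: hand out the full shares. */
    if (tg_load_avg <= 0)
        return tg_shares;

    weight = tg_shares * cfs_rq_load / tg_load_avg;
    return weight > tg_shares ? tg_shares : weight;
}

/*
 * Total task group load when 'online' CPUs each carry 'online_load'
 * and 'stale' possible-but-not-online CPUs each keep 'stale_load',
 * i.e. an initial value that is never updated.
 */
static long tg_load_with_stale(int online, long online_load,
                               int stale, long stale_load)
{
    return online * online_load + stale * stale_load;
}
```

In this model, with shares of 1024 and two online CPUs each carrying a
load of 1024, the real total is 2048 and each online CPU gets a weight of
512. If six other possible CPUs each stay at an initial load of 1024, the
total becomes 8192 and the weight on each online CPU drops to 128.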
This situation can be detected with /proc/sched_debug. The ".tg_load_avg"
of the task group will be much higher than the sum of the
".tg_load_avg_contrib" values of the online cfs_rqs of the task group.
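The comparison described above can be sketched in C as follows. The
function name tg_load_avg_excess() is hypothetical, and the caller is
assumed to have already read the ".tg_load_avg" value and the
".tg_load_avg_contrib" values of the online cfs_rqs of one task group out
of /proc/sched_debug by hand or by script; parsing is not shown.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: return how much of a task group's .tg_load_avg
 * is not explained by the .tg_load_avg_contrib values of its online
 * cfs_rqs. A large positive result suggests stale load left over from
 * CPUs that are possible but not online.
 */
static long tg_load_avg_excess(long tg_load_avg,
                               const long *online_contrib, size_t n)
{
    long sum = 0;
    size_t i;

    assert(online_contrib != NULL || n == 0);

    for (i = 0; i < n; i++)
        sum += online_contrib[i];

    return tg_load_avg - sum;
}
```

For instance, a ".tg_load_avg" of 8192 with two online contributions of
1024 each leaves an excess of 6144. What counts as "much higher" is left
to the reader; small differences are expected because the values are
sampled at slightly different times.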
The load of group entities doesn't have to be initialized to anything other
than 0 because it will increase when an entity is attached.
Fixes: 3d30544f0212 ("sched/fair: Apply more PELT fixes")
Reported-by: Joseph Salisbury <joseph.salisbury@...onical.com>
Signed-off-by: Vincent Guittot <vincent.guittot@...aro.org>
Tested-by: Dietmar Eggemann <dietmar.eggemann@....com>
Cc: <stable@...r.kernel.org> # 4.8.x
---
kernel/sched/fair.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 8b03fb5..89776ac 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -690,7 +690,14 @@ void init_entity_runnable_average(struct sched_entity *se)
* will definitely be update (after enqueue).
*/
sa->period_contrib = 1023;
- sa->load_avg = scale_load_down(se->load.weight);
+ /*
+ * Tasks are initialized with full load to be seen as heavy tasks until
+ * they get a chance to stabilize to their real load level.
+ * Group entities are initialized with zero load to reflect the fact that
+ * nothing has been attached to the task group yet.
+ */
+ if (entity_is_task(se))
+ sa->load_avg = scale_load_down(se->load.weight);
sa->load_sum = sa->load_avg * LOAD_AVG_MAX;
/*
* At this point, util_avg won't be used in select_task_rq_fair anyway
--
2.7.4