[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <b8d0fe09-82c4-b263-ce09-6c49ac9fae43@arm.com>
Date: Wed, 21 Sep 2016 18:25:38 +0100
From: Dietmar Eggemann <dietmar.eggemann@....com>
To: Vincent Guittot <vincent.guittot@...aro.org>
Cc: Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...nel.org>,
linux-kernel <linux-kernel@...r.kernel.org>,
Yuyang Du <yuyang.du@...el.com>,
Morten Rasmussen <Morten.Rasmussen@....com>,
Linaro Kernel Mailman List <linaro-kernel@...ts.linaro.org>,
Paul Turner <pjt@...gle.com>,
Benjamin Segall <bsegall@...gle.com>
Subject: Re: [PATCH 2/7 v3] sched: fix hierarchical order in
rq->leaf_cfs_rq_list
On 21/09/16 13:34, Vincent Guittot wrote:
> Hi Dietmar,
>
> On 21 September 2016 at 12:14, Dietmar Eggemann
> <dietmar.eggemann@....com> wrote:
>> Hi Vincent,
>>
>> On 12/09/16 08:47, Vincent Guittot wrote:
[...]
>> I guess in the meantime we lost the functionality to remove a cfs_rq from the
>> leaf_cfs_rq list once there are no se's enqueued on it anymore. If e.g. t migrates
>> away from Cpu1, all the cfs_rq's of the task hierarchy (for tg_css_id=2,4,6)
>> owned by Cpu1 stay in the leaf_cfs_rq list.
>>
>> Shouldn't we reintegrate it? Following patch goes on top of this patch:
>
> I see one problem: Once a cfs_rq has been removed , it will not be
> periodically updated anymore until the next enqueue. A sleeping task
> is only attached but not enqueued when it moves into a cfs_rq so its
> load is added to cfs_rq's load which have to be periodically
> updated/decayed. So adding a cfs_rq to the list only during an enqueue
> is not enough in this case.
There was more in the original patch (commit 82958366cfea), the removal of the
cfs_rq from the list was only done in case the se->avg.runnable_avg_sum had
decayed to 0. Couldn't we use something similar with se->avg.load_avg instead
to make sure that these blocked contributions have been decayed before we do the
removal?
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 4ac1718560e2..3595b0623b37 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6566,6 +6566,8 @@ static void update_blocked_averages(int cpu)
* list_add_leaf_cfs_rq() for details.
*/
for_each_leaf_cfs_rq(rq, cfs_rq) {
+ struct sched_entity *se = cfs_rq->tg->se[cpu];
+
/* throttled entities do not contribute to load */
if (throttled_hierarchy(cfs_rq))
continue;
@@ -6574,8 +6576,12 @@ static void update_blocked_averages(int cpu)
update_tg_load_avg(cfs_rq, 0);
/* Propagate pending load changes to the parent */
- if (cfs_rq->tg->se[cpu])
- update_load_avg(cfs_rq->tg->se[cpu], 0, 0);
+ if (se) {
+ update_load_avg(se, 0, 0);
+
+ if (!se->avg.load_avg && !cfs_rq->nr_running)
+ list_del_leaf_cfs_rq(cfs_rq);
+ }
}
raw_spin_unlock_irqrestore(&rq->lock, flags);
}
> Then, only the highest child level of task group will be removed
> because cfs_rq->nr_running will be > 0 as soon as a child task group
> is created and enqueued into a task group. Or you should use
> cfs.h_nr_running instead i don't know all implications
In my tests all cfs_rq's (tg_id=6,4,2) on the source CPU (CPU1) get removed? Do I miss something?
migration/1-16 [001] 67.235292: sched_migrate_task: comm=task0 pid=2210 prio=120 orig_cpu=1 dest_cpu=2
migration/1-16 [001] 67.235295: bprint: enqueue_task_fair: list_add_leaf_cfs_rq: cpu=2 cfs_rq=0xffff80097550d700 tg_id=6 on_list=0
migration/1-16 [001] 67.235298: bprint: enqueue_task_fair: list_add_leaf_cfs_rq: cpu=2 cfs_rq=0xffff800975903700 tg_id=4 on_list=0
migration/1-16 [001] 67.235300: bprint: enqueue_task_fair: list_add_leaf_cfs_rq: cpu=2 cfs_rq=0xffff800974e0fe00 tg_id=2 on_list=0
migration/1-16 [001] 67.235302: bprint: enqueue_task_fair: list_add_leaf_cfs_rq: cpu=2 cfs_rq=0xffff80097fecb6e0 tg_id=1 on_list=1
migration/1-16 [001] 67.235309: bprint: update_blocked_averages: update_blocked_averages: cpu=1 cfs_rq=0xffff80097550d800 tg_id=6 on_list=1
-> migration/1-16 [001] 67.235312: bprint: update_blocked_averages: list_del_leaf_cfs_rq: cpu=1 cfs_rq=0xffff80097550d800 tg_id=6 on_list=1
migration/1-16 [001] 67.235314: bprint: update_blocked_averages: update_blocked_averages: cpu=1 cfs_rq=0xffff800975903600 tg_id=4 on_list=1
-> migration/1-16 [001] 67.235315: bprint: update_blocked_averages: list_del_leaf_cfs_rq: cpu=1 cfs_rq=0xffff800975903600 tg_id=4 on_list=1
migration/1-16 [001] 67.235316: bprint: update_blocked_averages: update_blocked_averages: cpu=1 cfs_rq=0xffff800974e0f600 tg_id=2 on_list=1
-> migration/1-16 [001] 67.235318: bprint: update_blocked_averages: list_del_leaf_cfs_rq: cpu=1 cfs_rq=0xffff800974e0f600 tg_id=2 on_list=1
migration/1-16 [001] 67.235319: bprint: update_blocked_averages: update_blocked_averages: cpu=1 cfs_rq=0xffff80097feb56e0 tg_id=1 on_list=1
If we don't remove these cfs_rq's, IMHO they stay in the list as long as the tg exists.
[...]
Powered by blists - more mailing lists