Message-ID: <20161018115651.GA20956@linaro.org>
Date:   Tue, 18 Oct 2016 13:56:51 +0200
From:   Vincent Guittot <vincent.guittot@...aro.org>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Dietmar Eggemann <dietmar.eggemann@....com>,
        Joseph Salisbury <joseph.salisbury@...onical.com>,
        Ingo Molnar <mingo@...nel.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        LKML <linux-kernel@...r.kernel.org>,
        Mike Galbraith <efault@....de>, omer.akram@...onical.com
Subject: Re: [v4.8-rc1 Regression] sched/fair: Apply more PELT fixes

On Tuesday, 18 Oct 2016 at 12:34:12 (+0200), Peter Zijlstra wrote:
> On Tue, Oct 18, 2016 at 11:45:48AM +0200, Vincent Guittot wrote:
> > On 18 October 2016 at 11:07, Peter Zijlstra <peterz@...radead.org> wrote:
> > > So aside from funny BIOSes, this should also show up when creating
> > > cgroups when you have offlined a few CPUs, which is far more common I'd
> > > think.
> > 
> > The problem is also that the load of the tg->se[cpu] that represents
> > the tg->cfs_rq[cpu] is initialized to 1024 in:
> > alloc_fair_sched_group()
> >     for_each_possible_cpu(i) {
> >         init_entity_runnable_average(se)
> >             sa->load_avg = scale_load_down(se->load.weight);
> > 
> > Initializing sa->load_avg to 1024 for a newly created task makes
> > sense, as we don't know yet what its real load will be, but I'm not
> > sure that we have to do the same for an se that represents a task
> > group. This load should be initialized to 0 and will increase as
> > tasks are moved/attached into the task group.
> 
> Yes, I think that makes sense, not sure how horrible that is with the

That should not be too bad, because this initial value only matters for
the few dozen ms that follow the creation of the task group.
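
To put a number on "a few dozen ms": PELT decays with a 32 ms
half-life (the per-ms decay factor y is chosen so that y^32 = 0.5), so
a stale initial contribution of 1024 is halved every 32 ms once nothing
sustains it. A back-of-the-envelope sketch in plain userspace C (not
kernel code; only the standard PELT half-life constant is assumed):

	#include <math.h>
	#include <stdio.h>

	int main(void)
	{
		/* PELT per-ms decay factor y, chosen so that y^32 = 0.5 */
		const double y = pow(0.5, 1.0 / 32.0);
		const double initial = 1024.0;	/* fresh group se load_avg */

		for (int ms = 0; ms <= 128; ms += 32)
			printf("after %3d ms: %6.1f\n",
			       ms, initial * pow(y, ms));
		return 0;
	}

This prints 1024.0, 512.0, 256.0, 128.0, 64.0: the bogus initial
weight has mostly decayed away after roughly a hundred ms.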

>
> current state of things, but after your propagate patch, that
> reinstates the interactivity hack that should work for sure.

The patch below fixes the issue on my platform: 

Dietmar, Omer, can you confirm that this fixes the problem on your platforms too?

---
 kernel/sched/fair.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 8b03fb5..89776ac 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -690,7 +690,14 @@ void init_entity_runnable_average(struct sched_entity *se)
 	 * will definitely be update (after enqueue).
 	 */
 	sa->period_contrib = 1023;
-	sa->load_avg = scale_load_down(se->load.weight);
+	/*
+	 * Tasks are initialized with full load to be seen as heavy tasks
+	 * until they get a chance to stabilize to their real load level.
+	 * Group entities are initialized with zero load to reflect the
+	 * fact that nothing has been attached to the task group yet.
+	 */
+	if (entity_is_task(se))
+		sa->load_avg = scale_load_down(se->load.weight);
 	sa->load_sum = sa->load_avg * LOAD_AVG_MAX;
 	/*
 	 * At this point, util_avg won't be used in select_task_rq_fair anyway
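
For readers of the hunk: entity_is_task() distinguishes task entities
from group entities by whether the se owns a group runqueue. In kernels
of this vintage it is defined in kernel/sched/fair.c roughly as:

	#ifdef CONFIG_FAIR_GROUP_SCHED
	/* An entity is a task iff it does not own a group cfs_rq */
	#define entity_is_task(se)	(!se->my_q)
	#else
	#define entity_is_task(se)	1
	#endif

So with this patch a group se simply keeps the zero load_avg it got
from the kzalloc()'d allocation in alloc_fair_sched_group(), and
load_sum then starts at 0 as well.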



