Message-ID: <CAKfTPtD4kz07hikCuU2_cm67ntruopN9CdJEP+fg5L4_N=qEgg@mail.gmail.com>
Date: Thu, 20 Feb 2020 15:36:02 +0100
From: Vincent Guittot <vincent.guittot@...aro.org>
To: Valentin Schneider <valentin.schneider@....com>
Cc: Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Juri Lelli <juri.lelli@...hat.com>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>,
Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
linux-kernel <linux-kernel@...r.kernel.org>,
Phil Auld <pauld@...hat.com>, Parth Shah <parth@...ux.ibm.com>,
Hillf Danton <hdanton@...a.com>
Subject: Re: [PATCH v3 4/5] sched/pelt: Add a new runnable average signal
On Wed, 19 Feb 2020 at 21:10, Valentin Schneider
<valentin.schneider@....com> wrote:
>
> On 19/02/2020 12:55, Vincent Guittot wrote:
> > @@ -740,8 +740,10 @@ void init_entity_runnable_average(struct sched_entity *se)
> > * Group entities are initialized with zero load to reflect the fact that
> > * nothing has been attached to the task group yet.
> > */
> > - if (entity_is_task(se))
> > + if (entity_is_task(se)) {
> > + sa->runnable_avg = SCHED_CAPACITY_SCALE;
>
> So this is a comment that's more related to patch 5, but the relevant bit is
> here. I'm thinking this initialization might be too aggressive wrt load
> balance. This will also give different results between symmetric vs
> asymmetric topologies - a single fork() will make a LITTLE CPU group (at the
> base domain level) overloaded straight away. That won't happen for bigs or on
> symmetric topologies because
>
> // group_is_overloaded()
> (sgs->group_capacity * imbalance_pct) < (sgs->group_runnable * 100)
>
> will be false - it would take more than one task for that to happen (due to
> the imbalance_pct).
>
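To make the arithmetic in that check concrete, here is a minimal standalone sketch of the same inequality. The LITTLE capacity (446) and imbalance_pct (117) are illustrative assumptions, not values from this thread, and runnable_overloaded() is a hypothetical helper, not a kernel function.

```c
#include <stdbool.h>

#define SCHED_CAPACITY_SCALE	1024UL

/* Illustrative assumptions only, not values taken from the thread. */
#define LITTLE_CAPACITY		446UL	/* hypothetical LITTLE CPU capacity */
#define BIG_CAPACITY		SCHED_CAPACITY_SCALE
#define IMBALANCE_PCT		117UL	/* hypothetical sd->imbalance_pct */

/*
 * Hypothetical helper: same inequality as the runnable check quoted
 * from group_is_overloaded(), with the sgs fields passed as arguments.
 */
static bool runnable_overloaded(unsigned long group_capacity,
				unsigned long group_runnable,
				unsigned long imbalance_pct)
{
	return (group_capacity * imbalance_pct) < (group_runnable * 100);
}
```

With these numbers, one new task at runnable_avg = SCHED_CAPACITY_SCALE gives 446 * 117 = 52182 < 102400 on the LITTLE CPU, so it is overloaded, while on the big CPU 1024 * 117 = 119808 is not below 102400. Two such tasks (204800) are needed to tip the big CPU over.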
> So maybe what we want here instead is to mimic what we have for utilization,
> i.e. initialize to half the spare capacity of the local CPU. IOW,
> conceptually something like this:
>
> ---
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 99249a2484b4..762717092235 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -740,10 +740,8 @@ void init_entity_runnable_average(struct sched_entity *se)
> * Group entities are initialized with zero load to reflect the fact that
> * nothing has been attached to the task group yet.
> */
> - if (entity_is_task(se)) {
> - sa->runnable_avg = SCHED_CAPACITY_SCALE;
> + if (entity_is_task(se))
> sa->load_avg = scale_load_down(se->load.weight);
> - }
>
> /* when this task enqueue'ed, it will contribute to its cfs_rq's load_avg */
> }
> @@ -796,6 +794,8 @@ void post_init_entity_util_avg(struct task_struct *p)
> }
> }
>
> + sa->runnable_avg = sa->util_avg;
> +
> if (p->sched_class != &fair_sched_class) {
> /*
> * For !fair tasks do:
> ---
>
> The current approach has the merit of giving some sort of hint to the LB
> that there is a bunch of new tasks that it could spread out, but I fear it
> is too aggressive.
I agree that defaulting to SCHED_CAPACITY_SCALE is too much for a
little core.
The problem for little cores can be fixed by using the CPU capacity instead:
@@ -796,6 +794,8 @@ void post_init_entity_util_avg(struct task_struct *p)
}
}
+ sa->runnable_avg = cpu_scale;
+
if (p->sched_class != &fair_sched_class) {
/*
* For !fair tasks do:
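
For comparison, a rough standalone sketch of the three initial values discussed in this thread, run through the same overload inequality. All names here are hypothetical and for illustration only. INIT_HALF_SPARE is a simplification of the util_avg initialization (it takes half the spare capacity directly, whereas the real util_avg can end up lower), and the capacity and imbalance_pct used in the note below are assumed values.

```c
#include <stdbool.h>

#define SCHED_CAPACITY_SCALE	1024L

/* Hypothetical names for the three options discussed in the thread. */
enum runnable_init {
	INIT_FULL_SCALE,	/* runnable_avg = SCHED_CAPACITY_SCALE */
	INIT_HALF_SPARE,	/* simplified stand-in for runnable_avg = util_avg */
	INIT_CPU_CAPACITY,	/* runnable_avg = cpu_scale */
};

/* Initial runnable_avg of a new task under each option (sketch only). */
static long initial_runnable_avg(enum runnable_init policy,
				 long cpu_scale, long cfs_util_avg)
{
	long spare = cpu_scale - cfs_util_avg;

	switch (policy) {
	case INIT_FULL_SCALE:
		return SCHED_CAPACITY_SCALE;
	case INIT_HALF_SPARE:
		return spare > 0 ? spare / 2 : 0;
	case INIT_CPU_CAPACITY:
		return cpu_scale;
	}
	return 0;
}

/* Same inequality as the runnable check in group_is_overloaded(). */
static bool runnable_overloaded(long capacity, long runnable, long pct)
{
	return (capacity * pct) < (runnable * 100);
}
```

Assuming an idle LITTLE CPU with capacity 446 and imbalance_pct 117, the three options give 1024, 223 and 446. Only the first trips the check (52182 < 102400); with cpu_scale the comparison is 52182 < 44600, which is false, so a single fork no longer marks the group as overloaded.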
>
> > sa->load_avg = scale_load_down(se->load.weight);
> > + }
> >
> > /* when this task enqueue'ed, it will contribute to its cfs_rq's load_avg */
> > }