[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20190731134355.GC11365@linux.vnet.ibm.com>
Date: Wed, 31 Jul 2019 19:13:55 +0530
From: Srikar Dronamraju <srikar@...ux.vnet.ibm.com>
To: Vincent Guittot <vincent.guittot@...aro.org>
Cc: linux-kernel <linux-kernel@...r.kernel.org>,
Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Quentin Perret <quentin.perret@....com>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Morten Rasmussen <Morten.Rasmussen@....com>,
Phil Auld <pauld@...hat.com>
Subject: Re: [PATCH 3/5] sched/fair: rework load_balance
* Vincent Guittot <vincent.guittot@...aro.org> [2019-07-26 16:42:53]:
> On Fri, 26 Jul 2019 at 15:59, Srikar Dronamraju
> <srikar@...ux.vnet.ibm.com> wrote:
> > > @@ -7361,19 +7357,46 @@ static int detach_tasks(struct lb_env *env)
> > > if (!can_migrate_task(p, env))
> > > goto next;
> > >
> > > - load = task_h_load(p);
> > > + if (env->src_grp_type == migrate_load) {
> > > + unsigned long load = task_h_load(p);
> > >
> > > - if (sched_feat(LB_MIN) && load < 16 && !env->sd->nr_balance_failed)
> > > - goto next;
> > > + if (sched_feat(LB_MIN) &&
> > > + load < 16 && !env->sd->nr_balance_failed)
> > > + goto next;
> > > +
> > > + if ((load / 2) > env->imbalance)
> > > + goto next;
> >
> > I know this existed before too but if the load is exactly or around 2x of
> > env->imbalance, the resultant imbalance after the load balance operation
> > would still be around env->imbalance. We may lose some cache affinity too.
> >
> > Can we do something like.
> > if (2 * load > 3 * env->imbalance)
> > goto next;
>
> TBH, I don't know what should be the best value and it's probably
> worth doing some investigation but i would prefer to do that as a
> separate patch to get a similar behavior in the overloaded case
> Why do you propose 3/2 instead of 2 ?
>
If the imbalance is exactly or around load/2, then we still select the task to migrate
However after the migrate the imbalance will still be load/2.
- Can this lead to ping/pong?
- Did we lose out of cache though we didn't gain from an imbalance.
> > >
> > >
> > > - if (sgs->sum_h_nr_running)
> > > - sgs->load_per_task = sgs->group_load / sgs->sum_h_nr_running;
> > > + sgs->group_capacity = group->sgc->capacity;
> > >
> > > sgs->group_weight = group->group_weight;
> > >
> > > - sgs->group_no_capacity = group_is_overloaded(env, sgs);
> > > - sgs->group_type = group_classify(group, sgs);
> > > + sgs->group_type = group_classify(env, group, sgs);
> > > +
> > > + /* Computing avg_load makes sense only when group is overloaded */
> > > + if (sgs->group_type != group_overloaded)
> > > + sgs->avg_load = (sgs->group_load*SCHED_CAPACITY_SCALE) /
> > > + sgs->group_capacity;
> >
> > Mismatch in comment and code?
>
> I may need to add more comments but at this step, the group should be
> either overloaded or fully busy but it can also be imbalanced.
> In case of a group fully busy or imbalanced (sgs->group_type !=
> group_overloaded), we haven't computed avg_load yet so we have to do
> so because:
> -In the case of fully_busy, we are going to be overloaded which the
> next step after fully busy when you are about to pull more load
> -In case of imbalance, we don't know the real state of the local group
> so we fall back to this default behavior
>
We seem to be checking for avg_load when the group_type is group_overloaded.
But somehow I am don't see where sgs->avg_load is calculated for
group_overloaded case.
> >
> > We calculated avg_load for !group_overloaded case, but seem to be using for
> > group_overloaded cases too.
>
> for group_overloaded case, we already computed it in update_sg_lb_stats()
>
>From update_sg_lb_stats()
if (sgs->group_type != group_overloaded)
sgs->avg_load = (sgs->group_load*SCHED_CAPACITY_SCALE) /
sgs->group_capacity;
So we seem to be skipping calculation of avg_load for group_overloaded. No?
--
Thanks and Regards
Srikar Dronamraju
Powered by blists - more mailing lists