linux-kernel - Re: [PATCH 3/5] sched/fair: rework load

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <20190731134355.GC11365@linux.vnet.ibm.com>
Date:   Wed, 31 Jul 2019 19:13:55 +0530
From:   Srikar Dronamraju <srikar@...ux.vnet.ibm.com>
To:     Vincent Guittot <vincent.guittot@...aro.org>
Cc:     linux-kernel <linux-kernel@...r.kernel.org>,
        Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Quentin Perret <quentin.perret@....com>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Morten Rasmussen <Morten.Rasmussen@....com>,
        Phil Auld <pauld@...hat.com>
Subject: Re: [PATCH 3/5] sched/fair: rework load_balance

* Vincent Guittot <vincent.guittot@...aro.org> [2019-07-26 16:42:53]:

> On Fri, 26 Jul 2019 at 15:59, Srikar Dronamraju
> <srikar@...ux.vnet.ibm.com> wrote:
> > > @@ -7361,19 +7357,46 @@ static int detach_tasks(struct lb_env *env)
> > >               if (!can_migrate_task(p, env))
> > >                       goto next;
> > >
> > > -             load = task_h_load(p);
> > > +             if (env->src_grp_type == migrate_load) {
> > > +                     unsigned long load = task_h_load(p);
> > >
> > > -             if (sched_feat(LB_MIN) && load < 16 && !env->sd->nr_balance_failed)
> > > -                     goto next;
> > > +                     if (sched_feat(LB_MIN) &&
> > > +                         load < 16 && !env->sd->nr_balance_failed)
> > > +                             goto next;
> > > +
> > > +                     if ((load / 2) > env->imbalance)
> > > +                             goto next;
> >
> > I know this existed before too but if the load is exactly or around 2x of
> > env->imbalance, the resultant imbalance after the load balance operation
> > would still be around env->imbalance. We may lose some cache affinity too.
> >
> > Can we do something like.
> >                 if (2 * load > 3 * env->imbalance)
> >                         goto next;
> 
> TBH, I don't know what should be the best value and it's probably
> worth doing some investigation but i would prefer to do that as a
> separate patch to get a similar behavior in the overloaded case
> Why do you propose 3/2 instead of 2 ?
> 

If the imbalance is exactly or around load/2, then we still select the task to migrate
However after the migrate the imbalance will still be load/2.
- Can this lead to ping/pong?
- Did we lose out of cache though we didn't gain from an imbalance.

> > >
> > >
> > > -     if (sgs->sum_h_nr_running)
> > > -             sgs->load_per_task = sgs->group_load / sgs->sum_h_nr_running;
> > > +     sgs->group_capacity = group->sgc->capacity;
> > >
> > >       sgs->group_weight = group->group_weight;
> > >
> > > -     sgs->group_no_capacity = group_is_overloaded(env, sgs);
> > > -     sgs->group_type = group_classify(group, sgs);
> > > +     sgs->group_type = group_classify(env, group, sgs);
> > > +
> > > +     /* Computing avg_load makes sense only when group is overloaded */
> > > +     if (sgs->group_type != group_overloaded)
> > > +             sgs->avg_load = (sgs->group_load*SCHED_CAPACITY_SCALE) /
> > > +                             sgs->group_capacity;
> >
> > Mismatch in comment and code?
> 
> I may need to add more comments but at this step, the group should be
> either overloaded or fully busy but it can also be imbalanced.
> In case of a group fully busy or imbalanced (sgs->group_type !=
> group_overloaded), we haven't computed avg_load yet so we have to do
> so because:
> -In the case of fully_busy, we are going to be overloaded which the
> next step after fully busy when you are about to pull more load
> -In case of imbalance, we don't know the real state of the local group
> so we fall back to this default behavior
> 

We seem to be checking for avg_load when the group_type is group_overloaded.
But somehow I am don't see where sgs->avg_load is calculated for
group_overloaded case.

> >
> > We calculated avg_load for !group_overloaded case, but seem to be using for
> > group_overloaded cases too.
> 
> for group_overloaded case, we already computed it in update_sg_lb_stats()
> 


>From update_sg_lb_stats()

	if (sgs->group_type != group_overloaded)
		sgs->avg_load = (sgs->group_load*SCHED_CAPACITY_SCALE) /
				sgs->group_capacity;

So we seem to be skipping calculation of avg_load for group_overloaded. No?


-- 
Thanks and Regards
Srikar Dronamraju