

Message-Id: <ea65955f-8b90-f739-1ede-db2b0e56ce8d@linux.ibm.com>
Date:   Fri, 18 Oct 2019 11:04:06 +0530
From:   Parth Shah <parth@...ux.ibm.com>
To:     Vincent Guittot <vincent.guittot@...aro.org>
Cc:     linux-kernel <linux-kernel@...r.kernel.org>,
        Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Phil Auld <pauld@...hat.com>,
        Valentin Schneider <valentin.schneider@....com>,
        Srikar Dronamraju <srikar@...ux.vnet.ibm.com>,
        Quentin Perret <quentin.perret@....com>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Morten Rasmussen <Morten.Rasmussen@....com>,
        Hillf Danton <hdanton@...a.com>
Subject: Re: [PATCH v3 04/10] sched/fair: rework load_balance



On 10/16/19 5:26 PM, Vincent Guittot wrote:
> On Wed, 16 Oct 2019 at 09:21, Parth Shah <parth@...ux.ibm.com> wrote:
>>
>>
>>
>> On 9/19/19 1:03 PM, Vincent Guittot wrote:
>>
>> [...]
>>
>>> Signed-off-by: Vincent Guittot <vincent.guittot@...aro.org>
>>> ---
>>>  kernel/sched/fair.c | 585 ++++++++++++++++++++++++++++++++++------------------
>>>  1 file changed, 380 insertions(+), 205 deletions(-)
>>>
>>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>>> index 017aad0..d33379c 100644
>>> --- a/kernel/sched/fair.c
>>> +++ b/kernel/sched/fair.c
>>> @@ -7078,11 +7078,26 @@ static unsigned long __read_mostly max_load_balance_interval = HZ/10;
>>>
>>>  enum fbq_type { regular, remote, all };
>>>
>>> +/*
>>> + * group_type describes the group of CPUs at the moment of the load balance.
>>> + * The enum is ordered by pulling priority, with the group with lowest priority
>>> + * first so the groupe_type can be simply compared when selecting the busiest
>>> + * group. see update_sd_pick_busiest().
>>> + */
>>>  enum group_type {
>>> -     group_other = 0,
>>> +     group_has_spare = 0,
>>> +     group_fully_busy,
>>>       group_misfit_task,
>>> +     group_asym_packing,
>>>       group_imbalanced,
>>> -     group_overloaded,
>>> +     group_overloaded
>>> +};
>>> +
>>> +enum migration_type {
>>> +     migrate_load = 0,
>>> +     migrate_util,
>>> +     migrate_task,
>>> +     migrate_misfit
>>>  };
>>>
>>>  #define LBF_ALL_PINNED       0x01
>>> @@ -7115,7 +7130,7 @@ struct lb_env {
>>>       unsigned int            loop_max;
>>>
>>>       enum fbq_type           fbq_type;
>>> -     enum group_type         src_grp_type;
>>> +     enum migration_type     balance_type;
>>>       struct list_head        tasks;
>>>  };
>>>
>>> @@ -7347,7 +7362,7 @@ static int detach_tasks(struct lb_env *env)
>>>  {
>>>       struct list_head *tasks = &env->src_rq->cfs_tasks;
>>>       struct task_struct *p;
>>> -     unsigned long load;
>>> +     unsigned long util, load;
>>>       int detached = 0;
>>>
>>>       lockdep_assert_held(&env->src_rq->lock);
>>> @@ -7380,19 +7395,53 @@ static int detach_tasks(struct lb_env *env)
>>>               if (!can_migrate_task(p, env))
>>>                       goto next;
>>>
>>> -             load = task_h_load(p);
>>> +             switch (env->balance_type) {
>>> +             case migrate_load:
>>> +                     load = task_h_load(p);
>>>
>>> -             if (sched_feat(LB_MIN) && load < 16 && !env->sd->nr_balance_failed)
>>> -                     goto next;
>>> +                     if (sched_feat(LB_MIN) &&
>>> +                         load < 16 && !env->sd->nr_balance_failed)
>>> +                             goto next;
>>>
>>> -             if ((load / 2) > env->imbalance)
>>> -                     goto next;
>>> +                     if ((load / 2) > env->imbalance)
>>> +                             goto next;
>>> +
>>> +                     env->imbalance -= load;
>>> +                     break;
>>> +
>>> +             case migrate_util:
>>> +                     util = task_util_est(p);
>>> +
>>> +                     if (util > env->imbalance)
>>
>> Can you please explain what would happen for
>> `if (util/2 > env->imbalance)` ?
>> Just as when migrating load, util shouldn't be migrated if
>> env->imbalance is only near the utilization of the task being moved, should it?
> 
> I have chosen util and not util/2 to be conservative because
> migrate_util is used to fill spare capacity.
> With `if (util/2 > env->imbalance)`, we can more easily overload the
> local group or pick too much utilization from the overloaded group.
> 

Fair enough. I missed the point that, unlike with migrate_load, with
migrate_util env->imbalance is just the spare capacity of the local group.

Thanks,
Parth

>>
>>> +                             goto next;
>>> +
>>> +                     env->imbalance -= util;
>>> +                     break;
>>> +[ ... ]
>>
>> Thanks,
>> Parth
>>
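
To make the difference between the two checks concrete, here is a minimal
user-space sketch of the skip conditions discussed above. It is not kernel
code: the helper names (skip_for_load, skip_for_util, skip_for_half_util,
overshoot) are hypothetical and exist only for illustration, and the values
are plain unsigned longs rather than anything read from struct lb_env.

```c
#include <stdbool.h>

/*
 * Simplified model of the skip checks quoted from detach_tasks().
 * All helpers here are hypothetical illustrations, not kernel functions.
 */

/* migrate_load: skip the task when half of its load exceeds the imbalance. */
static bool skip_for_load(unsigned long load, unsigned long imbalance)
{
	return (load / 2) > imbalance;
}

/* migrate_util as posted: skip when the task's full util exceeds the imbalance. */
static bool skip_for_util(unsigned long util, unsigned long imbalance)
{
	return util > imbalance;
}

/* The alternative raised in review: compare util/2 against the imbalance. */
static bool skip_for_half_util(unsigned long util, unsigned long imbalance)
{
	return (util / 2) > imbalance;
}

/* Amount by which a moved task's util exceeds the remaining spare capacity. */
static unsigned long overshoot(unsigned long util, unsigned long spare)
{
	return util > spare ? util - spare : 0;
}
```

With a task of util 300 and 200 units of spare capacity left in the local
group, skip_for_util(300, 200) is true, so the task stays put, while
skip_for_half_util(300, 200) is false, so the task would be moved and the
local group would end up 100 units over its spare capacity.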