Message-ID: <b98ae93b-80f7-a4ca-0c4d-9d6c166055a7@arm.com>
Date: Fri, 26 Jul 2019 15:09:24 +0100
From: Valentin Schneider <valentin.schneider@....com>
To: Srikar Dronamraju <srikar@...ux.vnet.ibm.com>,
Vincent Guittot <vincent.guittot@...aro.org>
Cc: linux-kernel@...r.kernel.org, mingo@...hat.com,
peterz@...radead.org, quentin.perret@....com,
dietmar.eggemann@....com, Morten.Rasmussen@....com,
pauld@...hat.com
Subject: Re: [PATCH 3/5] sched/fair: rework load_balance
On 26/07/2019 14:58, Srikar Dronamraju wrote:
[...]
>> @@ -8357,72 +8318,115 @@ static inline void calculate_imbalance(struct lb_env *env, struct sd_lb_stats *s
>> if (busiest->group_type == group_imbalanced) {
>> /*
>> * In the group_imb case we cannot rely on group-wide averages
>> - * to ensure CPU-load equilibrium, look at wider averages. XXX
>> + * to ensure CPU-load equilibrium, try to move any task to fix
>> + * the imbalance. The next load balance will take care of
>> + * balancing back the system.
>> */
>> - busiest->load_per_task =
>> - min(busiest->load_per_task, sds->avg_load);
>> + env->src_grp_type = migrate_task;
>> + env->imbalance = 1;
>> + return;
>> }
>>
>> - /*
>> - * Avg load of busiest sg can be less and avg load of local sg can
>> - * be greater than avg load across all sgs of sd because avg load
>> - * factors in sg capacity and sgs with smaller group_type are
>> - * skipped when updating the busiest sg:
>> - */
>> - if (busiest->group_type != group_misfit_task &&
>> - (busiest->avg_load <= sds->avg_load ||
>> - local->avg_load >= sds->avg_load)) {
>> - env->imbalance = 0;
>> - return fix_small_imbalance(env, sds);
>> + if (busiest->group_type == group_misfit_task) {
>> + /* Set imbalance to allow misfit task to be balanced. */
>> + env->src_grp_type = migrate_misfit;
>> + env->imbalance = busiest->group_misfit_task_load;
>> + return;
>> }
>>
>> /*
>> - * If there aren't any idle CPUs, avoid creating some.
>> + * Try to use spare capacity of local group without overloading it or
>> + * emptying busiest
>> */
>> - if (busiest->group_type == group_overloaded &&
>> - local->group_type == group_overloaded) {
>> - load_above_capacity = busiest->sum_h_nr_running * SCHED_CAPACITY_SCALE;
>> - if (load_above_capacity > busiest->group_capacity) {
>> - load_above_capacity -= busiest->group_capacity;
>> - load_above_capacity *= scale_load_down(NICE_0_LOAD);
>> - load_above_capacity /= busiest->group_capacity;
>> - } else
>> - load_above_capacity = ~0UL;
>> + if (local->group_type == group_has_spare) {
>> + long imbalance;
>> +
>> + /*
>> + * If there is no overload, we just want to even the number of
>> + * idle cpus.
>> + */
>> + env->src_grp_type = migrate_task;
>> + imbalance = max_t(long, 0, (local->idle_cpus - busiest->idle_cpus) >> 1);
>
> Shouldn't this be:
> imbalance = max_t(long, 0, (busiest->idle_cpus - local->idle_cpus) >> 1);
I think it's the right way around - if busiest has more idle CPUs than local,
then we shouldn't balance (local is busier than busiest).

However, doesn't that lead to an imbalance of 0 when e.g. local has 1 idle
CPU and busiest has 0? If busiest has more than one task, we should try
to pull at least one.
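
For illustration, here's a minimal userspace sketch of that arithmetic
(idle_imbalance() is a made-up helper mirroring the max_t() expression in
the hunk above, not actual kernel code):

  #include <stdio.h>

  /*
   * Hypothetical standalone mirror of:
   *   imbalance = max_t(long, 0,
   *                     (local->idle_cpus - busiest->idle_cpus) >> 1);
   */
  static long idle_imbalance(long local_idle, long busiest_idle)
  {
          long diff = (local_idle - busiest_idle) >> 1;

          return diff > 0 ? diff : 0;
  }

  int main(void)
  {
          /* local has 2 idle CPUs, busiest has 0: imbalance = 1 */
          printf("%ld\n", idle_imbalance(2, 0));  /* 1 */

          /*
           * local has 1 idle CPU, busiest has 0: (1 - 0) >> 1 == 0,
           * so nothing is pulled even if busiest still has more than
           * one runnable task.
           */
          printf("%ld\n", idle_imbalance(1, 0));  /* 0 */

          /* busiest has more idle CPUs than local: clamped to 0 */
          printf("%ld\n", idle_imbalance(0, 2));  /* 0 */

          return 0;
  }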