Message-ID: <b98ae93b-80f7-a4ca-0c4d-9d6c166055a7@arm.com>
Date: Fri, 26 Jul 2019 15:09:24 +0100
From: Valentin Schneider <valentin.schneider@....com>
To: Srikar Dronamraju <srikar@...ux.vnet.ibm.com>,
Vincent Guittot <vincent.guittot@...aro.org>
Cc: linux-kernel@...r.kernel.org, mingo@...hat.com,
peterz@...radead.org, quentin.perret@....com,
dietmar.eggemann@....com, Morten.Rasmussen@....com,
pauld@...hat.com
Subject: Re: [PATCH 3/5] sched/fair: rework load_balance
On 26/07/2019 14:58, Srikar Dronamraju wrote:
[...]
>> @@ -8357,72 +8318,115 @@ static inline void calculate_imbalance(struct lb_env *env, struct sd_lb_stats *s
>> if (busiest->group_type == group_imbalanced) {
>> /*
>> * In the group_imb case we cannot rely on group-wide averages
>> - * to ensure CPU-load equilibrium, look at wider averages. XXX
>> + * to ensure CPU-load equilibrium, try to move any task to fix
>> + * the imbalance. The next load balance will take care of
>> + * balancing back the system.
>> */
>> - busiest->load_per_task =
>> - min(busiest->load_per_task, sds->avg_load);
>> + env->src_grp_type = migrate_task;
>> + env->imbalance = 1;
>> + return;
>> }
>>
>> - /*
>> - * Avg load of busiest sg can be less and avg load of local sg can
>> - * be greater than avg load across all sgs of sd because avg load
>> - * factors in sg capacity and sgs with smaller group_type are
>> - * skipped when updating the busiest sg:
>> - */
>> - if (busiest->group_type != group_misfit_task &&
>> - (busiest->avg_load <= sds->avg_load ||
>> - local->avg_load >= sds->avg_load)) {
>> - env->imbalance = 0;
>> - return fix_small_imbalance(env, sds);
>> + if (busiest->group_type == group_misfit_task) {
>> + /* Set imbalance to allow misfit task to be balanced. */
>> + env->src_grp_type = migrate_misfit;
>> + env->imbalance = busiest->group_misfit_task_load;
>> + return;
>> }
>>
>> /*
>> - * If there aren't any idle CPUs, avoid creating some.
>> + * Try to use spare capacity of local group without overloading it or
>> + * emptying busiest
>> */
>> - if (busiest->group_type == group_overloaded &&
>> - local->group_type == group_overloaded) {
>> - load_above_capacity = busiest->sum_h_nr_running * SCHED_CAPACITY_SCALE;
>> - if (load_above_capacity > busiest->group_capacity) {
>> - load_above_capacity -= busiest->group_capacity;
>> - load_above_capacity *= scale_load_down(NICE_0_LOAD);
>> - load_above_capacity /= busiest->group_capacity;
>> - } else
>> - load_above_capacity = ~0UL;
>> + if (local->group_type == group_has_spare) {
>> + long imbalance;
>> +
>> + /*
>> + * If there is no overload, we just want to even the number of
>> + * idle cpus.
>> + */
>> + env->src_grp_type = migrate_task;
>> + imbalance = max_t(long, 0, (local->idle_cpus - busiest->idle_cpus) >> 1);
>
> Shouldn't this be:
> imbalance = max_t(long, 0, (busiest->idle_cpus - local->idle_cpus) >> 1);
I think it's the right way around - if busiest has more idle CPUs than local,
then we shouldn't balance (local is busier than busiest).

However, doesn't that lead to an imbalance of 0 when e.g. local has 1 idle
CPU and busiest has 0? If busiest has more than one task, we should try
to pull at least one.
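
For illustration, here's a minimal userspace sketch of that arithmetic
(idle_imbalance() is a made-up helper mirroring the max_t() expression in
the hunk above, not actual kernel code):

  #include <stdio.h>

  /*
   * Hypothetical standalone mirror of:
   *   imbalance = max_t(long, 0,
   *                     (local->idle_cpus - busiest->idle_cpus) >> 1);
   */
  static long idle_imbalance(long local_idle, long busiest_idle)
  {
          long diff = (local_idle - busiest_idle) >> 1;

          return diff > 0 ? diff : 0;
  }

  int main(void)
  {
          /* local has 2 idle CPUs, busiest has 0: imbalance = 1 */
          printf("%ld\n", idle_imbalance(2, 0));  /* 1 */

          /*
           * local has 1 idle CPU, busiest has 0: (1 - 0) >> 1 == 0,
           * so nothing is pulled even if busiest still has more than
           * one runnable task.
           */
          printf("%ld\n", idle_imbalance(1, 0));  /* 0 */

          /* busiest has more idle CPUs than local: clamped to 0 */
          printf("%ld\n", idle_imbalance(0, 2));  /* 0 */

          return 0;
  }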