linux-kernel - Re: [RFCv3 PATCH 44/48] sched: Tipping point from energy-aware to conventional load balancing


Message-ID: <5511B157.6030200@arm.com>
Date:	Tue, 24 Mar 2015 18:47:51 +0000
From:	Dietmar Eggemann <dietmar.eggemann@....com>
To:	Peter Zijlstra <peterz@...radead.org>,
	Morten Rasmussen <Morten.Rasmussen@....com>
CC:	"mingo@...hat.com" <mingo@...hat.com>,
	"vincent.guittot@...aro.org" <vincent.guittot@...aro.org>,
	"yuyang.du@...el.com" <yuyang.du@...el.com>,
	"preeti@...ux.vnet.ibm.com" <preeti@...ux.vnet.ibm.com>,
	"mturquette@...aro.org" <mturquette@...aro.org>,
	"nico@...aro.org" <nico@...aro.org>,
	"rjw@...ysocki.net" <rjw@...ysocki.net>,
	Juri Lelli <Juri.Lelli@....com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [RFCv3 PATCH 44/48] sched: Tipping point from energy-aware to
 conventional load balancing

On 24/03/15 15:26, Peter Zijlstra wrote:
> On Wed, Feb 04, 2015 at 06:31:21PM +0000, Morten Rasmussen wrote:
>> From: Dietmar Eggemann <dietmar.eggemann@....com>
>>
>> Energy-aware load balancing bases on cpu usage so the upper bound of its
>> operational range is a fully utilized cpu. Above this tipping point it
>> makes more sense to use weighted_cpuload to preserve smp_nice.
>> This patch implements the tipping point detection in update_sg_lb_stats
>> as if one cpu is over-utilized the current energy-aware load balance
>> operation will fall back into the conventional weighted load based one.
>>
>> cc: Ingo Molnar <mingo@...hat.com>
>> cc: Peter Zijlstra <peterz@...radead.org>
>>
>> Signed-off-by: Dietmar Eggemann <dietmar.eggemann@....com>
>> ---
>>   kernel/sched/fair.c | 4 ++++
>>   1 file changed, 4 insertions(+)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 6b79603..4849bad 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -6723,6 +6723,10 @@ static inline void update_sg_lb_stats(struct lb_env *env,
>>   		sgs->sum_weighted_load += weighted_cpuload(i);
>>   		if (idle_cpu(i))
>>   			sgs->idle_cpus++;
>> +
>> +		/* If cpu is over-utilized, bail out of ea */
>> +		if (env->use_ea && cpu_overutilized(i, env->sd))
>> +			env->use_ea = false;
>>   	}
>
> I don't immediately see why this is desired. Why would a single
> overloaded CPU be reason to quit? It could be the cpus simply aren't
> 'balanced' right and the group as a whole is still under utilized.

We want to play it safe here.

E.g. in a system with more than two clusters, the over-utilized cpu
could be running more than one high-priority task on a cluster of
energy-efficient cpus, and that cluster could still fail to become the
lb src at DIE level, because a cluster of less energy-efficient cpus
(burning more energy) that is not over-utilized could be chosen instead.
We could also construct cases where the other cpus in the
energy-efficient cluster can't help the over-utilized cpu during lb at
MC level.

I can see, though, that using per-cpu data in code that deals with sg's
goes against the sd scalability design, where we should rely on per-sg
rather than per-cpu data.

By bailing out in such a scenario we at least guarantee smpnice as
provided by conventional CFS load balancing.
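
To make the bail-out concrete, here is a minimal userspace sketch of the
check, not the kernel code itself. The toy_* names are hypothetical
stand-ins, and the over-utilization test (usage >= capacity) is an
assumption; the real cpu_overutilized() is defined elsewhere in the
series and may apply a margin. The sketch also sums usage and capacity
per group, to show that one over-utilized cpu clears use_ea even when
the group as a whole is still under-utilized.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel state, not the real structures. */
struct toy_cpu {
	unsigned long usage;	/* current usage, same scale as capacity */
	unsigned long capacity;	/* compute capacity of this cpu */
};

struct toy_lb_env {
	bool use_ea;		/* true while the pass is energy-aware */
};

/*
 * Assumed definition: a cpu is over-utilized once its usage reaches its
 * capacity. The real cpu_overutilized() may differ.
 */
static bool toy_cpu_overutilized(const struct toy_cpu *cpu)
{
	return cpu->usage >= cpu->capacity;
}

/*
 * Mirrors the per-cpu loop of the patch: accumulate group statistics
 * and clear use_ea as soon as one cpu is over-utilized.
 */
static void toy_update_sg_stats(struct toy_lb_env *env,
				const struct toy_cpu *cpus, size_t nr_cpus,
				unsigned long *group_usage,
				unsigned long *group_capacity)
{
	size_t i;

	assert(cpus != NULL && nr_cpus > 0);

	*group_usage = 0;
	*group_capacity = 0;

	for (i = 0; i < nr_cpus; i++) {
		*group_usage += cpus[i].usage;
		*group_capacity += cpus[i].capacity;

		/* If cpu is over-utilized, bail out of ea */
		if (env->use_ea && toy_cpu_overutilized(&cpus[i]))
			env->use_ea = false;
	}
}
```

With four cpus of capacity 1024 and usages 1024, 100, 50 and 0, the
group usage is far below the group capacity, yet use_ea ends up false
because of the first cpu. That is exactly the case questioned above, and
the one where we prefer to fall back to weighted load.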

We could also favor an sg with an over-utilized cpu as the src, but
which one do we pick if there are multiple potential src sg's with an
over-utilized cpu?

>
> In that case we want to continue the balance pass to reach this
> equilibrium.
>
