Date:   Thu, 13 Sep 2018 20:22:03 -0700
From:   Valentin Schneider <valentin.schneider@....com>
To:     Vincent Guittot <vincent.guittot@...aro.org>, peterz@...radead.org,
        mingo@...nel.org, linux-kernel@...r.kernel.org
Cc:     Morten.Rasmussen@....com
Subject: Re: [PATCH v2] sched/fair: fix 1 task per CPU

Hi,

On 10/09/18 07:43, Vincent Guittot wrote:
> When CPUs have different capacities because of RT/DL tasks, micro-architecture
> or max frequency differences, there are situations where the imbalance is not
> correctly set to migrate a waiting task onto the idle CPU.
> 
> The use case goes through the force_balance case:
> 	if (env->idle != CPU_NOT_IDLE && group_has_capacity(env, local) &&
> 	    busiest->group_no_capacity)
> 		goto force_balance;
> 
> But calculate_imbalance fails to set the right amount of load to migrate
> a task because of the special condition:
>   (busiest->avg_load <= sds->avg_load || local->avg_load >= sds->avg_load)
> 
> Handle in fix_small_imbalance() the special case that triggered the force
> balance, in order to make sure that the amount of load to migrate will be
> enough.
> 
> Signed-off-by: Vincent Guittot <vincent.guittot@...aro.org>
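
For anyone following along, the special condition mentioned above sits in
calculate_imbalance(), which currently bails out into fix_small_imbalance()
with a zero imbalance - roughly like this (paraphrased from memory of the
current fair.c, not a verbatim quote):

	static inline void
	calculate_imbalance(struct lb_env *env, struct sd_lb_stats *sds)
	{
		struct sg_lb_stats *local = &sds->local_stat;
		struct sg_lb_stats *busiest = &sds->busiest_stat;

		/* ... */

		/*
		 * avg_load factors in group capacity, so with asymmetric
		 * capacities the busiest group can report an avg_load at or
		 * below the domain average (or the local group at or above
		 * it) even though it is overloaded.
		 */
		if (busiest->avg_load <= sds->avg_load ||
		    local->avg_load >= sds->avg_load) {
			/* Give up on computing an imbalance from avg_load... */
			env->imbalance = 0;
			/* ...and let fix_small_imbalance() have a go instead. */
			return fix_small_imbalance(env, sds);
		}

		/* ... */
	}

That path is presumably why the new check has to live in fix_small_imbalance():
by the time we get there, env->imbalance has already been zeroed.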

Other than the commit message nit below, LGTM. Out of curiosity I ran some
kernel compiles on my HiKey960 (-j8) but didn't see much change - something
along the lines of a ~1% speedup. Although it was consistent over a few
iterations, I'd need a whole lot more of them to back this up.

I kind of expected that: some sporadic task can show up and tip the scales in
the right direction, so even without the patch the situation can "fix itself"
eventually, which makes the difference less noticeable on really long
workloads.

I do see a difference in the trace of a simple rt-app workload of 8
always-running (100% utilization) tasks though: I no longer see the idling
LITTLE CPU I sometimes get without the patch, which is what we expect, so:

Tested-by: Valentin Schneider <valentin.schneider@....com>

> ---

Again, I'd argue for a slightly more explicit header. As you pointed out in
v1, it's not just long-running tasks, so maybe "fix 1 *running* task per
CPU"? Otherwise I feel it's a tad obscure.

>  kernel/sched/fair.c | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 309c93f..72bc5e8 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -8048,6 +8048,20 @@ void fix_small_imbalance(struct lb_env *env, struct sd_lb_stats *sds)
>  	local = &sds->local_stat;
>  	busiest = &sds->busiest_stat;
>  
> +	/*
> +	 * There is available capacity in local group and busiest group is
> +	 * overloaded but calculate_imbalance can't compute the amount of load
> +	 * to migrate because load_avg became meaningless due to asymmetric
> +	 * capacity between groups.

Could you add something along the lines of "(see similar condition in
find_busiest_group())"?

> +	 * In such a case, we only want to migrate at
> +	 * least one task of the busiest group and rely on the average load
> +	 * per task to ensure the migration.
> +	 */
> +	if (env->idle != CPU_NOT_IDLE && group_has_capacity(env, local) &&
> +	    busiest->group_no_capacity) {
> +		env->imbalance = busiest->load_per_task;
> +		return;
> +	}
> +
>  	if (!local->sum_nr_running)
>  		local->load_per_task = cpu_avg_load_per_task(env->dst_cpu);
>  	else if (busiest->load_per_task > local->load_per_task)
> 
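
For completeness, the reason setting env->imbalance to busiest->load_per_task
is enough to "ensure the migration" of at least one task: detach_tasks()
compares each candidate task's load against the remaining imbalance, along
these lines (again paraphrasing fair.c from memory, not a verbatim quote):

	/* In detach_tasks(), for each candidate task p on the busiest rq: */
	load = task_h_load(p);

	/*
	 * With env->imbalance set to the busiest group's load_per_task, a
	 * task of roughly average load passes this check, so at least one
	 * task gets detached and pulled over to the idle CPU.
	 */
	if ((load / 2) > env->imbalance)
		goto next;

	detach_task(p, env);
	list_add(&p->se.group_node, &env->tasks);
	env->imbalance -= load;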
