linux-kernel - Re: [RFC PATCH 3/4] sched/topology: remove smt

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <ca95124f-db09-c38f-d96a-5240bfe8c024@arm.com>
Date:   Wed, 29 Aug 2018 15:08:38 +0100
From:   Qais Yousef <qais.yousef@....com>
To:     Vincent Guittot <vincent.guittot@...aro.org>, peterz@...radead.org,
        mingo@...hat.com, linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH 3/4] sched/topology: remove smt_gain

On 29/08/18 14:19, Vincent Guittot wrote:
> smt_gain is used to compute the capacity of CPUs of a SMT core with the
> constraint 1 < smt_gain < 2 in order to be able to compute number of CPUs
> per core. The field has_free_capacity of struct numa_stat, which was the
> last user of this computation of number of CPUs per core, has been removed
> by :
>
>    commit 2d4056fafa19 ("sched/numa: Remove numa_has_capacity()")
>
> We can now remove this constraint on core capacity and use the defautl value
> SCHED_CAPACITY_SCALE for SMT CPUs. With this remove, SCHED_CAPACITY_SCALE
> becomes the maximum compute capacity of CPUs on every systems. This should
> help to simplify some code and remove fields like rd->max_cpu_capacity
>
> Furthermore, arch_scale_cpu_capacity() is used with a NULL sd in several other
> places in the code when it wants the capacity of a CPUs to scale
> some metrics like in pelt, deadline or schedutil. In case on SMT, the value
> returned is not the capacity of SMT CPUs but default SCHED_CAPACITY_SCALE.
>
> Remove the smt_gain field from sched_domain struct

You beat me to it, I got confused by smt_gain recently when I stumbled 
on it as I found out on ARM it's not used and had to spend sometime to 
convince myself it's not really necessary to use it.

It was hard to track the history of this and *why* it's needed.

The only 'theoretical' case I found smt_gain can be useful is when you 
have asymmetric system, for example:

Node_A: 1 Core 2 Threads
Node_B: 1 Core 4 Threads

Then with smt_gain the group_capacity at the core level will be limited 
to smt_gain. But without smt_gain Node_B will look twice as powerful as 
Node_A - which will affect balancing AFAICT causing Node_B's single core 
to be oversubscribed as the 4 threads will still have to share the same 
underlying hardware resources. I don't think in practice such systems 
exists, or even make sense, though.

So +1 from my side for the removal.

> Cc: Peter Zijlstra <peterz@...radead.org>
> Cc: Ingo Molnar <mingo@...hat.com>
> Cc: linux-kernel@...r.kernel.org (open list)
> Signed-off-by: Vincent Guittot <vincent.guittot@...aro.org>
> ---
>   include/linux/sched/topology.h | 1 -
>   kernel/sched/sched.h           | 3 ---
>   kernel/sched/topology.c        | 2 --
>   3 files changed, 6 deletions(-)
>
> diff --git a/include/linux/sched/topology.h b/include/linux/sched/topology.h
> index 2634774..212792b 100644
> --- a/include/linux/sched/topology.h
> +++ b/include/linux/sched/topology.h
> @@ -89,7 +89,6 @@ struct sched_domain {
>   	unsigned int newidle_idx;
>   	unsigned int wake_idx;
>   	unsigned int forkexec_idx;
> -	unsigned int smt_gain;
>   
>   	int nohz_idle;			/* NOHZ IDLE status */
>   	int flags;			/* See SD_* */
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index 4a2e8ca..b1715b8 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -1758,9 +1758,6 @@ unsigned long arch_scale_freq_capacity(int cpu)
>   static __always_inline
>   unsigned long arch_scale_cpu_capacity(struct sched_domain *sd, int cpu)
>   {
> -	if (sd && (sd->flags & SD_SHARE_CPUCAPACITY) && (sd->span_weight > 1))
> -		return sd->smt_gain / sd->span_weight;
> -
>   	return SCHED_CAPACITY_SCALE;
>   }
>   #endif
> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> index 56a0fed..069c924 100644
> --- a/kernel/sched/topology.c
> +++ b/kernel/sched/topology.c
> @@ -1129,7 +1129,6 @@ sd_init(struct sched_domain_topology_level *tl,
>   
>   		.last_balance		= jiffies,
>   		.balance_interval	= sd_weight,
> -		.smt_gain		= 0,
>   		.max_newidle_lb_cost	= 0,
>   		.next_decay_max_lb_cost	= jiffies,
>   		.child			= child,
> @@ -1155,7 +1154,6 @@ sd_init(struct sched_domain_topology_level *tl,
>   	if (sd->flags & SD_SHARE_CPUCAPACITY) {
>   		sd->flags |= SD_PREFER_SIBLING;
>   		sd->imbalance_pct = 110;
> -		sd->smt_gain = 1178; /* ~15% */
>   
>   	} else if (sd->flags & SD_SHARE_PKG_RESOURCES) {
>   		sd->flags |= SD_PREFER_SIBLING;


-- 
Qais Yousef