linux-kernel - Re: [PATCH 1/4] sched/fair: Simplify asym

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <Y05zDJ7ecDJphdFS@hirez.programming.kicks-ass.net>
Date:   Tue, 18 Oct 2022 11:34:04 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     Ricardo Neri <ricardo.neri-calderon@...ux.intel.com>
Cc:     Juri Lelli <juri.lelli@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Ricardo Neri <ricardo.neri@...el.com>,
        "Ravi V. Shankar" <ravi.v.shankar@...el.com>,
        Ben Segall <bsegall@...gle.com>,
        Daniel Bristot de Oliveira <bristot@...hat.com>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Len Brown <len.brown@...el.com>, Mel Gorman <mgorman@...e.de>,
        "Rafael J. Wysocki" <rafael.j.wysocki@...el.com>,
        Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Tim Chen <tim.c.chen@...ux.intel.com>,
        Valentin Schneider <vschneid@...hat.com>, x86@...nel.org,
        linux-kernel@...r.kernel.org, "Tim C . Chen" <tim.c.chen@...el.com>
Subject: Re: [PATCH 1/4] sched/fair: Simplify asym_packing logic for SMT
 sched groups

On Thu, Aug 25, 2022 at 03:55:26PM -0700, Ricardo Neri wrote:
> When the destination CPU is an SMT sibling and idle, it can only help the
> busiest group if all of its other SMT siblings are also idle. Otherwise,
> there is not increase in throughput.
> 
> It does not matter whether the busiest group has SMT siblings. Simply
> check if there are any tasks running on the local group before proceeding.

> Reviewed-by: Len Brown <len.brown@...el.com>
> Signed-off-by: Ricardo Neri <ricardo.neri-calderon@...ux.intel.com>
> ---
>  kernel/sched/fair.c | 29 +++++++++--------------------
>  1 file changed, 9 insertions(+), 20 deletions(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 77b2048a9326..91f271ea02d2 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -8603,12 +8603,10 @@ static bool asym_smt_can_pull_tasks(int dst_cpu, struct sd_lb_stats *sds,
>  				    struct sched_group *sg)
>  {
>  #ifdef CONFIG_SCHED_SMT
> -	bool local_is_smt, sg_is_smt;
> +	bool local_is_smt;
>  	int sg_busy_cpus;
>  
>  	local_is_smt = sds->local->flags & SD_SHARE_CPUCAPACITY;
> -	sg_is_smt = sg->flags & SD_SHARE_CPUCAPACITY;
> -
>  	sg_busy_cpus = sgs->group_weight - sgs->idle_cpus;
>  
>  	if (!local_is_smt) {
> @@ -8629,25 +8627,16 @@ static bool asym_smt_can_pull_tasks(int dst_cpu, struct sd_lb_stats *sds,
>  		return sched_asym_prefer(dst_cpu, sg->asym_prefer_cpu);
>  	}
>  
> -	/* @dst_cpu has SMT siblings. */
> -
> -	if (sg_is_smt) {
> -		int local_busy_cpus = sds->local->group_weight -
> -				      sds->local_stat.idle_cpus;
> -		int busy_cpus_delta = sg_busy_cpus - local_busy_cpus;
> -
> -		if (busy_cpus_delta == 1)
> -			return sched_asym_prefer(dst_cpu, sg->asym_prefer_cpu);
> -
> -		return false;
> -	}
> -
>  	/*
> -	 * @sg does not have SMT siblings. Ensure that @sds::local does not end
> -	 * up with more than one busy SMT sibling and only pull tasks if there
> -	 * are not busy CPUs (i.e., no CPU has running tasks).
> +	 * @dst_cpu has SMT siblings. When both @dst_cpu and the busiest core
> +	 * have one or more busy siblings, moving tasks between them results
> +	 * in the same throughput. Only if all the siblings of @dst_cpu are
> +	 * idle throughput can increase.
> +	 *
> +	 * If the difference in the number of busy CPUs is two or more, let
> +	 * find_busiest_group() take care of it.
>  	 */
> -	if (!sds->local_stat.sum_nr_running)
> +	if (sg_busy_cpus == 1 && !sds->local_stat.sum_nr_running)
>  		return sched_asym_prefer(dst_cpu, sg->asym_prefer_cpu);
>  

I can't follow this logic; doesn't this hard assume SMT2 at the very
least? The case for Power7 with SMT8 is that SMT1 is faster than SMT2 is
faster than SMT4, only once you have more than 4 threads active it no
longer matters.