linux-kernel - Re: [Patch v2 2/6] sched/topology: Record number of cores in sched group

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20230612112945.GK4253@hirez.programming.kicks-ass.net>
Date:   Mon, 12 Jun 2023 13:29:45 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     Tim Chen <tim.c.chen@...ux.intel.com>
Cc:     Juri Lelli <juri.lelli@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Ricardo Neri <ricardo.neri@...el.com>,
        "Ravi V . Shankar" <ravi.v.shankar@...el.com>,
        Ben Segall <bsegall@...gle.com>,
        Daniel Bristot de Oliveira <bristot@...hat.com>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Len Brown <len.brown@...el.com>, Mel Gorman <mgorman@...e.de>,
        "Rafael J . Wysocki" <rafael.j.wysocki@...el.com>,
        Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Valentin Schneider <vschneid@...hat.com>,
        Ionela Voinescu <ionela.voinescu@....com>, x86@...nel.org,
        linux-kernel@...r.kernel.org,
        Shrikanth Hegde <sshegde@...ux.vnet.ibm.com>,
        Srikar Dronamraju <srikar@...ux.vnet.ibm.com>,
        naveen.n.rao@...ux.vnet.ibm.com,
        Yicong Yang <yangyicong@...ilicon.com>,
        Barry Song <v-songbaohua@...o.com>,
        Chen Yu <yu.c.chen@...el.com>, Hillf Danton <hdanton@...a.com>
Subject: Re: [Patch v2 2/6] sched/topology: Record number of cores in sched
 group

On Thu, Jun 08, 2023 at 03:32:28PM -0700, Tim Chen wrote:
> From: Tim C Chen <tim.c.chen@...ux.intel.com>
> 
> When balancing sibling domains that have different number of cores,
> tasks in respective sibling domain should be proportional to the number
> of cores in each domain. In preparation of implementing such a policy,
> record the number of tasks in a scheduling group.
> 
> Signed-off-by: Tim Chen <tim.c.chen@...ux.intel.com>
> ---
>  kernel/sched/sched.h    |  1 +
>  kernel/sched/topology.c | 10 +++++++++-
>  2 files changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index 3d0eb36350d2..5f7f36e45b87 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -1860,6 +1860,7 @@ struct sched_group {
>  	atomic_t		ref;
>  
>  	unsigned int		group_weight;
> +	unsigned int		cores;
>  	struct sched_group_capacity *sgc;
>  	int			asym_prefer_cpu;	/* CPU of highest priority in group */
>  	int			flags;
> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> index 6d5628fcebcf..6b099dbdfb39 100644
> --- a/kernel/sched/topology.c
> +++ b/kernel/sched/topology.c
> @@ -1275,14 +1275,22 @@ build_sched_groups(struct sched_domain *sd, int cpu)
>  static void init_sched_groups_capacity(int cpu, struct sched_domain *sd)
>  {
>  	struct sched_group *sg = sd->groups;
> +	struct cpumask *mask = sched_domains_tmpmask2;
>  
>  	WARN_ON(!sg);
>  
>  	do {
> -		int cpu, max_cpu = -1;
> +		int cpu, cores = 0, max_cpu = -1;
>  
>  		sg->group_weight = cpumask_weight(sched_group_span(sg));
>  
> +		cpumask_copy(mask, sched_group_span(sg));
> +		for_each_cpu(cpu, mask) {
> +			cores++;
> +			cpumask_andnot(mask, mask, cpu_smt_mask(cpu));
> +		}
> +		sg->cores = cores;
> +
>  		if (!(sd->flags & SD_ASYM_PACKING))
>  			goto next;

Just a note; not sure we want or can do anything about this, but
consider someone doing partitions like:

	[0,1] [2,3] [3,6]
	[------] [------]

That is, 3 SMT cores, and 2 partitions splitting an SMT core in two.

Then the domain trees will see either 2 or 3 but not the fully core.

I'm perfectly fine with saying: don't do that then.