Message-ID: <20230612112945.GK4253@hirez.programming.kicks-ass.net>
Date: Mon, 12 Jun 2023 13:29:45 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Tim Chen <tim.c.chen@...ux.intel.com>
Cc: Juri Lelli <juri.lelli@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>,
Ricardo Neri <ricardo.neri@...el.com>,
"Ravi V . Shankar" <ravi.v.shankar@...el.com>,
Ben Segall <bsegall@...gle.com>,
Daniel Bristot de Oliveira <bristot@...hat.com>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Len Brown <len.brown@...el.com>, Mel Gorman <mgorman@...e.de>,
"Rafael J . Wysocki" <rafael.j.wysocki@...el.com>,
Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>,
Steven Rostedt <rostedt@...dmis.org>,
Valentin Schneider <vschneid@...hat.com>,
Ionela Voinescu <ionela.voinescu@....com>, x86@...nel.org,
linux-kernel@...r.kernel.org,
Shrikanth Hegde <sshegde@...ux.vnet.ibm.com>,
Srikar Dronamraju <srikar@...ux.vnet.ibm.com>,
naveen.n.rao@...ux.vnet.ibm.com,
Yicong Yang <yangyicong@...ilicon.com>,
Barry Song <v-songbaohua@...o.com>,
Chen Yu <yu.c.chen@...el.com>, Hillf Danton <hdanton@...a.com>
Subject: Re: [Patch v2 2/6] sched/topology: Record number of cores in sched group
On Thu, Jun 08, 2023 at 03:32:28PM -0700, Tim Chen wrote:
> From: Tim C Chen <tim.c.chen@...ux.intel.com>
>
> When balancing sibling domains that have different numbers of cores,
> the number of tasks in each sibling domain should be proportional to
> the number of cores in that domain. In preparation for implementing
> such a policy, record the number of cores in a scheduling group.
>
> Signed-off-by: Tim Chen <tim.c.chen@...ux.intel.com>
> ---
> kernel/sched/sched.h | 1 +
> kernel/sched/topology.c | 10 +++++++++-
> 2 files changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index 3d0eb36350d2..5f7f36e45b87 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -1860,6 +1860,7 @@ struct sched_group {
>  	atomic_t ref;
>
>  	unsigned int group_weight;
> +	unsigned int cores;
>  	struct sched_group_capacity *sgc;
>  	int asym_prefer_cpu; /* CPU of highest priority in group */
>  	int flags;
> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> index 6d5628fcebcf..6b099dbdfb39 100644
> --- a/kernel/sched/topology.c
> +++ b/kernel/sched/topology.c
> @@ -1275,14 +1275,22 @@ build_sched_groups(struct sched_domain *sd, int cpu)
> static void init_sched_groups_capacity(int cpu, struct sched_domain *sd)
> {
>  	struct sched_group *sg = sd->groups;
> +	struct cpumask *mask = sched_domains_tmpmask2;
>
>  	WARN_ON(!sg);
>
>  	do {
> -		int cpu, max_cpu = -1;
> +		int cpu, cores = 0, max_cpu = -1;
>
>  		sg->group_weight = cpumask_weight(sched_group_span(sg));
>
> +		cpumask_copy(mask, sched_group_span(sg));
> +		for_each_cpu(cpu, mask) {
> +			cores++;
> +			cpumask_andnot(mask, mask, cpu_smt_mask(cpu));
> +		}
> +		sg->cores = cores;
> +
>  		if (!(sd->flags & SD_ASYM_PACKING))
>  			goto next;
Just a note; not sure we want to, or can, do anything about this, but
consider someone doing partitions like:
[0,1] [2,3] [3,6]
[------] [------]
That is, 3 SMT cores, and 2 partitions that split the middle SMT core in two.
Then each domain tree will see either CPU 2 or CPU 3, but not the full core.
I'm perfectly fine with saying: don't do that then.
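
To make the counting concrete, here is a minimal userspace sketch of the
core-counting loop from the quoted hunk, applied to a split-core partition
like the one above. It is an illustration only, not kernel code: smt_mask()
and count_cores() are hypothetical helpers, a 64-bit word stands in for
struct cpumask, and SMT siblings are assumed to be numbered 2k and 2k+1
(so the third core is CPUs 4 and 5 in this model).

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for cpu_smt_mask(): assumes CPUs 2k and 2k+1
 * are the two SMT siblings of core k. Real topologies may differ.
 */
static uint64_t smt_mask(int cpu)
{
	return UINT64_C(3) << (cpu & ~1);
}

/*
 * Count cores in a span the same way the quoted loop does: for each CPU
 * still set in the working mask, count one core and clear all of that
 * CPU's SMT siblings so they are not counted again.
 */
static unsigned int count_cores(uint64_t span)
{
	unsigned int cores = 0;
	int cpu;

	for (cpu = 0; cpu < 64; cpu++) {
		if (!(span & (UINT64_C(1) << cpu)))
			continue;	/* not in span, or already cleared */
		cores++;
		span &= ~smt_mask(cpu);
	}
	return cores;
}
```

With all six CPUs in one span, count_cores(0x3f) gives 3. With the split
partitions {0,1,2} and {3,4,5}, count_cores(0x07) and count_cores(0x38)
each give 2, so the two partitions together account for 4 cores where only
3 exist: the split core is counted once on each side.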