Message-ID: <20250825072538.GP3245006@noisy.programming.kicks-ass.net>
Date: Mon, 25 Aug 2025 09:25:38 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Tim Chen <tim.c.chen@...ux.intel.com>
Cc: Ingo Molnar <mingo@...hat.com>,
Vinicius Costa Gomes <vinicius.gomes@...el.com>,
Juri Lelli <juri.lelli@...hat.com>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
Valentin Schneider <vschneid@...hat.com>,
Tim Chen <tim.c.chen@...el.com>,
Vincent Guittot <vincent.guittot@...aro.org>,
Libo Chen <libo.chen@...cle.com>,
Abel Wu <wuyun.abel@...edance.com>, Len Brown <len.brown@...el.com>,
linux-kernel@...r.kernel.org, Chen Yu <yu.c.chen@...el.com>,
K Prateek Nayak <kprateek.nayak@....com>,
"Gautham R . Shenoy" <gautham.shenoy@....com>,
Zhao Liu <zhao1.liu@...el.com>
Subject: Re: [PATCH 1/2] sched: topology: Fix topology validation error
On Fri, Aug 22, 2025 at 01:14:14PM -0700, Tim Chen wrote:
> From: Vinicius Costa Gomes <vinicius.gomes@...el.com>
>
> As sd_numa_mask() (the function behind tl->mask() for the NUMA levels
> of the topology) depends on the value of sched_domains_curr_level,
> it's possible to be iterating over a level while sd_numa_mask()
> thinks we are in another, causing the topology validation to fail (for
> valid cases).
>
> Set sched_domains_curr_level to the current topology level while
> iterating.
>
> Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@...el.com>
> Signed-off-by: Tim Chen <tim.c.chen@...ux.intel.com>
> ---
> kernel/sched/topology.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> index 977e133bb8a4..9a7ac67e3d63 100644
> --- a/kernel/sched/topology.c
> +++ b/kernel/sched/topology.c
> @@ -2394,6 +2394,14 @@ static bool topology_span_sane(const struct cpumask *cpu_map)
> for_each_sd_topology(tl) {
> int tl_common_flags = 0;
>
> +#ifdef CONFIG_NUMA
> + /*
> + * sd_numa_mask() (one of the possible values of
> + * tl->mask()) depends on the current level to work
> + * correctly.
> + */

This is propagating that ugly hack from sd_init(), isn't it? Except it's
pretending like it's sane code... And for what?

> + sched_domains_curr_level = tl->numa_level;
> +#endif
> if (tl->sd_flags)
> tl_common_flags = (*tl->sd_flags)();
>

		if (tl_common_flags & SD_NUMA)
			continue;

So how does this make any difference?

We should never get to calling tl->mask() for NUMA.
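
To illustrate the point, here is a toy user-space model of the loop. It is
only a sketch, not kernel code: struct toy_level, toy_curr_level,
toy_smt_mask(), toy_numa_mask(), toy_walk() and the SD_NUMA value used here
are all hypothetical stand-ins. It assumes, as in the quoted context, that
levels with SD_NUMA set are skipped before any mask callback is invoked.

```c
#include <assert.h>
#include <stddef.h>

#define SD_NUMA 0x1 /* stand-in flag value, not the kernel's definition */

/* Hypothetical, simplified topology level; not the kernel's structure. */
struct toy_level {
	int flags;                      /* would come from tl->sd_flags() */
	int numa_level;                 /* only meaningful for NUMA levels */
	unsigned long (*mask)(int cpu); /* stands in for tl->mask() */
};

static int toy_curr_level;  /* stands in for sched_domains_curr_level */
static int numa_mask_calls; /* counts calls to the level-dependent mask */

/* Mask that depends only on the CPU: pairs of sibling CPUs. */
static unsigned long toy_smt_mask(int cpu)
{
	return 0x3UL << (cpu & ~1);
}

/* Mask whose result depends on hidden global state, like the NUMA one. */
static unsigned long toy_numa_mask(int cpu)
{
	(void)cpu;
	numa_mask_calls++;
	return toy_curr_level == 0 ? 0x0fUL : 0xffUL;
}

static const struct toy_level toy_levels[] = {
	{ 0,       0, toy_smt_mask  },
	{ SD_NUMA, 0, toy_numa_mask },
	{ SD_NUMA, 1, toy_numa_mask },
};

/*
 * Walk the levels the way the quoted loop does: NUMA levels hit the
 * 'continue' before mask() is ever called. Returns the number of
 * levels whose mask was actually evaluated.
 */
static int toy_walk(int nr_cpus)
{
	int checked = 0;

	for (size_t i = 0; i < sizeof(toy_levels) / sizeof(toy_levels[0]); i++) {
		const struct toy_level *tl = &toy_levels[i];

		if (tl->flags & SD_NUMA)
			continue;

		for (int cpu = 0; cpu < nr_cpus; cpu++)
			(void)tl->mask(cpu);
		checked++;
	}
	return checked;
}
```

In this model, calling toy_walk(4) evaluates only the non-NUMA level and
leaves numa_mask_calls at zero, so the value of toy_curr_level cannot
influence the result. Whether the real loop behaves the same way depends on
code not shown in this thread.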