Message-ID: <ZrOjRqdita_dOjk9@yury-ThinkPad>
Date: Wed, 7 Aug 2024 09:39:34 -0700
From: Yury Norov <yury.norov@...il.com>
To: Valentin Schneider <vschneid@...hat.com>
Cc: linux-kernel@...r.kernel.org,
Christophe JAILLET <christophe.jaillet@...adoo.fr>,
Leonardo Bras <leobras@...hat.com>, Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Juri Lelli <juri.lelli@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>,
Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>
Subject: Re: [PATCH 2/2] sched/topology: optimize topology_span_sane()
On Wed, Aug 07, 2024 at 03:53:18PM +0200, Valentin Schneider wrote:
> On 06/08/24 11:00, Yury Norov wrote:
> > On Tue, Aug 06, 2024 at 05:50:23PM +0200, Valentin Schneider wrote:
> >> On 02/08/24 10:57, Yury Norov wrote:
> >> > The function may call cpumask_equal with tl->mask(cpu) == tl->mask(i),
> >> > even when cpu != i.
> >>
> >> For which architecture have you observed this? AFAIA all implementations of
> >> tl->sched_domain_mask_f are built on a per-CPU cpumask.
> >
> > x86_64, qemu emulating 16 CPUs in 4 nodes, Linux 6.10, approximately
> > defconfig.
>
> For the default_topology:
> cpu_smt_mask() # SMT
> (per_cpu(cpu_sibling_map, cpu))
>
> cpu_clustergroup_mask() # CLS
> per_cpu(cpu_l2c_shared_map, cpu);
>
> cpu_coregroup_mask() # MC
> per_cpu(cpu_llc_shared_map, cpu);
>
> cpu_cpu_mask() # PKG
> cpumask_of_node(cpu_to_node(cpu));
>
> Ok so PKG can potentially hit that condition, and so can any
> sched_domain_mask_f that relies on the node masks...
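
Right. To make the aliasing concrete: on x86 the PKG level resolves to the
shared per-node mask, so two CPUs of the same node get back the very same
object. Roughly (simplified from include/linux/topology.h, a sketch rather
than the exact source):

static inline const struct cpumask *cpu_cpu_mask(int cpu)
{
	/* every CPU of a node maps to the same per-node cpumask object */
	return cpumask_of_node(cpu_to_node(cpu));
}

So for cpu != i in the same node, tl->mask(cpu) and tl->mask(i) compare
equal as pointers, not just as contents.
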
>
> I'm thinking ideally we should have checks in place to ensure all
> node_to_cpumask_map[] masks are disjoint; then we could entirely skip the
> levels that use these masks in topology_span_sane(), but there's
> unfortunately no nice way to flag them... Also, there would be cases where
> there's no real difference between PKG and NODE other than NODE still being
> based on a per-CPU cpumask while PKG isn't, so I don't see a nicer way to go
> about this.
>
> Please add something like the following to the changelog, and with that:
> Reviewed-by: Valentin Schneider <vschneid@...hat.com>
Sure, will do.
> """
> Some topology levels (e.g. PKG in default_topology[]) have a
> sched_domain_mask_f implementation that reuses the same mask for several
> CPUs (in PKG's case, one mask for all CPUs in the same NUMA node).
>
> For such topology levels, repeating cpumask_equal() checks is wasteful -
> check that the tl->mask(i) pointers aren't the same first.
> """