[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Zxv5V5mwDAlGzpBb@slm.duckdns.org>
Date: Fri, 25 Oct 2024 10:02:31 -1000
From: Tejun Heo <tj@...nel.org>
To: Andrea Righi <arighi@...dia.com>
Cc: David Vernet <void@...ifault.com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] sched_ext: Introduce NUMA awareness to the default
idle selection policy
Hello,
On Fri, Oct 25, 2024 at 06:25:35PM +0200, Andrea Righi wrote:
...
> +static DEFINE_STATIC_KEY_FALSE(scx_topology_llc);
> +static DEFINE_STATIC_KEY_FALSE(scx_topology_numa);
Maybe name them sth like scx_selcpu_topo_llc given that this is only used by
selcpu?
> +static void init_topology(void)
Ditto with naming.
> {
> - struct sched_domain *sd = rcu_dereference(per_cpu(sd_llc, cpu));
> - const struct cpumask *llc_cpus = sd ? sched_domain_span(sd) : NULL;
> + const struct cpumask *cpus;
> + int nid;
> + s32 cpu;
> +
> + /*
> + * Detect if the system has multiple NUMA nodes distributed across the
> + * available CPUs and, in that case, enable NUMA-aware scheduling in
> + * the default CPU idle selection policy.
> + */
> + for_each_node(nid) {
> + cpus = cpumask_of_node(nid);
> + if (cpumask_weight(cpus) < nr_cpu_ids) {
Comparing number of cpus with nr_cpu_ids doesn't work. The above condition
can trigger on single node machines with some CPUs offlines or unavailable
for example. I think num_node_state(N_CPU) should work or if you want to
keep with sched_domains, maybe highest_flag_domain(some_cpu,
SD_NUMA)->groups->weight would work?
...
> + for_each_possible_cpu(cpu) {
> + struct sched_domain *sd = rcu_dereference(per_cpu(sd_llc, cpu));
> +
> + if (!sd)
> + continue;
> + cpus = sched_domain_span(sd);
> + if (cpumask_weight(cpus) < nr_cpu_ids) {
Ditto.
...
> + /*
> + * Determine the scheduling domain only if the task is allowed to run
> + * on all CPUs.
> + *
> + * This is done primarily for efficiency, as it avoids the overhead of
> + * updating a cpumask every time we need to select an idle CPU (which
> + * can be costly in large SMP systems), but it also aligns logically:
> + * if a task's scheduling domain is restricted by user-space (through
> + * CPU affinity), the task will simply use the flat scheduling domain
> + * defined by user-space.
> + */
> + if (p->nr_cpus_allowed == nr_cpu_ids) {
Should compare against nr_possible_cpus.
Thanks.
--
tejun
Powered by blists - more mailing lists