linux-kernel - Re: [PATCH v2] sched_ext: Introduce NUMA awareness to the default idle selection policy

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <Zxv5V5mwDAlGzpBb@slm.duckdns.org>
Date: Fri, 25 Oct 2024 10:02:31 -1000
From: Tejun Heo <tj@...nel.org>
To: Andrea Righi <arighi@...dia.com>
Cc: David Vernet <void@...ifault.com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] sched_ext: Introduce NUMA awareness to the default
 idle selection policy

Hello,

On Fri, Oct 25, 2024 at 06:25:35PM +0200, Andrea Righi wrote:
...
> +static DEFINE_STATIC_KEY_FALSE(scx_topology_llc);
> +static DEFINE_STATIC_KEY_FALSE(scx_topology_numa);

Maybe name them sth like scx_selcpu_topo_llc given that this is only used by
selcpu?

> +static void init_topology(void)

Ditto with naming.

>  {
> -	struct sched_domain *sd = rcu_dereference(per_cpu(sd_llc, cpu));
> -	const struct cpumask *llc_cpus = sd ? sched_domain_span(sd) : NULL;
> +	const struct cpumask *cpus;
> +	int nid;
> +	s32 cpu;
> +
> +	/*
> +	 * Detect if the system has multiple NUMA nodes distributed across the
> +	 * available CPUs and, in that case, enable NUMA-aware scheduling in
> +	 * the default CPU idle selection policy.
> +	 */
> +	for_each_node(nid) {
> +		cpus = cpumask_of_node(nid);
> +		if (cpumask_weight(cpus) < nr_cpu_ids) {

Comparing number of cpus with nr_cpu_ids doesn't work. The above condition
can trigger on single node machines with some CPUs offlines or unavailable
for example. I think num_node_state(N_CPU) should work or if you want to
keep with sched_domains, maybe highest_flag_domain(some_cpu,
SD_NUMA)->groups->weight would work?

...
> +	for_each_possible_cpu(cpu) {
> +		struct sched_domain *sd = rcu_dereference(per_cpu(sd_llc, cpu));
> +
> +		if (!sd)
> +			continue;
> +		cpus = sched_domain_span(sd);
> +		if (cpumask_weight(cpus) < nr_cpu_ids) {

Ditto. 

...
> +	/*
> +	 * Determine the scheduling domain only if the task is allowed to run
> +	 * on all CPUs.
> +	 *
> +	 * This is done primarily for efficiency, as it avoids the overhead of
> +	 * updating a cpumask every time we need to select an idle CPU (which
> +	 * can be costly in large SMP systems), but it also aligns logically:
> +	 * if a task's scheduling domain is restricted by user-space (through
> +	 * CPU affinity), the task will simply use the flat scheduling domain
> +	 * defined by user-space.
> +	 */
> +	if (p->nr_cpus_allowed == nr_cpu_ids) {

Should compare against nr_possible_cpus.

Thanks.

-- 
tejun