linux-kernel - Re: [PATCH v3] sched_ext: Introduce NUMA awareness to the default idle selection policy

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <Zx6LyQ-kIe0fXSrL@gpd3>
Date: Sun, 27 Oct 2024 19:51:53 +0100
From: Andrea Righi <arighi@...dia.com>
To: Tejun Heo <tj@...nel.org>
Cc: David Vernet <void@...ifault.com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3] sched_ext: Introduce NUMA awareness to the default
 idle selection policy

On Sun, Oct 27, 2024 at 08:03:53AM -1000, Tejun Heo wrote:
> Hello,
> 
> On Sun, Oct 27, 2024 at 06:49:53PM +0100, Andrea Righi wrote:
> ...
> > +static void update_selcpu_topology(void)
> >  {
> > +     bool enable_llc = false, enable_numa = false;
> > +     s32 cpu;
> >
> > +     rcu_read_lock();
> > +     for_each_possible_cpu(cpu) {
> > +             struct sched_domain *sd;
> > +             const struct cpumask *cpus;
> > +
> > +             sd = rcu_dereference(per_cpu(sd_llc, cpu));
> > +             if (sd) {
> > +                     cpus = sched_domain_span(sd);
> > +                     pr_debug("sched_ext: LLC cpu%d: %*pbl\n",
> > +                              cpu, cpumask_pr_args(cpus));
> > +                     if (cpumask_weight(cpus) < num_possible_cpus())
> > +                             enable_llc = true;
> > +             }
> > +
> > +             sd = highest_flag_domain(cpu, SD_NUMA);
> > +             if (sd) {
> > +                     cpus = sched_group_span(sd->groups);
> > +                     pr_debug("sched_ext: NUMA cpu%d: %*pbl\n",
> > +                              cpu, cpumask_pr_args(cpus));
> > +                     if (cpumask_weight(cpus) < num_possible_cpus())
> > +                             enable_numa = true;
> 
> This isn't a big problem but the loop looks a bit odd in that once both are
> set, it's obvious that there's no reason to continue. Doesn't this need to
> check only one CPU in fact? e.g. for the first possible CPU, if sd doesn't
> exist or cpumask_weight(sd) == num_possible_cpu(), don't we know for sure
> that llc or numa doesn't need to be enabled and vice-versa?

I was also wondering about this, but I wasn't sure if you can create
non-standard NUMA configurations (like assigning a CPU to multiple
heterogeneous NUMA node), for example with virtualization or NUMA
emulation.

I just checked with qemu and it seems that it doesn't allow to assign a
CPU to multiple NUMA nodes, so maybe it's a valid assumptions that we
won't have unusual overlapping NUMA configurations.

With this assumption, yes, we can just check the first CPU, and it's
probably reasonable to do so anyway.

> 
> >  static s32 scx_select_cpu_dfl(struct task_struct *p, s32 prev_cpu,
> >                             u64 wake_flags, bool *found)
> >  {
> > +     const struct cpumask *llc_cpus = NULL;
> > +     const struct cpumask *numa_cpus = NULL;
> >       s32 cpu;
> >
> >       *found = false;
> ...
> > +     if (p->nr_cpus_allowed >= num_possible_cpus()) {
> > +             if (static_branch_unlikely(&scx_selcpu_topo_numa))
> > +                     numa_cpus = cpumask_of_node(cpu_to_node(prev_cpu));
> > +
> > +             if (static_branch_unlikely(&scx_selcpu_topo_llc)) {
> 
> static_branch_maybe() is probably the better one for llc.

Ah, definitely!

Thanks! Will apply these changes and send a new patch.

-Andrea