Message-ID: <Zxs3x5EUMQCQkpJX@gpd3>
Date: Fri, 25 Oct 2024 08:16:39 +0200
From: Andrea Righi <arighi@...dia.com>
To: Tejun Heo <tj@...nel.org>
Cc: David Vernet <void@...ifault.com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] sched_ext: Introduce NUMA awareness to the default idle selection policy
On Thu, Oct 24, 2024 at 09:15:58AM -1000, Tejun Heo wrote:
...
> > @@ -3156,7 +3210,8 @@ static inline const struct cpumask *llc_domain(struct task_struct *p, s32 cpu)
> > static s32 scx_select_cpu_dfl(struct task_struct *p, s32 prev_cpu,
> > u64 wake_flags, bool *found)
> > {
> > - const struct cpumask *llc_cpus = llc_domain(p, prev_cpu);
> > + const struct cpumask *llc_cpus = scx_domain(p, prev_cpu, SCX_DOM_LLC);
> > + const struct cpumask *numa_cpus = scx_domain(p, prev_cpu, SCX_DOM_NUMA);
>
> This feels like a lot of code which can just be:
>
> const struct cpumask *llc_cpus = NULL, *numa_cpus = NULL;
>
> #ifdef CONFIG_SCHED_MC
> llc_cpus = rcu_dereference(per_cpu(sd_llc, cpu));
> numa_cpus = rcu_dereference(per_cpu(sd_numa, cpu));
> #endif
>
Yeah, I can definitely simplify this part and get rid of some
boilerplate.
> > s32 cpu;
> >
> > *found = false;
> > @@ -3226,6 +3281,15 @@ static s32 scx_select_cpu_dfl(struct task_struct *p, s32 prev_cpu,
> > goto cpu_found;
> > }
> >
> > + /*
> > + * Search for any fully idle core in the same NUMA node.
> > + */
> > + if (numa_cpus) {
> > + cpu = scx_pick_idle_cpu(numa_cpus, SCX_PICK_IDLE_CORE);
> > + if (cpu >= 0)
> > + goto cpu_found;
> > + }
>
> I'm not convinced about the argument that always doing extra pick is
> beneficial. Sure, the overhead is minimal but isn't it also trivial to avoid
> by just testing llc_cpus == numa_cpus (they resolve to the same cpumasks on
> non-NUMA machines, right)? Taking a step further, the topology information
> is really static and can be determined during boot. Wouldn't it make more
> sense to just skip the unnecessary steps depending on topology? I'm not sure
> the difference would be measurable but if you care we can make them
> static_keys too.
Right, on non-NUMA machines llc_cpus and numa_cpus both resolve to the
same CPUs. Also, on systems with a single shared LLC, llc_cpus and
numa_cpus resolve to all CPUs, so in this case we can skip both steps.
Using static_keys is probably the best option: that way we're sure we
won't add any overhead on non-NUMA / single-LLC systems compared to the
previous scx_select_cpu_dfl() implementation.
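To make the skip conditions concrete, here is a rough userspace sketch, not
the actual patch. Everything in it is hypothetical: toy_cpumask_t stands in
for struct cpumask, the two booleans stand in for the static keys, and only
the masks of a single CPU are considered. The "NUMA node spans all CPUs"
check is my reading of the non-NUMA case and is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for struct cpumask: one bit per CPU, at most 64 CPUs. */
typedef uint64_t toy_cpumask_t;

/* Hypothetical flags playing the role the static keys would play. */
struct toy_topo_flags {
    bool llc_enabled;   /* the LLC search step is worth running */
    bool numa_enabled;  /* the NUMA search step is worth running */
};

/* Build a mask with the lowest nr_cpus bits set. */
static toy_cpumask_t toy_all_cpus(unsigned int nr_cpus)
{
    assert(nr_cpus >= 1 && nr_cpus <= 64);
    return nr_cpus == 64 ? UINT64_MAX : ((UINT64_C(1) << nr_cpus) - 1);
}

/*
 * Decide once, from static topology, which steps can be skipped:
 *  - LLC step: redundant when the LLC spans all CPUs.
 *  - NUMA step: redundant when the node matches the LLC mask, or
 *    (assumption) when the node spans all CPUs, i.e. non-NUMA.
 */
static struct toy_topo_flags toy_topo_init(toy_cpumask_t llc_cpus,
                                           toy_cpumask_t numa_cpus,
                                           unsigned int nr_cpus)
{
    toy_cpumask_t all = toy_all_cpus(nr_cpus);
    struct toy_topo_flags flags;

    flags.llc_enabled = llc_cpus != all;
    flags.numa_enabled = numa_cpus != all && numa_cpus != llc_cpus;
    return flags;
}
```

With a single shared LLC both flags end up false, so scx_select_cpu_dfl()
would behave as before on those systems.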
I'll do some tests and send a v2.
Thanks for looking at this!
-Andrea