Message-ID: <20251025161113.14313-1-atomlin@atomlin.com>
Date: Sat, 25 Oct 2025 12:11:13 -0400
From: Aaron Tomlin <atomlin@...mlin.com>
To: mingo@...hat.com,
peterz@...radead.org,
juri.lelli@...hat.com,
vincent.guittot@...aro.org,
dietmar.eggemann@....com,
rostedt@...dmis.org,
bsegall@...gle.com,
mgorman@...e.de,
vschneid@...hat.com
Cc: atomlin@...mlin.com,
pauld@...hat.com,
linux-kernel@...r.kernel.org
Subject: [PATCH] sched/isolation: Enforce at least one housekeeping CPU per node unless maxcpus limits
Improve housekeeping CPU selection by ensuring that each online NUMA
node has at least one housekeeping CPU, which improves NUMA locality
for kernel threads and timed work.

Before assigning additional housekeeping CPUs, check whether any online
NUMA node contains a CPU whose logical ID is greater than or equal to
the maxcpus= limit. If so, skip per-node enforcement and issue a
warning, since some nodes would be unserviceable given the CPU limit.

If enforcement is possible, each online node that lacks a housekeeping
CPU has its lowest-numbered present CPU added to the housekeeping
staging mask and cleared from the non-housekeeping mask, and a warning
is logged for visibility. The existing guarantee that at least one
present housekeeping CPU is assigned system-wide remains intact.
Signed-off-by: Aaron Tomlin <atomlin@...mlin.com>
---
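For illustration only, the two passes described in the changelog can be
modelled in userspace C. This is a minimal sketch under stated
assumptions, not the kernel code: mask_t, mask_first(), mask_last(),
numa_enforcement_blocked() and enforce_node_coverage() are hypothetical
names, a 64-bit integer stands in for struct cpumask, and the online
nodes are passed as a plain array of per-node CPU masks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct cpumask: bit N set means CPU N. */
typedef uint64_t mask_t;

/* Lowest set CPU in the mask, or -1 if the mask is empty. */
static int mask_first(mask_t m)
{
	for (int cpu = 0; cpu < 64; cpu++)
		if (m & ((mask_t)1 << cpu))
			return cpu;
	return -1;
}

/* Highest set CPU in the mask, or -1 if the mask is empty. */
static int mask_last(mask_t m)
{
	int cpu = -1;

	while (m) {
		cpu++;
		m >>= 1;
	}
	return cpu;
}

/* Pass 1: true if any non-empty node has a CPU ID >= max_cpus. */
static bool numa_enforcement_blocked(const mask_t *node_cpus, int nr_nodes,
				     unsigned int max_cpus)
{
	for (int node = 0; node < nr_nodes; node++) {
		int last = mask_last(node_cpus[node]);

		if (last >= 0 && (unsigned int)last >= max_cpus)
			return true;
	}
	return false;
}

/*
 * Pass 2: for every node with no housekeeping CPU, move its
 * lowest-numbered present CPU from the isolated mask to the
 * housekeeping mask. Returns the number of CPUs moved, or 0 if
 * pass 1 says enforcement must be skipped.
 */
static int enforce_node_coverage(const mask_t *node_cpus, int nr_nodes,
				 mask_t present, unsigned int max_cpus,
				 mask_t *housekeeping, mask_t *isolated)
{
	int added = 0;

	assert(housekeeping && isolated);

	if (numa_enforcement_blocked(node_cpus, nr_nodes, max_cpus))
		return 0;

	for (int node = 0; node < nr_nodes; node++) {
		int cpu;

		/* Node already has a housekeeping CPU. */
		if (node_cpus[node] & *housekeeping)
			continue;

		cpu = mask_first(node_cpus[node] & present);
		if (cpu < 0)
			continue;	/* no present CPU on this node */

		*housekeeping |= (mask_t)1 << cpu;
		*isolated &= ~((mask_t)1 << cpu);
		added++;
	}
	return added;
}
```

In this model, with node 0 holding CPUs 0-3, node 1 holding CPUs 4-7,
all eight CPUs present, CPUs 0-1 as housekeeping and a limit of 8, the
second pass adds CPU 4. With a limit of 4, the first pass reports that
node 1 has a CPU at or above the limit and both masks are left
unchanged.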
kernel/sched/isolation.c | 40 ++++++++++++++++++++++++++++++++++++++--
1 file changed, 38 insertions(+), 2 deletions(-)
diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
index a4cf17b1fab0..87b7f20d76b1 100644
--- a/kernel/sched/isolation.c
+++ b/kernel/sched/isolation.c
@@ -114,8 +114,10 @@ static void __init housekeeping_setup_type(enum hk_type type,
 static int __init housekeeping_setup(char *str, unsigned long flags)
 {
 	cpumask_var_t non_housekeeping_mask, housekeeping_staging;
-	unsigned int first_cpu;
-	int err = 0;
+	const struct cpumask *node_cpus;
+	unsigned int first_cpu, last_cpu;
+	int node, node_cpu, err = 0;
+	bool skip_numa_enforcement = false;
 
 	if ((flags & HK_FLAG_KERNEL_NOISE) && !(housekeeping.flags & HK_FLAG_KERNEL_NOISE)) {
 		if (!IS_ENABLED(CONFIG_NO_HZ_FULL)) {
@@ -135,6 +137,40 @@ static int __init housekeeping_setup(char *str, unsigned long flags)
 	cpumask_andnot(housekeeping_staging,
 		       cpu_possible_mask, non_housekeeping_mask);
 
+	for_each_online_node(node) {
+		node_cpus = cpumask_of_node(node);
+
+		if (cpumask_empty(node_cpus))
+			continue;
+
+		last_cpu = cpumask_last(node_cpus);
+		if (last_cpu >= setup_max_cpus) {
+			skip_numa_enforcement = true;
+			pr_warn("Housekeeping: NUMA node %d has CPU %d >= "
+				"max_cpus=%d. Skipping NUMA enforcement\n",
+				node, last_cpu, setup_max_cpus);
+			break;
+		}
+	}
+
+	if (!skip_numa_enforcement) {
+		for_each_online_node(node) {
+			node_cpus = cpumask_of_node(node);
+
+			if (cpumask_intersects(node_cpus, housekeeping_staging))
+				continue;
+
+			for_each_cpu_and(node_cpu, node_cpus, cpu_present_mask) {
+				pr_warn("Housekeeping: Adding CPU %d "
+					"from node %d to ensure NUMA "
+					"coverage\n", node_cpu, node);
+				__cpumask_set_cpu(node_cpu, housekeeping_staging);
+				__cpumask_clear_cpu(node_cpu, non_housekeeping_mask);
+				break;
+			}
+		}
+	}
+
 	first_cpu = cpumask_first_and(cpu_present_mask, housekeeping_staging);
 	if (first_cpu >= nr_cpu_ids || first_cpu >= setup_max_cpus) {
 		__cpumask_set_cpu(smp_processor_id(), housekeeping_staging);
--
2.51.0