Message-ID: <87czkctiz9.fsf@yhuang6-desk2.ccr.corp.intel.com>
Date: Fri, 28 Jan 2022 15:30:50 +0800
From: "Huang, Ying" <ying.huang@...el.com>
To: Srikar Dronamraju <srikar@...ux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@...radead.org>,
Mel Gorman <mgorman@...e.de>, linux-kernel@...r.kernel.org,
Ingo Molnar <mingo@...hat.com>, Rik van Riel <riel@...riel.com>
Subject: Re: [RFC PATCH 1/2] NUMA balancing: fix NUMA topology type for
memory tiering system
Srikar Dronamraju <srikar@...ux.vnet.ibm.com> writes:
> * Huang Ying <ying.huang@...el.com> [2022-01-28 10:38:41]:
>
>>
>> One possible fix is to ignore CPU-less nodes when detecting NUMA
>> topology type in init_numa_topology_type(). That works well for the
>> example system. Is it good in general for any system with CPU-less
>> nodes?
>>
>
> A CPU-less node at online time doesn't necessarily stay CPU-less for the
> entire boot. For example, on PowerVM LPARs (i.e. powerpc systems), some
> nodes may start out CPU-less and CPUs may later be populated/hotplugged
> onto them.
Got it!
> Hence I am not sure whether adding a check for CPU-less nodes at node
> online time would work for such systems.
How about something like the below?
Best Regards,
Huang, Ying
-----------------------8<-----------------------------
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index d201a7052a29..733e8bd930b4 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -1737,7 +1737,13 @@ static void init_numa_topology_type(void)
         }
 
         for_each_online_node(a) {
+                if (!node_state(a, N_CPU))
+                        continue;
+
                 for_each_online_node(b) {
+                        if (!node_state(b, N_CPU))
+                                continue;
+
                         /* Find two nodes furthest removed from each other. */
                         if (node_distance(a, b) < n)
                                 continue;
@@ -1849,6 +1855,13 @@ void sched_init_numa(void)
                         sched_domains_numa_masks[i][j] = mask;
 
+                        /*
+                         * The mask will be initialized when the first CPU of
+                         * the node is onlined.
+                         */
+                        if (!node_state(j, N_CPU))
+                                continue;
+
                         for_each_node(k) {
                                 /*
                                  * Distance information can be unreliable for
@@ -1919,8 +1932,10 @@ void sched_init_numa(void)
                 return;
 
         bitmap_zero(sched_numa_onlined_nodes, nr_node_ids);
-        for_each_online_node(i)
-                bitmap_set(sched_numa_onlined_nodes, i, 1);
+        for_each_online_node(i) {
+                if (node_state(i, N_CPU))
+                        bitmap_set(sched_numa_onlined_nodes, i, 1);
+        }
 }
 
 static void __sched_domains_numa_masks_set(unsigned int node)
@@ -1928,7 +1943,7 @@ static void __sched_domains_numa_masks_set(unsigned int node)
         int i, j;
 
         /*
-         * NUMA masks are not built for offline nodes in sched_init_numa().
+         * NUMA masks are not built for offline/CPU-less nodes in sched_init_numa().
          * Thus, when a CPU of a never-onlined-before node gets plugged in,
          * adding that new CPU to the right NUMA masks is not sufficient: the
          * masks of that CPU's node must also be updated.
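
To make the intent concrete without the kernel tree at hand, here is a rough
standalone sketch of the check the first hunk adds.  It is ordinary userspace
C with a made-up three-node distance table and CPU mask (not kernel code),
and it only models the furthest-node-pair search, not the full topology
classification: skipping CPU-less nodes keeps the memory-only node from
dominating the maximum distance that init_numa_topology_type() would base
its decision on.

/*
 * Illustrative userspace simulation only; node count, distances and
 * has_cpu[] are invented example data, not read from a real system.
 */
#include <stdbool.h>
#include <stdio.h>

#define NR_NODES 3

/* Example: node 2 is a CPU-less, memory-only (e.g. PMEM) node. */
static const bool has_cpu[NR_NODES] = { true, true, false };

/* Example SLIT-style distance table; 10 is the local distance. */
static const int dist[NR_NODES][NR_NODES] = {
        { 10, 21, 17 },
        { 21, 10, 28 },
        { 17, 28, 10 },
};

int main(void)
{
        int a, b, max = 0;

        for (a = 0; a < NR_NODES; a++) {
                if (!has_cpu[a])        /* skip CPU-less nodes, as in the patch */
                        continue;
                for (b = 0; b < NR_NODES; b++) {
                        if (!has_cpu[b])
                                continue;
                        if (dist[a][b] > max)
                                max = dist[a][b];
                }
        }

        /*
         * With the CPU-less node 2 ignored, the furthest pair among
         * CPU-bearing nodes gives 21, instead of the 28 that involves
         * the memory-only node.
         */
        printf("max distance between CPU-bearing nodes: %d\n", max);
        return 0;
}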