Message-ID: <87pmnb3ccr.fsf@yhuang6-desk2.ccr.corp.intel.com>
Date: Fri, 25 Feb 2022 10:32:20 +0800
From: "Huang, Ying" <ying.huang@...el.com>
To: Abhishek Goel <huntbag@...ux.vnet.ibm.com>
Cc: Dave Hansen <dave.hansen@...el.com>,
Dave Hansen <dave.hansen@...ux.intel.com>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, Zi Yan <ziy@...dia.com>,
David Hildenbrand <david@...hat.com>,
Yang Shi <yang.shi@...ux.alibaba.com>,
Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH -V11 2/9] mm/migrate: update node demotion order on
 hotplug events

Hi, Abhishek,

Abhishek Goel <huntbag@...ux.vnet.ibm.com> writes:

> On 24/02/22 05:35, Dave Hansen wrote:
>> On 2/23/22 15:02, Abhishek Goel wrote:
>>> If needed, I will provide experiment results and traces that were used
>>> to conclude this.
>> It would be great if you can provide some more info. Even just a CPU
>> time profile would be helpful.
>
> Average total time taken for SMT=8 to SMT=1 in v5.14 : 20s
>
> Average total time taken for SMT=8 to SMT=1 in v5.15 : 36s
>
> (Observed on a system with 150+ CPUs)

We have run into a memory hotplug regression before.  Let's check
whether the problem is similar.  Could you try the debug patch below?
It replaces the unconditional rebuild of the demotion order on every
CPU online/offline event with a rebuild that runs only when the number
of nodes with CPUs (N_CPU) actually changes.

Best Regards,
Huang, Ying

----------------------------8<------------------------------------------

From 500c0b53436b7a697ed5d77241abbc0d5d3cfc07 Mon Sep 17 00:00:00 2001
From: Huang Ying <ying.huang@...el.com>
Date: Wed, 29 Sep 2021 10:57:19 +0800
Subject: [PATCH] mm/migrate: Debug CPU hotplug regression

Signed-off-by: "Huang, Ying" <ying.huang@...el.com>
---
 mm/migrate.c | 20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index c7da064b4781..c4805f15e616 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -3261,15 +3261,17 @@ static int __meminit migrate_on_reclaim_callback(struct notifier_block *self,
  * The ordering is also currently dependent on which nodes have
  * CPUs. That means we need CPU on/offline notification too.
  */
-static int migration_online_cpu(unsigned int cpu)
+static int migration_cpu_hotplug(unsigned int cpu)
 {
-	set_migration_target_nodes();
-	return 0;
-}
+	static int nr_cpu_node_saved;
+	int nr_cpu_node;
+
+	nr_cpu_node = num_node_state(N_CPU);
+	if (nr_cpu_node != nr_cpu_node_saved) {
+		set_migration_target_nodes();
+		nr_cpu_node_saved = nr_cpu_node;
+	}
 
-static int migration_offline_cpu(unsigned int cpu)
-{
-	set_migration_target_nodes();
 	return 0;
 }
 
@@ -3283,7 +3285,7 @@ static int __init migrate_on_reclaim_init(void)
 	WARN_ON(!node_demotion);
 
 	ret = cpuhp_setup_state_nocalls(CPUHP_MM_DEMOTION_DEAD, "mm/demotion:offline",
-					NULL, migration_offline_cpu);
+					NULL, migration_cpu_hotplug);
 	/*
 	 * In the unlikely case that this fails, the automatic
 	 * migration targets may become suboptimal for nodes
@@ -3292,7 +3294,7 @@ static int __init migrate_on_reclaim_init(void)
 	 */
 	WARN_ON(ret < 0);
 	ret = cpuhp_setup_state(CPUHP_AP_MM_DEMOTION_ONLINE, "mm/demotion:online",
-				migration_online_cpu, NULL);
+				migration_cpu_hotplug, NULL);
 	WARN_ON(ret < 0);
 
 	hotplug_memory_notifier(migrate_on_reclaim_callback, 100);
--
2.30.2
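
For reference, the logic of migration_cpu_hotplug() above can be
exercised standalone.  Below is a minimal userspace sketch (not kernel
code) of the rebuild-only-on-change pattern the patch uses; the names
fake_nr_cpu_nodes and rebuild_demotion_order() are made-up stand-ins
for num_node_state(N_CPU) and set_migration_target_nodes().

#include <stdio.h>

/* Stand-in for num_node_state(N_CPU): number of NUMA nodes with CPUs. */
static int fake_nr_cpu_nodes = 1;

/* Stand-in for set_migration_target_nodes(): the expensive rebuild. */
static void rebuild_demotion_order(void)
{
	printf("rebuilding demotion order (expensive)\n");
}

/* Mirrors migration_cpu_hotplug() in the patch: rebuild only when the
 * number of nodes with CPUs has changed since the last rebuild. */
static int migration_cpu_hotplug(unsigned int cpu)
{
	static int nr_cpu_node_saved;	/* persists across invocations */
	int nr_cpu_node = fake_nr_cpu_nodes;

	(void)cpu;			/* ignored, as in the kernel callback */
	if (nr_cpu_node != nr_cpu_node_saved) {
		rebuild_demotion_order();
		nr_cpu_node_saved = nr_cpu_node;
	}
	return 0;
}

int main(void)
{
	unsigned int cpu;

	/* Simulate offlining 7 SMT siblings while their node keeps at
	 * least one CPU online: the first callback rebuilds once (the
	 * saved count starts at 0), the remaining six are no-ops. */
	for (cpu = 1; cpu < 8; cpu++)
		migration_cpu_hotplug(cpu);
	return 0;
}

With this pattern, an SMT mode switch that offlines many sibling CPUs
on nodes that keep at least one CPU online no longer triggers one
rebuild per CPU, which is the behavior the debug patch is meant to
confirm or rule out as the cause of the slowdown.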