Message-ID: <6afdded7-ae70-2412-4f15-f7951164049a@linux.vnet.ibm.com>
Date: Sat, 26 Feb 2022 02:05:45 +0530
From: Abhishek Goel <huntbag@...ux.vnet.ibm.com>
To: "Huang, Ying" <ying.huang@...el.com>
Cc: Dave Hansen <dave.hansen@...el.com>,
Dave Hansen <dave.hansen@...ux.intel.com>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, Zi Yan <ziy@...dia.com>,
David Hildenbrand <david@...hat.com>,
Yang Shi <yang.shi@...ux.alibaba.com>,
Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH -V11 2/9] mm/migrate: update node demotion order on
hotplug events
Hi Huang,
On 25/02/22 08:02, Huang, Ying wrote:
>
> We have run into a memory hotplug regression before. Let's check
> whether the problem is similar. Can you try the below debug patch?
>
> Best Regards,
> Huang, Ying
>
> ----------------------------8<------------------------------------------
> From 500c0b53436b7a697ed5d77241abbc0d5d3cfc07 Mon Sep 17 00:00:00 2001
> From: Huang Ying <ying.huang@...el.com>
> Date: Wed, 29 Sep 2021 10:57:19 +0800
> Subject: [PATCH] mm/migrate: Debug CPU hotplug regression
>
> Signed-off-by: "Huang, Ying" <ying.huang@...el.com>
> ---
> mm/migrate.c | 20 +++++++++++---------
> 1 file changed, 11 insertions(+), 9 deletions(-)
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index c7da064b4781..c4805f15e616 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -3261,15 +3261,17 @@ static int __meminit migrate_on_reclaim_callback(struct notifier_block *self,
> * The ordering is also currently dependent on which nodes have
> * CPUs. That means we need CPU on/offline notification too.
> */
> -static int migration_online_cpu(unsigned int cpu)
> +static int migration_cpu_hotplug(unsigned int cpu)
> {
> - set_migration_target_nodes();
> - return 0;
> -}
> + static int nr_cpu_node_saved;
> + int nr_cpu_node;
> +
> + nr_cpu_node = num_node_state(N_CPU);
> + if (nr_cpu_node != nr_cpu_node_saved) {
> + set_migration_target_nodes();
> + nr_cpu_node_saved = nr_cpu_node;
> + }
>
> -static int migration_offline_cpu(unsigned int cpu)
> -{
> - set_migration_target_nodes();
> return 0;
> }
>
> @@ -3283,7 +3285,7 @@ static int __init migrate_on_reclaim_init(void)
> WARN_ON(!node_demotion);
>
> ret = cpuhp_setup_state_nocalls(CPUHP_MM_DEMOTION_DEAD, "mm/demotion:offline",
> - NULL, migration_offline_cpu);
> + NULL, migration_cpu_hotplug);
> /*
> * In the unlikely case that this fails, the automatic
> * migration targets may become suboptimal for nodes
> @@ -3292,7 +3294,7 @@ static int __init migrate_on_reclaim_init(void)
> */
> WARN_ON(ret < 0);
> ret = cpuhp_setup_state(CPUHP_AP_MM_DEMOTION_ONLINE, "mm/demotion:online",
> - migration_online_cpu, NULL);
> + migration_cpu_hotplug, NULL);
> WARN_ON(ret < 0);
>
> hotplug_memory_notifier(migrate_on_reclaim_callback, 100);
This works. I applied it on a 5.15 kernel and no longer see any regression
compared to 5.14.
So, have you posted this patch yet? Or do you have plans to get this or a
similar patch included?