[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20170918070834.13083-3-mhocko@kernel.org>
Date: Mon, 18 Sep 2017 09:08:34 +0200
From: Michal Hocko <mhocko@...nel.org>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
Reza Arbab <arbab@...ux.vnet.ibm.com>,
Yasuaki Ishimatsu <yasu.isimatu@...il.com>,
qiuxishi@...wei.com, Igor Mammedov <imammedo@...hat.com>,
Vitaly Kuznetsov <vkuznets@...hat.com>, <linux-mm@...ck.org>,
LKML <linux-kernel@...r.kernel.org>,
Michal Hocko <mhocko@...e.com>,
Vlastimil Babka <vbabka@...e.cz>
Subject: [PATCH 2/2] mm, memory_hotplug: remove timeout from __offline_memory
From: Michal Hocko <mhocko@...e.com>
We have a hardcoded 120s timeout after which the memory offline fails
basically since the hot remove has been introduced. This is essentially
a policy implemented in the kernel. Moreover there is no way to adjust
the timeout and so we are sometimes facing memory offline failures if
the system is under a heavy memory pressure or very intensive CPU
workload on large machines.
It is not very clear what purpose the timeout actually serves. The
offline operation is interruptible by a signal so if userspace wants
some timeout based termination this can be done trivially by sending a
signal.
If there is a strong usecase to do this from the kernel then we should
do it properly and have a it tunable from the userspace with the timeout
disabled by default along with the explanation who uses it and for what
purporse.
Acked-by: Vlastimil Babka <vbabka@...e.cz>
Signed-off-by: Michal Hocko <mhocko@...e.com>
---
mm/memory_hotplug.c | 10 +++-------
1 file changed, 3 insertions(+), 7 deletions(-)
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index c9dcbe6d2ac6..b8a85c11360e 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1593,9 +1593,9 @@ static void node_states_clear_node(int node, struct memory_notify *arg)
}
static int __ref __offline_pages(unsigned long start_pfn,
- unsigned long end_pfn, unsigned long timeout)
+ unsigned long end_pfn)
{
- unsigned long pfn, nr_pages, expire;
+ unsigned long pfn, nr_pages;
long offlined_pages;
int ret, node;
unsigned long flags;
@@ -1633,12 +1633,8 @@ static int __ref __offline_pages(unsigned long start_pfn,
goto failed_removal;
pfn = start_pfn;
- expire = jiffies + timeout;
repeat:
/* start memory hot removal */
- ret = -EBUSY;
- if (time_after(jiffies, expire))
- goto failed_removal;
ret = -EINTR;
if (signal_pending(current))
goto failed_removal;
@@ -1711,7 +1707,7 @@ static int __ref __offline_pages(unsigned long start_pfn,
/* Must be protected by mem_hotplug_begin() or a device_lock */
int offline_pages(unsigned long start_pfn, unsigned long nr_pages)
{
- return __offline_pages(start_pfn, start_pfn + nr_pages, 120 * HZ);
+ return __offline_pages(start_pfn, start_pfn + nr_pages);
}
#endif /* CONFIG_MEMORY_HOTREMOVE */
--
2.14.1
Powered by blists - more mailing lists