[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <1486322541-8206-211-git-send-email-w@1wt.eu>
Date: Sun, 5 Feb 2017 20:22:12 +0100
From: Willy Tarreau <w@....eu>
To: linux-kernel@...r.kernel.org, stable@...r.kernel.org,
linux@...ck-us.net
Cc: zhong jiang <zhongjiang@...wei.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Willy Tarreau <w@....eu>
Subject: [PATCH 3.10 310/319] mm,ksm: fix endless looping in allocating memory when ksm enable
From: zhong jiang <zhongjiang@...wei.com>
commit 5b398e416e880159fe55eefd93c6588fa072cd66 upstream.
I hit the following hung task when runing a OOM LTP test case with 4.1
kernel.
Call trace:
[<ffffffc000086a88>] __switch_to+0x74/0x8c
[<ffffffc000a1bae0>] __schedule+0x23c/0x7bc
[<ffffffc000a1c09c>] schedule+0x3c/0x94
[<ffffffc000a1eb84>] rwsem_down_write_failed+0x214/0x350
[<ffffffc000a1e32c>] down_write+0x64/0x80
[<ffffffc00021f794>] __ksm_exit+0x90/0x19c
[<ffffffc0000be650>] mmput+0x118/0x11c
[<ffffffc0000c3ec4>] do_exit+0x2dc/0xa74
[<ffffffc0000c46f8>] do_group_exit+0x4c/0xe4
[<ffffffc0000d0f34>] get_signal+0x444/0x5e0
[<ffffffc000089fcc>] do_signal+0x1d8/0x450
[<ffffffc00008a35c>] do_notify_resume+0x70/0x78
The oom victim cannot terminate because it needs to take mmap_sem for
write while the lock is held by ksmd for read which loops in the page
allocator
ksm_do_scan
scan_get_next_rmap_item
down_read
get_next_rmap_item
alloc_rmap_item #ksmd will loop permanently.
There is no way forward because the oom victim cannot release any memory
in 4.1 based kernel. Since 4.6 we have the oom reaper which would solve
this problem because it would release the memory asynchronously.
Nevertheless we can relax alloc_rmap_item requirements and use
__GFP_NORETRY because the allocation failure is acceptable as ksm_do_scan
would just retry later after the lock got dropped.
Such a patch would be also easy to backport to older stable kernels which
do not have oom_reaper.
While we are at it add GFP_NOWARN so the admin doesn't have to be alarmed
by the allocation failure.
Link: http://lkml.kernel.org/r/1474165570-44398-1-git-send-email-zhongjiang@huawei.com
Signed-off-by: zhong jiang <zhongjiang@...wei.com>
Suggested-by: Hugh Dickins <hughd@...gle.com>
Suggested-by: Michal Hocko <mhocko@...e.cz>
Acked-by: Michal Hocko <mhocko@...e.com>
Acked-by: Hugh Dickins <hughd@...gle.com>
Signed-off-by: Andrew Morton <akpm@...ux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@...ux-foundation.org>
Signed-off-by: Willy Tarreau <w@....eu>
---
mm/ksm.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/mm/ksm.c b/mm/ksm.c
index 7bf748f..d1b19b9 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -283,7 +283,8 @@ static inline struct rmap_item *alloc_rmap_item(void)
{
struct rmap_item *rmap_item;
- rmap_item = kmem_cache_zalloc(rmap_item_cache, GFP_KERNEL);
+ rmap_item = kmem_cache_zalloc(rmap_item_cache, GFP_KERNEL |
+ __GFP_NORETRY | __GFP_NOWARN);
if (rmap_item)
ksm_rmap_items++;
return rmap_item;
--
2.8.0.rc2.1.gbe9624a
Powered by blists - more mailing lists