Message-Id: <20200116013100.7679-1-richardw.yang@linux.intel.com>
Date: Thu, 16 Jan 2020 09:31:00 +0800
From: Wei Yang <richardw.yang@...ux.intel.com>
To: hannes@...xchg.org, mhocko@...nel.org, vdavydov.dev@...il.com,
akpm@...ux-foundation.org, ktkhai@...tuozzo.com,
kirill.shutemov@...ux.intel.com, yang.shi@...ux.alibaba.com
Cc: cgroups@...r.kernel.org, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, alexander.duyck@...il.com,
rientjes@...gle.com, Wei Yang <richardw.yang@...ux.intel.com>,
stable@...r.kernel.org
Subject: [Patch v3] mm: thp: grab the lock before manipulation defer list
As in all the other places, we should grab the lock before manipulating the
deferred list. The current implementation checks list_empty() before taking
the lock, so it may hit a race condition.
For example, the potential race would be:
    CPU1                            CPU2
    mem_cgroup_move_account         deferred_split_huge_page
      list_empty
                                      lock
                                      list_empty
                                      list_add_tail
                                      unlock
      lock
      # list_empty might not hold anymore
      list_add_tail
      unlock
When this sequence happens, the list_add_tail() in
mem_cgroup_move_account() corrupts the list, since the page has already
been added to some split_queue in deferred_split_huge_page().
Besides this, David Rientjes points out that split_queue_len would end up
in a wrong state, which would be a significant issue for shrinkers.
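For illustration only, here is a minimal user-space sketch of the
check-under-lock pattern this patch applies. It is not kernel code: struct
list_node, struct split_queue, queue_add_if_unqueued() and run_demo() are
hypothetical stand-ins, and a pthread mutex takes the place of
split_queue_lock. The demo calls the helper twice on the same node, one
call after the other, standing in for the two paths that may both try to
queue the page.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct list_head. */
struct list_node {
	struct list_node *prev, *next;
};

static void list_node_init(struct list_node *n)
{
	n->prev = n;
	n->next = n;
}

/* A node that points to itself is not on any queue. */
static int list_node_empty(const struct list_node *n)
{
	return n->next == n;
}

static void list_node_add_tail(struct list_node *n, struct list_node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Hypothetical analogue of the deferred split queue. */
struct split_queue {
	pthread_mutex_t lock;
	struct list_node head;
	unsigned long len;
};

/*
 * The emptiness check and the list update happen under the same lock,
 * so a node that someone else already queued is never added twice and
 * len always matches the number of queued nodes.
 */
static void queue_add_if_unqueued(struct split_queue *q, struct list_node *n)
{
	pthread_mutex_lock(&q->lock);
	if (list_node_empty(n)) {
		list_node_add_tail(n, &q->head);
		q->len++;
	}
	pthread_mutex_unlock(&q->lock);
}

/* Queue the same node twice in sequence; return the resulting length. */
unsigned long run_demo(void)
{
	struct split_queue q;
	struct list_node page_node;
	unsigned long len;

	pthread_mutex_init(&q.lock, NULL);
	list_node_init(&q.head);
	q.len = 0;
	list_node_init(&page_node);

	queue_add_if_unqueued(&q, &page_node);	/* first path queues it */
	queue_add_if_unqueued(&q, &page_node);	/* second path is a no-op */

	/* The queue holds exactly one node and is still well formed. */
	assert(q.head.next == &page_node && q.head.prev == &page_node);
	len = q.len;
	pthread_mutex_destroy(&q.lock);
	return len;
}
```

With the check moved inside the lock, concurrent callers serialize on the
mutex and the second one sees a non-empty node, so the list is not
corrupted and the length counter stays consistent.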
Fixes: 87eaceb3faa5 ("mm: thp: make deferred split shrinker memcg aware")
Signed-off-by: Wei Yang <richardw.yang@...ux.intel.com>
Cc: <stable@...r.kernel.org> [5.4+]
---
v3:
* remove all Reviewed-by/Acked-by tags since the changelog was rewritten
* use deferred_split_huge_page as the example of the race
* add Cc stable 5.4+ tag, as suggested by David Rientjes
v2:
* move the check on compound outside the lock, as suggested by Alexander
* add an example of the race condition, as suggested by Michal
---
mm/memcontrol.c | 18 +++++++++++-------
1 file changed, 11 insertions(+), 7 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index c5b5f74cfd4d..6450bbe394e2 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5360,10 +5360,12 @@ static int mem_cgroup_move_account(struct page *page,
 	}
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
-	if (compound && !list_empty(page_deferred_list(page))) {
+	if (compound) {
 		spin_lock(&from->deferred_split_queue.split_queue_lock);
-		list_del_init(page_deferred_list(page));
-		from->deferred_split_queue.split_queue_len--;
+		if (!list_empty(page_deferred_list(page))) {
+			list_del_init(page_deferred_list(page));
+			from->deferred_split_queue.split_queue_len--;
+		}
 		spin_unlock(&from->deferred_split_queue.split_queue_lock);
 	}
 #endif
@@ -5377,11 +5379,13 @@ static int mem_cgroup_move_account(struct page *page,
 	page->mem_cgroup = to;
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
-	if (compound && list_empty(page_deferred_list(page))) {
+	if (compound) {
 		spin_lock(&to->deferred_split_queue.split_queue_lock);
-		list_add_tail(page_deferred_list(page),
-			      &to->deferred_split_queue.split_queue);
-		to->deferred_split_queue.split_queue_len++;
+		if (list_empty(page_deferred_list(page))) {
+			list_add_tail(page_deferred_list(page),
+				      &to->deferred_split_queue.split_queue);
+			to->deferred_split_queue.split_queue_len++;
+		}
 		spin_unlock(&to->deferred_split_queue.split_queue_lock);
 	}
 #endif
--
2.17.1