lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 16 Jan 2020 09:31:00 +0800
From:   Wei Yang <richardw.yang@...ux.intel.com>
To:     hannes@...xchg.org, mhocko@...nel.org, vdavydov.dev@...il.com,
        akpm@...ux-foundation.org, ktkhai@...tuozzo.com,
        kirill.shutemov@...ux.intel.com, yang.shi@...ux.alibaba.com
Cc:     cgroups@...r.kernel.org, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, alexander.duyck@...il.com,
        rientjes@...gle.com, Wei Yang <richardw.yang@...ux.intel.com>,
        stable@...r.kernel.org
Subject: [Patch v3] mm: thp: grab the lock before manipulation defer list

As all the other places, we grab the lock before manipulate the defer list.
Current implementation may face a race condition.

For example, the potential race would be:

    CPU1                      CPU2
    mem_cgroup_move_account   deferred_split_huge_page
      list_empty
                                lock
                                list_empty
                                list_add_tail
                                unlock
      lock
      # list_empty might not hold anymore
      list_add_tail
      unlock

When this sequence happens, the list_add_tail() in
mem_cgroup_move_account() corrupt the list since which is already been
added to some split_queue in split_huge_page_to_list().

Besides this, David Rientjes points out the split_queue_len would be in
a wrong state, which would be a significant issue for shrinkers.

Fixes: 87eaceb3faa5 ("mm: thp: make deferred split shrinker memcg aware")

Signed-off-by: Wei Yang <richardw.yang@...ux.intel.com>
Cc: <stable@...r.kernel.org>    [5.4+]

---
v3:
  * remove all review/ack tag since rewrite the changelog
  * use deferred_split_huge_page as the example of race
  * add cc stable 5.4+ tag as suggested by David Rientjes

v2:
  * move check on compound outside suggested by Alexander
  * an example of the race condition, suggested by Michal
---
 mm/memcontrol.c | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index c5b5f74cfd4d..6450bbe394e2 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5360,10 +5360,12 @@ static int mem_cgroup_move_account(struct page *page,
 	}
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
-	if (compound && !list_empty(page_deferred_list(page))) {
+	if (compound) {
 		spin_lock(&from->deferred_split_queue.split_queue_lock);
-		list_del_init(page_deferred_list(page));
-		from->deferred_split_queue.split_queue_len--;
+		if (!list_empty(page_deferred_list(page))) {
+			list_del_init(page_deferred_list(page));
+			from->deferred_split_queue.split_queue_len--;
+		}
 		spin_unlock(&from->deferred_split_queue.split_queue_lock);
 	}
 #endif
@@ -5377,11 +5379,13 @@ static int mem_cgroup_move_account(struct page *page,
 	page->mem_cgroup = to;
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
-	if (compound && list_empty(page_deferred_list(page))) {
+	if (compound) {
 		spin_lock(&to->deferred_split_queue.split_queue_lock);
-		list_add_tail(page_deferred_list(page),
-			      &to->deferred_split_queue.split_queue);
-		to->deferred_split_queue.split_queue_len++;
+		if (list_empty(page_deferred_list(page))) {
+			list_add_tail(page_deferred_list(page),
+				      &to->deferred_split_queue.split_queue);
+			to->deferred_split_queue.split_queue_len++;
+		}
 		spin_unlock(&to->deferred_split_queue.split_queue_lock);
 	}
 #endif
-- 
2.17.1

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ