[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20221114235507.294320-3-mike.kravetz@oracle.com>
Date: Mon, 14 Nov 2022 15:55:06 -0800
From: Mike Kravetz <mike.kravetz@...cle.com>
To: linux-mm@...ck.org, linux-kernel@...r.kernel.org
Cc: Naoya Horiguchi <naoya.horiguchi@...ux.dev>,
David Hildenbrand <david@...hat.com>,
Axel Rasmussen <axelrasmussen@...gle.com>,
Mina Almasry <almasrymina@...gle.com>,
Peter Xu <peterx@...hat.com>,
Nadav Amit <nadav.amit@...il.com>,
Rik van Riel <riel@...riel.com>,
Vlastimil Babka <vbabka@...e.cz>,
Matthew Wilcox <willy@...radead.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Mike Kravetz <mike.kravetz@...cle.com>,
Wei Chen <harperchen1110@...il.com>, stable@...r.kernel.org
Subject: [PATCH v10 2/3] hugetlb: don't delete vma_lock in hugetlb MADV_DONTNEED processing
madvise(MADV_DONTNEED) ends up calling zap_page_range() to clear page
tables associated with the address range. For hugetlb vmas,
zap_page_range will call __unmap_hugepage_range_final. However,
__unmap_hugepage_range_final assumes the passed vma is about to be removed
and deletes the vma_lock to prevent pmd sharing as the vma is on the way
out. In the case of madvise(MADV_DONTNEED) the vma remains, but the
missing vma_lock prevents pmd sharing and could potentially lead to issues
with truncation/fault races.
This issue was originally reported here [1] as a BUG triggered in
page_try_dup_anon_rmap. Prior to the introduction of the hugetlb
vma_lock, __unmap_hugepage_range_final cleared the VM_MAYSHARE flag to
prevent pmd sharing. Subsequent faults on this vma were confused as
VM_MAYSHARE indicates a sharable vma, but was not set so page_mapping was
not set in new pages added to the page table. This resulted in pages that
appeared anonymous in a VM_SHARED vma and triggered the BUG.
Address issue by adding a new zap flag ZAP_FLAG_UNMAP to indicate an unmap
call from unmap_vmas(). This is used to indicate the 'final' unmapping of
a hugetlb vma. When called via MADV_DONTNEED, this flag is not set and the
vm_lock is not deleted.
[1] https://lore.kernel.org/lkml/CAO4mrfdLMXsao9RF4fUE8-Wfde8xmjsKrTNMNC9wjUb6JudD0g@mail.gmail.com/
Fixes: 90e7e7f5ef3f ("mm: enable MADV_DONTNEED for hugetlb mappings")
Signed-off-by: Mike Kravetz <mike.kravetz@...cle.com>
Reported-by: Wei Chen <harperchen1110@...il.com>
Cc: <stable@...r.kernel.org>
---
include/linux/mm.h | 2 ++
mm/hugetlb.c | 27 ++++++++++++++++-----------
mm/memory.c | 2 +-
3 files changed, 19 insertions(+), 12 deletions(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index dd5a38682537..a4e24dd2d96e 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1886,6 +1886,8 @@ struct zap_details {
* default, the flag is not set.
*/
#define ZAP_FLAG_DROP_MARKER ((__force zap_flags_t) BIT(0))
+/* Set in unmap_vmas() to indicate a final unmap call. Only used by hugetlb */
+#define ZAP_FLAG_UNMAP ((__force zap_flags_t) BIT(1))
#ifdef CONFIG_MMU
extern bool can_do_mlock(void);
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 9d765364231e..7559b9dfe782 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5210,17 +5210,22 @@ void __unmap_hugepage_range_final(struct mmu_gather *tlb,
__unmap_hugepage_range(tlb, vma, start, end, ref_page, zap_flags);
- /*
- * Unlock and free the vma lock before releasing i_mmap_rwsem. When
- * the vma_lock is freed, this makes the vma ineligible for pmd
- * sharing. And, i_mmap_rwsem is required to set up pmd sharing.
- * This is important as page tables for this unmapped range will
- * be asynchrously deleted. If the page tables are shared, there
- * will be issues when accessed by someone else.
- */
- __hugetlb_vma_unlock_write_free(vma);
-
- i_mmap_unlock_write(vma->vm_file->f_mapping);
+ if (zap_flags & ZAP_FLAG_UNMAP) { /* final unmap */
+ /*
+ * Unlock and free the vma lock before releasing i_mmap_rwsem.
+ * When the vma_lock is freed, this makes the vma ineligible
+ * for pmd sharing. And, i_mmap_rwsem is required to set up
+ * pmd sharing. This is important as page tables for this
+ * unmapped range will be asynchrously deleted. If the page
+ * tables are shared, there will be issues when accessed by
+ * someone else.
+ */
+ __hugetlb_vma_unlock_write_free(vma);
+ i_mmap_unlock_write(vma->vm_file->f_mapping);
+ } else {
+ i_mmap_unlock_write(vma->vm_file->f_mapping);
+ hugetlb_vma_unlock_write(vma);
+ }
}
void unmap_hugepage_range(struct vm_area_struct *vma, unsigned long start,
diff --git a/mm/memory.c b/mm/memory.c
index a177f6bbfafc..6d77bc00bca1 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1673,7 +1673,7 @@ void unmap_vmas(struct mmu_gather *tlb, struct maple_tree *mt,
{
struct mmu_notifier_range range;
struct zap_details details = {
- .zap_flags = ZAP_FLAG_DROP_MARKER,
+ .zap_flags = ZAP_FLAG_DROP_MARKER | ZAP_FLAG_UNMAP,
/* Careful - we need to zap private pages too! */
.even_cows = true,
};
--
2.38.1
Powered by blists - more mailing lists