[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <20240805-fix-vma-lock-type-confusion-v1-1-9f25443a9a71@google.com>
Date: Mon, 05 Aug 2024 14:52:03 +0200
From: Jann Horn <jannh@...gle.com>
To: Andrew Morton <akpm@...ux-foundation.org>,
Suren Baghdasaryan <surenb@...gle.com>
Cc: Matthew Wilcox <willy@...radead.org>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, Jann Horn <jannh@...gle.com>
Subject: [PATCH] mm: fix (harmless) type confusion in lock_vma_under_rcu()
There is a (harmless) type confusion in lock_vma_under_rcu():
After vma_start_read(), we have taken the VMA lock but don't know yet
whether the VMA has already been detached and scheduled for RCU freeing.
At this point, ->vm_start and ->vm_end are accessed.
vm_area_struct contains a union such that ->vm_rcu uses the same memory as
->vm_start and ->vm_end; so accessing ->vm_start and ->vm_end of a detached
VMA is illegal and leads to type confusion between union members.
Fix it by reordering the vma->detached check above the address checks, and
document the rules for RCU readers accessing VMAs.
This will probably change the number of observed VMA_LOCK_MISS events
(since previously, trying to access a detached VMA whose ->vm_rcu has been
scheduled would bail out when checking the fault address against the
rcu_head members reinterpreted as VMA bounds).
Fixes: 50ee32537206 ("mm: introduce lock_vma_under_rcu to be used from arch-specific code")
Signed-off-by: Jann Horn <jannh@...gle.com>
---
sidenote: I'm not entirely sure why we handle detached VMAs and moved
VMAs differently here (detached VMAs retry, moved VMAs bail out), but
that's kinda out of scope of this patch.
---
include/linux/mm_types.h | 15 +++++++++++++--
mm/memory.c | 14 ++++++++++----
2 files changed, 23 insertions(+), 6 deletions(-)
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 485424979254..498cdf3121d0 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -657,58 +657,69 @@ struct vma_numab_state {
/*
* This struct describes a virtual memory area. There is one of these
* per VM-area/task. A VM area is any part of the process virtual memory
* space that has a special rule for the page-fault handlers (ie a shared
* library, the executable area etc).
+ *
+ * Only explicitly marked struct members may be accessed by RCU readers before
+ * getting a stable reference.
*/
struct vm_area_struct {
/* The first cache line has the info for VMA tree walking. */
union {
struct {
/* VMA covers [vm_start; vm_end) addresses within mm */
unsigned long vm_start;
unsigned long vm_end;
};
#ifdef CONFIG_PER_VMA_LOCK
struct rcu_head vm_rcu; /* Used for deferred freeing. */
#endif
};
- struct mm_struct *vm_mm; /* The address space we belong to. */
+ /*
+ * The address space we belong to.
+ * Unstable RCU readers are allowed to read this.
+ */
+ struct mm_struct *vm_mm;
pgprot_t vm_page_prot; /* Access permissions of this VMA. */
/*
* Flags, see mm.h.
* To modify use vm_flags_{init|reset|set|clear|mod} functions.
*/
union {
const vm_flags_t vm_flags;
vm_flags_t __private __vm_flags;
};
#ifdef CONFIG_PER_VMA_LOCK
- /* Flag to indicate areas detached from the mm->mm_mt tree */
+ /*
+ * Flag to indicate areas detached from the mm->mm_mt tree.
+ * Unstable RCU readers are allowed to read this.
+ */
bool detached;
/*
* Can only be written (using WRITE_ONCE()) while holding both:
* - mmap_lock (in write mode)
* - vm_lock->lock (in write mode)
* Can be read reliably while holding one of:
* - mmap_lock (in read or write mode)
* - vm_lock->lock (in read or write mode)
* Can be read unreliably (using READ_ONCE()) for pessimistic bailout
* while holding nothing (except RCU to keep the VMA struct allocated).
*
* This sequence counter is explicitly allowed to overflow; sequence
* counter reuse can only lead to occasional unnecessary use of the
* slowpath.
*/
int vm_lock_seq;
+ /* Unstable RCU readers are allowed to read this. */
struct vma_lock *vm_lock;
#endif
/*
* For areas with an address space and backing store,
* linkage into the address_space->i_mmap interval tree.
diff --git a/mm/memory.c b/mm/memory.c
index 34f8402d2046..3f4232b985a1 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -5996,23 +5996,29 @@ struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm,
if (!vma)
goto inval;
if (!vma_start_read(vma))
goto inval;
- /* Check since vm_start/vm_end might change before we lock the VMA */
- if (unlikely(address < vma->vm_start || address >= vma->vm_end))
- goto inval_end_read;
-
/* Check if the VMA got isolated after we found it */
if (vma->detached) {
vma_end_read(vma);
count_vm_vma_lock_event(VMA_LOCK_MISS);
/* The area was replaced with another one */
goto retry;
}
+ /*
+ * At this point, we have a stable reference to a VMA: The VMA is
+ * locked and we know it hasn't already been isolated.
+ * From here on, we can access the VMA without worrying about which
+ * fields are accessible for RCU readers.
+ */
+
+ /* Check since vm_start/vm_end might change before we lock the VMA */
+ if (unlikely(address < vma->vm_start || address >= vma->vm_end))
+ goto inval_end_read;
rcu_read_unlock();
return vma;
inval_end_read:
vma_end_read(vma);
---
base-commit: de9c2c66ad8e787abec7c9d7eff4f8c3cdd28aed
change-id: 20240805-fix-vma-lock-type-confusion-0a956d9d31ae
--
Jann Horn <jannh@...gle.com>
Powered by blists - more mailing lists