Message-ID: <cover.1769086312.git.lorenzo.stoakes@oracle.com>
Date: Thu, 22 Jan 2026 13:01:52 +0000
From: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: David Hildenbrand <david@...nel.org>,
"Liam R . Howlett" <Liam.Howlett@...cle.com>,
Vlastimil Babka <vbabka@...e.cz>, Mike Rapoport <rppt@...nel.org>,
Suren Baghdasaryan <surenb@...gle.com>, Michal Hocko <mhocko@...e.com>,
Shakeel Butt <shakeel.butt@...ux.dev>, Jann Horn <jannh@...gle.com>,
linux-mm@...ck.org, linux-kernel@...r.kernel.org,
linux-rt-devel@...ts.linux.dev, Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>, Will Deacon <will@...nel.org>,
Boqun Feng <boqun.feng@...il.com>, Waiman Long <longman@...hat.com>,
Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
Clark Williams <clrkwllms@...nel.org>,
Steven Rostedt <rostedt@...dmis.org>
Subject: [PATCH RESEND v3 00/10] mm: add and use vma_assert_stabilised() helper
Sometimes we wish to assert that a VMA is stable, that is - the VMA cannot
be changed underneath us. This will be the case if EITHER the VMA lock or
the mmap lock is held.
We already open-code this in two places - anon_vma_name() in mm/madvise.c
and vma_flag_set_atomic() in include/linux/mm.h.
This series adds vma_assert_stabilised(), which abstracts this and can be
used in these callsites instead.
This implementation uses lockdep where possible - that is VMA read locks -
which correctly track read lock acquisition/release via:
	vma_start_read() ->
		rwsem_acquire_read()
	vma_start_read_locked() ->
		vma_start_read_locked_nested() ->
			rwsem_acquire_read()
And:
	vma_end_read() ->
		vma_refcount_put() ->
			rwsem_release()
We don't track VMA write locks using lockdep, however these are predicated
upon mmap write locks whose lockdep state we do track. Additionally,
vma_assert_stabilised() asserts that the mmap lock is held if the VMA read
lock is not held, so we get lockdep coverage in this case also.
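As a rough illustration of the check order described above, here is a
minimal userspace sketch. It is not the kernel implementation: struct
model_mm, struct model_vma and the model_*() functions are hypothetical
names invented for this example, plain booleans stand in for lockdep state,
and the races discussed in the patch itself are ignored entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; these are NOT the kernel's types. */
struct model_mm {
	bool mmap_lock_held;	/* stands in for holding the mmap lock */
};

struct model_vma {
	struct model_mm *mm;
	bool read_lock_held;	/* stands in for holding the VMA read lock */
};

/* In this model, a VMA is stable if EITHER lock is held. */
static bool model_vma_is_stabilised(const struct model_vma *vma)
{
	/* A held VMA read lock is sufficient on its own. */
	if (vma->read_lock_held)
		return true;

	/*
	 * Otherwise fall back to the mmap lock. A VMA write lock is
	 * predicated upon the mmap write lock, so this covers writers too.
	 */
	return vma->mm->mmap_lock_held;
}

/* Assert-style wrapper, mirroring how such a helper would be called. */
static void model_vma_assert_stabilised(const struct model_vma *vma)
{
	assert(model_vma_is_stabilised(vma));
}
```

A caller that relies on the VMA not changing would invoke
model_vma_assert_stabilised() on entry, without needing to know which of
the two locks the caller's caller happened to take.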
We also add extensive comments to describe what we're doing.
There's some tricky stuff around mmap locking and stabilisation races that
we have to be careful of that I describe in the patch introducing
vma_assert_stabilised().
This change also lays the foundation for future series to add this assert
in further places where we wish to make it clear that we rely upon a
stabilised VMA.
The motivation for this change was precisely this.
Additionally, refactor the VMA lock logic to be clearer, less confusing,
self-documenting as far as possible and more easily extendable and
debuggable in future.
v3:
* Added 8 patches of refactoring the VMA lock implementation :)
* Dropped the vma_is_*locked() predicates as too difficult to get entirely
right.
* Updated vma_assert_locked() to assert what we sensibly can, use lockdep
if possible and invoke vma_assert_write_locked() to share code as before.
* Took into account extensive feedback received from Vlastimil (thanks! :)
v2:
* Added lockdep as much as possible to the mix as per Peter and Sebastian.
* Added comments to make clear what we're doing in each case.
* I realise I made a mistake in saying the previous duplicative VMA stable
asserts were wrong - vma_assert_locked() is not a no-op if
!CONFIG_PER_VMA_LOCK, instead it degrades to asserting that the mmap lock
is held, so this is correct, though means we'd have checked this twice,
only triggering an assert the second time.
* Accounted for is_vma_writer_only() case in vma_is_read_locked().
* Accounted for two hideous issues - we cannot check VMA lock first,
because we may be holding a VMA write lock and be raced by VMA readers of
_other_ VMAs. If we check the mmap lock first and assert, we may hold a
VMA read lock and race other threads which hold the mmap read lock and
fail an assert. We resolve this by a precise mmap ownership check if
lockdep is used, and allowing the check to be approximate if no lockdep.
* Added more comments and updated commit logs.
* Dropped Suren's Suggested-by as significant changes in this set (this was for
the vma_is_read_locked() as a concept).
https://lore.kernel.org/all/cover.1768855783.git.lorenzo.stoakes@oracle.com/
v1:
https://lore.kernel.org/all/cover.1768569863.git.lorenzo.stoakes@oracle.com/
Lorenzo Stoakes (10):
mm/vma: rename VMA_LOCK_OFFSET to VM_REFCNT_EXCLUDE_READERS_FLAG
mm/vma: document possible vma->vm_refcnt values and reference comment
mm/vma: rename is_vma_write_only(), separate out shared refcount put
mm/vma: add+use vma lockdep acquire/release defines
mm/vma: de-duplicate __vma_enter_locked() error path
mm/vma: clean up __vma_enter/exit_locked()
mm/vma: introduce helper struct + thread through exclusive lock fns
mm/vma: improve and document __is_vma_write_locked()
mm/vma: update vma_assert_locked() to use lockdep
mm/vma: add and use vma_assert_stabilised()
include/linux/mm.h | 5 +-
include/linux/mm_types.h | 54 ++++++++-
include/linux/mmap_lock.h | 223 ++++++++++++++++++++++++++++++++++----
mm/madvise.c | 4 +-
mm/mmap_lock.c | 180 ++++++++++++++++++++----------
5 files changed, 373 insertions(+), 93 deletions(-)
--
2.52.0