Message-ID: <30f843d9-03cf-4c7c-8a29-8e11b12e47e4@suse.cz>
Date: Tue, 20 Jan 2026 14:53:30 +0100
From: Vlastimil Babka <vbabka@...e.cz>
To: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
Andrew Morton <akpm@...ux-foundation.org>
Cc: David Hildenbrand <david@...nel.org>,
"Liam R . Howlett" <Liam.Howlett@...cle.com>, Mike Rapoport
<rppt@...nel.org>, Suren Baghdasaryan <surenb@...gle.com>,
Michal Hocko <mhocko@...e.com>, Shakeel Butt <shakeel.butt@...ux.dev>,
Jann Horn <jannh@...gle.com>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, linux-rt-devel@...ts.linux.dev,
Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>,
Will Deacon <will@...nel.org>, Boqun Feng <boqun.feng@...il.com>,
Waiman Long <longman@...hat.com>,
Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
Clark Williams <clrkwllms@...nel.org>, Steven Rostedt <rostedt@...dmis.org>
Subject: Re: [PATCH v2 1/2] mm/vma: use lockdep where we can, reduce
duplication
On 1/19/26 21:59, Lorenzo Stoakes wrote:
> We introduce vma_is_read_locked(), which must deal with the case in which
> VMA write lock sets refcnt to VMA_LOCK_OFFSET or VMA_LOCK_OFFSET +
> 1. Luckily is_vma_writer_only() already exists which we can use to check
> this.
So I think there's a bit of a caveat in that:
- is_vma_writer_only() may be a false positive if there is a temporary
  reader of a detached vma (per comments in vma_mark_detached())
- hence vma_is_read_locked() may be a false negative
- hence vma_assert_locked() might wrongly assume that we should not assert
  being a reader, so we vma_assert_write_locked() instead, and fail
However, the above should mean that only we could be the temporary reader.
And we are not going to use vma_assert_locked() during the temporary reader
part (in vma_start_read()).
So it's probably fine, but maybe worth some comments to prevent people
getting suspicious and reconstructing this?
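To make the ambiguity concrete, here is a minimal userspace sketch, not
kernel code. It assumes VMA_LOCK_OFFSET is 0x40000000, that an attached vma
contributes 1 to the refcount and each reader adds 1, and it models
is_vma_writer_only() purely from the description in the quoted commit
message. The model_*() names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed value; the kernel defines VMA_LOCK_OFFSET in its own headers. */
#define VMA_LOCK_OFFSET 0x40000000u

/*
 * Model of the writer-only check as described in the commit message:
 * a write lock with no readers leaves refcnt at VMA_LOCK_OFFSET
 * (detached) or VMA_LOCK_OFFSET + 1 (attached).
 */
static bool model_is_vma_writer_only(unsigned int refcnt)
{
	return refcnt == VMA_LOCK_OFFSET || refcnt == VMA_LOCK_OFFSET + 1;
}

/* Mirrors the logic of vma_is_read_locked() from the quoted patch. */
static bool model_vma_is_read_locked(unsigned int refcnt)
{
	return refcnt > 1 && !model_is_vma_writer_only(refcnt);
}

/* Hypothetical helper: build a refcount from the assumed components. */
static unsigned int model_refcnt(bool attached, bool writer,
				 unsigned int readers)
{
	return (attached ? 1u : 0u) + (writer ? VMA_LOCK_OFFSET : 0u) + readers;
}

/* Walks through the ambiguous case; asserts document the expected values. */
static void model_demo(void)
{
	/* Attached vma, writer, no readers. */
	unsigned int writer_only = model_refcnt(true, true, 0);
	/* Detached vma, writer, one temporary reader. */
	unsigned int temp_reader = model_refcnt(false, true, 1);

	/* Both states produce the same refcount value... */
	assert(writer_only == temp_reader);
	/* ...so the temporary reader is not reported as a read lock. */
	assert(!model_vma_is_read_locked(temp_reader));
}
```

In this model the two states are indistinguishable from the refcount alone,
which is the false negative described above.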
But I think perhaps also vma_assert_locked() could, with lockdep enabled
(similarly to vma_assert_stabilised() in patch 2), use the
"lock_is_held(&vma->vmlock_dep_map)" condition (without immediately
asserting it) for the primary reader vs writer decision, and not rely on
vma_is_read_locked()? Because lockdep has the precise information.
It would likely make things more ugly, or require more refactoring, but
hopefully worthwhile?
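A rough userspace sketch of the difference between the two decision rules,
under stated assumptions: the read_held_by_current flag is a stand-in for
what lock_is_held(&vma->vmlock_dep_map) would report, VMA_LOCK_OFFSET is
assumed to be 0x40000000, and struct model_vma, the enum and the path_by_*()
helpers are hypothetical names, not anything in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed value; the kernel defines VMA_LOCK_OFFSET in its own headers. */
#define VMA_LOCK_OFFSET 0x40000000u

/* Which assertion vma_assert_locked() would go on to perform. */
enum vma_assert_path { ASSERT_READ_LOCKDEP, ASSERT_WRITE };

struct model_vma {
	unsigned int refcnt;
	/* Stand-in for lock_is_held(&vma->vmlock_dep_map). */
	bool read_held_by_current;
};

/* Decision as in the quoted patch: inferred from the refcount value. */
static enum vma_assert_path path_by_refcnt(const struct model_vma *vma)
{
	unsigned int r = vma->refcnt;
	bool writer_only = r == VMA_LOCK_OFFSET || r == VMA_LOCK_OFFSET + 1;

	return (r > 1 && !writer_only) ? ASSERT_READ_LOCKDEP : ASSERT_WRITE;
}

/* Suggested alternative: let the lockdep state pick the branch. */
static enum vma_assert_path path_by_lockdep(const struct model_vma *vma)
{
	return vma->read_held_by_current ? ASSERT_READ_LOCKDEP : ASSERT_WRITE;
}

/* Compares the two rules on the ambiguous refcount value. */
static void path_demo(void)
{
	struct model_vma ambiguous = {
		.refcnt = VMA_LOCK_OFFSET + 1,
		.read_held_by_current = true,
	};

	/* The refcount rule picks the write assertion... */
	assert(path_by_refcnt(&ambiguous) == ASSERT_WRITE);
	/* ...while the lockdep rule picks the read assertion. */
	assert(path_by_lockdep(&ambiguous) == ASSERT_READ_LOCKDEP);
}
```

The sketch only covers the branch selection; without lockdep there is no
ownership information, so a refcount-based fallback would still be needed.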
> We then try to make vma_assert_locked() use lockdep as far as we can.
>
> Unfortunately the VMA lock implementation does not even try to track VMA
> write locks using lockdep, so we cannot track the lock this way.
>
> This is less egregious than it might seem as VMA write locks are predicated
> on mmap write locks, which we do lockdep assert.
>
> vma_assert_write_locked() already asserts the mmap write lock is taken so
> we get that checked implicitly.
> However for read locks we do indeed use lockdep, via rwsem_acquire_read()
> called in vma_start_read() and rwsem_release_read() called in
> vma_refcount_put() called in turn by vma_end_read().
>
> Therefore we perform a lockdep assertion if the VMA is known to be
> read-locked.
>
> If it is write-locked, we assert the mmap lock instead, with a lockdep
> check if lockdep is enabled.
>
> If lockdep is not enabled, we just check that locks are in place.
>
> Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
> ---
> include/linux/mmap_lock.h | 34 ++++++++++++++++++++++++++++++----
> 1 file changed, 30 insertions(+), 4 deletions(-)
>
> diff --git a/include/linux/mmap_lock.h b/include/linux/mmap_lock.h
> index b50416fbba20..6979222882f1 100644
> --- a/include/linux/mmap_lock.h
> +++ b/include/linux/mmap_lock.h
> @@ -236,6 +236,13 @@ int vma_start_write_killable(struct vm_area_struct *vma)
> return __vma_start_write(vma, mm_lock_seq, TASK_KILLABLE);
> }
>
> +static inline bool vma_is_read_locked(const struct vm_area_struct *vma)
> +{
> + const unsigned int refcnt = refcount_read(&vma->vm_refcnt);
> +
> + return refcnt > 1 && !is_vma_writer_only(refcnt);
> +}
> +
> static inline void vma_assert_write_locked(struct vm_area_struct *vma)
> {
> unsigned int mm_lock_seq;
> @@ -243,12 +250,31 @@ static inline void vma_assert_write_locked(struct vm_area_struct *vma)
> VM_BUG_ON_VMA(!__is_vma_write_locked(vma, &mm_lock_seq), vma);
> }
>
> +/**
> + * vma_assert_locked() - Assert that @vma is either read or write locked and
> + * that we have ownership of that lock (if lockdep is enabled).
> + * @vma: The VMA we assert.
> + *
> + * If lockdep is enabled, we ensure ownership of the VMA lock. Otherwise we
> + * assert that we are VMA write-locked, which implicitly asserts that we hold
> + * the mmap write lock.
> + */
> static inline void vma_assert_locked(struct vm_area_struct *vma)
> {
> - unsigned int mm_lock_seq;
> -
> - VM_BUG_ON_VMA(refcount_read(&vma->vm_refcnt) <= 1 &&
> - !__is_vma_write_locked(vma, &mm_lock_seq), vma);
> + /*
> + * VMA locks currently only utilise lockdep for read locks, as
> + * vma_end_write_all() releases an unknown number of VMA write locks and
> + * we don't currently walk the maple tree to identify which locks are
> + * released even under CONFIG_LOCKDEP.
> + *
> + * However, VMA write locks are predicated on an mmap write lock, which
> + * we DO track under lockdep, and which vma_assert_write_locked()
> + * asserts.
> + */
> + if (vma_is_read_locked(vma))
> + lockdep_assert(lock_is_held(&vma->vmlock_dep_map));
> + else
> + vma_assert_write_locked(vma);
> }
>
> static inline bool vma_is_attached(struct vm_area_struct *vma)