[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20241216211520.GB9803@noisy.programming.kicks-ass.net>
Date: Mon, 16 Dec 2024 22:15:20 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Suren Baghdasaryan <surenb@...gle.com>
Cc: akpm@...ux-foundation.org, willy@...radead.org, liam.howlett@...cle.com,
lorenzo.stoakes@...cle.com, mhocko@...e.com, vbabka@...e.cz,
hannes@...xchg.org, mjguzik@...il.com, oliver.sang@...el.com,
mgorman@...hsingularity.net, david@...hat.com, peterx@...hat.com,
oleg@...hat.com, dave@...olabs.net, paulmck@...nel.org,
brauner@...nel.org, dhowells@...hat.com, hdanton@...a.com,
hughd@...gle.com, lokeshgidra@...gle.com, minchan@...gle.com,
jannh@...gle.com, shakeel.butt@...ux.dev, souravpanda@...gle.com,
pasha.tatashin@...een.com, klarasmodin@...il.com, corbet@....net,
linux-doc@...r.kernel.org, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, kernel-team@...roid.com
Subject: Re: [PATCH v6 10/16] mm: replace vm_lock and detached flag with a
reference count
On Mon, Dec 16, 2024 at 11:24:13AM -0800, Suren Baghdasaryan wrote:
FWIW, I find the whole VMA_STATE_{A,DE}TATCHED thing awkward. And
perhaps s/VMA_STATE_LOCKED/VMA_LOCK_OFFSET/ ?
Also, perhaps:
#define VMA_REF_LIMIT (VMA_LOCK_OFFSET - 2)
> @@ -699,10 +700,27 @@ static inline void vma_numab_state_free(struct vm_area_struct *vma) {}
> #ifdef CONFIG_PER_VMA_LOCK
> static inline void vma_lock_init(struct vm_area_struct *vma)
> {
> - init_rwsem(&vma->vm_lock.lock);
> +#ifdef CONFIG_DEBUG_LOCK_ALLOC
> + static struct lock_class_key lockdep_key;
> +
> + lockdep_init_map(&vma->vmlock_dep_map, "vm_lock", &lockdep_key, 0);
> +#endif
> + refcount_set(&vma->vm_refcnt, VMA_STATE_DETACHED);
> vma->vm_lock_seq = UINT_MAX;
Depending on how you do the actual allocation (GFP_ZERO) you might want
to avoid that vm_refcount store entirely.
Perhaps instead write: VM_WARN_ON(refcount_read(&vma->vm_refcnt));
> @@ -813,25 +849,42 @@ static inline void vma_assert_write_locked(struct vm_area_struct *vma)
>
> static inline void vma_assert_locked(struct vm_area_struct *vma)
> {
> - if (!rwsem_is_locked(&vma->vm_lock.lock))
> + if (refcount_read(&vma->vm_refcnt) <= VMA_STATE_ATTACHED)
if (is_vma_detached(vma))
> vma_assert_write_locked(vma);
> }
>
> -static inline void vma_mark_attached(struct vm_area_struct *vma)
> +/*
> + * WARNING: to avoid racing with vma_mark_attached(), should be called either
> + * under mmap_write_lock or when the object has been isolated under
> + * mmap_write_lock, ensuring no competing writers.
> + */
> +static inline bool is_vma_detached(struct vm_area_struct *vma)
> {
> - vma->detached = false;
> + return refcount_read(&vma->vm_refcnt) == VMA_STATE_DETACHED;
return !refcount_read(&vma->vm_refcnt);
> }
>
> -static inline void vma_mark_detached(struct vm_area_struct *vma)
> +static inline void vma_mark_attached(struct vm_area_struct *vma)
> {
> - /* When detaching vma should be write-locked */
> vma_assert_write_locked(vma);
> - vma->detached = true;
> +
> + if (is_vma_detached(vma))
> + refcount_set(&vma->vm_refcnt, VMA_STATE_ATTACHED);
Urgh, so it would be really good to not call this at all them not 0.
I've not tried to untangle the mess, but this is really awkward. Surely
you don't add it to the mas multiple times either.
Also:
refcount_set(&vma->vm_refcnt, 1);
is so much clearer.
That is, should this not live in vma_iter_store*(), right before
mas_store_gfp() ?
Also, ISTR having to set vm_lock_seq right before it?
> }
>
> -static inline bool is_vma_detached(struct vm_area_struct *vma)
> +static inline void vma_mark_detached(struct vm_area_struct *vma)
> {
> - return vma->detached;
> + vma_assert_write_locked(vma);
> +
> + if (is_vma_detached(vma))
> + return;
Again, this just reads like confusion :/ Surely you don't have the same
with mas_detach?
> +
> + /* We are the only writer, so no need to use vma_refcount_put(). */
> + if (!refcount_dec_and_test(&vma->vm_refcnt)) {
> + /*
> + * Reader must have temporarily raised vm_refcnt but it will
> + * drop it without using the vma since vma is write-locked.
> + */
> + }
> }
Powered by blists - more mailing lists