[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <b5e16d20-0ed6-401f-9c01-8efa35c49f64@lucifer.local>
Date: Fri, 23 Jan 2026 13:45:02 +0000
From: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
To: Vlastimil Babka <vbabka@...e.cz>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
David Hildenbrand <david@...nel.org>,
"Liam R . Howlett" <Liam.Howlett@...cle.com>,
Mike Rapoport <rppt@...nel.org>,
Suren Baghdasaryan <surenb@...gle.com>, Michal Hocko <mhocko@...e.com>,
Shakeel Butt <shakeel.butt@...ux.dev>, Jann Horn <jannh@...gle.com>,
linux-mm@...ck.org, linux-kernel@...r.kernel.org,
linux-rt-devel@...ts.linux.dev, Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>, Will Deacon <will@...nel.org>,
Boqun Feng <boqun.feng@...il.com>, Waiman Long <longman@...hat.com>,
Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
Clark Williams <clrkwllms@...nel.org>,
Steven Rostedt <rostedt@...dmis.org>
Subject: Re: [PATCH RESEND v3 02/10] mm/vma: document possible vma->vm_refcnt
values and reference comment
On Thu, Jan 22, 2026 at 05:48:07PM +0100, Vlastimil Babka wrote:
> On 1/22/26 14:01, Lorenzo Stoakes wrote:
> > The possible vma->vm_refcnt values are confusing and vague, explain in
> > detail what these can be in a comment describing the vma->vm_refcnt field
> > and reference this comment in various places that read/write this field.
> >
> > No functional change intended.
> >
> > Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
>
> Thanks, very useful. Forgive my nitpicks :) It's because it's tricky so best
> try to be as precise as possible, I believe.
Ack.
>
> > ---
> > include/linux/mm_types.h | 39 +++++++++++++++++++++++++++++++++++++--
> > include/linux/mmap_lock.h | 7 +++++++
> > mm/mmap_lock.c | 6 ++++++
> > 3 files changed, 50 insertions(+), 2 deletions(-)
> >
> > diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> > index 94de392ed3c5..e5ee66f84d9a 100644
> > --- a/include/linux/mm_types.h
> > +++ b/include/linux/mm_types.h
> > @@ -758,7 +758,8 @@ static inline struct anon_vma_name *anon_vma_name_alloc(const char *name)
> > * set the VM_REFCNT_EXCLUDE_READERS_FLAG in vma->vm_refcnt to indiciate to
> > * vma_start_read() that the reference count should be left alone.
> > *
> > - * Once the operation is complete, this value is subtracted from vma->vm_refcnt.
> > + * See the comment describing vm_refcnt in vm_area_struct for details as to
> > + * which values the VMA reference count can be.
> > */
> > #define VM_REFCNT_EXCLUDE_READERS_BIT (30)
> > #define VM_REFCNT_EXCLUDE_READERS_FLAG (1U << VM_REFCNT_EXCLUDE_READERS_BIT)
> > @@ -989,7 +990,41 @@ struct vm_area_struct {
> > struct vma_numab_state *numab_state; /* NUMA Balancing state */
> > #endif
> > #ifdef CONFIG_PER_VMA_LOCK
> > - /* Unstable RCU readers are allowed to read this. */
> > + /*
> > + * Used to keep track of the number of references taken by VMA read or
> > + * write locks. May have the VM_REFCNT_EXCLUDE_READERS_FLAG set
>
> I wonder about the "or write locks" part. The process of acquiring it uses
> VM_REFCNT_EXCLUDE_READERS_FLAG but then the writer doesn't hold a 1
> refcount? (the sentence could be read it way IMHO) It's vma being attached
> that does, AFAIK?
Right the intent is to say that, in the process of excluding readers, a write
lock can vary the reference count.
It's a pity we can't describe the refcnt in some sensible, logical way as it's
really being overloaded quite a bit for multiple things.
It really isn't actually keeping track of references (other than read locks
taken).
OK so I have updated this to say:
* Used to keep track of firstly, whether the VMA is attached, secondly,
* if attached, how many read locks are taken, and thirdly, if the
* VM_REFCNT_EXCLUDE_READERS_FLAG is set, whether any read locks held
* are currently in the process of being excluded.
>
> > + * indicating that a thread has entered __vma_enter_locked() and is
> > + * waiting on any outstanding read locks to exit.
> > + *
> > + * This value can be equal to:
> > + *
> > + * 0 - Detached.
>
> Is it worth saying that readers can't increment the refcount?
Added, updated to say:
* 0 - Detached. IMPORTANT: when the refcnt is zero, readers cannot
* increment it.
>
> > + * 1 - Unlocked or write-locked.
>
> "Attached and either unlocked or write-locked." ?
>
> (see how "write-locked" isn't reflected, I argued above)
I'm not sure what you mean, a write lock requires the VMA to be attached (or it
bails out on attempted write lock). So it's kind of implicit right?
But yes better to be explicit, so have replaced with your suggestion.
>
> > + *
> > + * >1, < VM_REFCNT_EXCLUDE_READERS_FLAG - Read-locked or (unlikely)
> > + * write-locked with other threads having temporarily incremented the
> > + * reference count prior to determining it is write-locked and
> > + * decrementing it again.
>
> Ack.
>
> > + * VM_REFCNT_EXCLUDE_READERS_FLAG - Detached, pending
> > + * __vma_exit_locked() completion which will decrement the reference
> > + * count to zero. IMPORTANT - at this stage no further readers can
> > + * increment the reference count. It can only be reduced.
> > + *
> > + * VM_REFCNT_EXCLUDE_READERS_FLAG + 1 - Either an attached VMA pending
> > + * __vma_exit_locked() completion which will decrement the reference
> > + * count to one, OR a detached VMA waiting on a single spurious reader
> > + * to decrement reference count. IMPORTANT - as above, no further
> > + * readers can increment the reference count.
> > + *
> > + * > VM_REFCNT_EXCLUDE_READERS_FLAG + 1 - VMA is waiting on readers,
>
> "VMA is waiting" sounds weird? a thread might be, but VMA itself?
> (similarly in the previous paragraph)
I was trying to make it a bit more succinct that way I think, but agreed
it's unclear. Have replaced with:
* VM_REFCNT_EXCLUDE_READERS_FLAG + 1 - A thread is either write-locking
* an attached VMA and has yet to invoke __vma_exit_locked(), OR a
* thread is detaching a VMA and is waiting on a single spurious reader
* in order to decrement the reference count. IMPORTANT - as above, no
* further readers can increment the reference count.
*
* > VM_REFCNT_EXCLUDE_READERS_FLAG + 1 - A thread either
* write-locking or detaching a VMA is waiting on readers to
* exit. IMPORTANT - as above, no ruther readers can increment the
* reference count.
Cheers, Lorenzo
Powered by blists - more mailing lists