Message-ID: <c82d75d1-5795-4401-92f8-58df6ac8dbd3@lucifer.local>
Date: Fri, 21 Nov 2025 17:44:43 +0000
From: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: Muchun Song <muchun.song@...ux.dev>, Oscar Salvador <osalvador@...e.de>,
        David Hildenbrand <david@...hat.com>,
        "Liam R . Howlett" <Liam.Howlett@...cle.com>,
        Vlastimil Babka <vbabka@...e.cz>, Mike Rapoport <rppt@...nel.org>,
        Suren Baghdasaryan <surenb@...gle.com>, Michal Hocko <mhocko@...e.com>,
        Axel Rasmussen <axelrasmussen@...gle.com>,
        Yuanchu Xie <yuanchu@...gle.com>, Wei Xu <weixugc@...gle.com>,
        Peter Xu <peterx@...hat.com>, Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Juri Lelli <juri.lelli@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Steven Rostedt <rostedt@...dmis.org>, Ben Segall <bsegall@...gle.com>,
        Mel Gorman <mgorman@...e.de>, Valentin Schneider <vschneid@...hat.com>,
        Kees Cook <kees@...nel.org>, Matthew Wilcox <willy@...radead.org>,
        Jason Gunthorpe <jgg@...pe.ca>, John Hubbard <jhubbard@...dia.com>,
        Leon Romanovsky <leon@...nel.org>, Zi Yan <ziy@...dia.com>,
        Baolin Wang <baolin.wang@...ux.alibaba.com>,
        Nico Pache <npache@...hat.com>, Ryan Roberts <ryan.roberts@....com>,
        Dev Jain <dev.jain@....com>, Barry Song <baohua@...nel.org>,
        Lance Yang <lance.yang@...ux.dev>, Xu Xin <xu.xin16@....com.cn>,
        Chengming Zhou <chengming.zhou@...ux.dev>,
        Jann Horn <jannh@...gle.com>, Matthew Brost <matthew.brost@...el.com>,
        Joshua Hahn <joshua.hahnjy@...il.com>, Rakie Kim <rakie.kim@...com>,
        Byungchul Park <byungchul@...com>, Gregory Price <gourry@...rry.net>,
        Ying Huang <ying.huang@...ux.alibaba.com>,
        Alistair Popple <apopple@...dia.com>, Pedro Falcato <pfalcato@...e.de>,
        Shakeel Butt <shakeel.butt@...ux.dev>,
        David Rientjes <rientjes@...gle.com>, Rik van Riel <riel@...riel.com>,
        Harry Yoo <harry.yoo@...cle.com>,
        Kemeng Shi <shikemeng@...weicloud.com>,
        Kairui Song <kasong@...cent.com>, Nhat Pham <nphamcs@...il.com>,
        Baoquan He <bhe@...hat.com>, Chris Li <chrisl@...nel.org>,
        Johannes Weiner <hannes@...xchg.org>,
        Qi Zheng <zhengqi.arch@...edance.com>, linux-kernel@...r.kernel.org,
        linux-fsdevel@...r.kernel.org, linux-mm@...ck.org,
        Miguel Ojeda <ojeda@...nel.org>, Alex Gaynor <alex.gaynor@...il.com>,
        Boqun Feng <boqun.feng@...il.com>, Gary Guo <gary@...yguo.net>,
        Bjorn Roy Baron <bjorn3_gh@...tonmail.com>,
        Benno Lossin <lossin@...nel.org>,
        Andreas Hindborg <a.hindborg@...nel.org>,
        Alice Ryhl <aliceryhl@...gle.com>, Trevor Gross <tmgross@...ch.edu>,
        Danilo Krummrich <dakr@...nel.org>, rust-for-linux@...r.kernel.org
Subject: Re: [PATCH v2 4/4] mm: introduce VMA flags bitmap type

As Vlastimil noticed, something has gone fairly horribly wrong here in the
actual commit [0] vs. the patch here for tools/testing/vma/vma_internal.h.

We should only have the delta shown here; let me know if I need to help with a
conflict resolution! :)

Thanks, Lorenzo

[0]: https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git/commit/?h=mm-stable&id=c3f7c506e8f122a31b9cc01d234e7fcda46b0eca

On Fri, Nov 14, 2025 at 01:26:11PM +0000, Lorenzo Stoakes wrote:
> It is useful to transition to using a bitmap for VMA flags so we can avoid
> running out of flags, especially for 32-bit kernels which are constrained
> to 32 flags, necessitating some features to be limited to 64-bit kernels
> only.
>
> By doing so, we remove any constraint on the number of VMA flags moving
> forwards no matter the platform and can decide in future to extend beyond
> 64 if required.
>
> We start by declaring an opaque type, vma_flags_t (which resembles
> mm_struct flags of type mm_flags_t), setting it to precisely the same size
> as vm_flags_t, and place it in union with vm_flags in the VMA declaration.
>
> We additionally update struct vm_area_desc equivalently placing the new
> opaque type in union with vm_flags.
>
> This change therefore does not impact the size of struct vm_area_struct or
> struct vm_area_desc.
>
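
The size claim above can be checked in isolation. What follows is a minimal
userland sketch, not the kernel code: BITS_PER_LONG and BITS_TO_LONGS are
redefined locally, vm_flags_t is assumed to be unsigned long, the bitmap
member is named bits rather than __vma_flags, and struct vma_model is a
hypothetical miniature of the union in struct vm_area_struct.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Userland stand-ins for kernel macros (assumptions of this sketch). */
#define BITS_PER_LONG     (sizeof(unsigned long) * CHAR_BIT)
#define BITS_TO_LONGS(n)  (((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)
#define NUM_VMA_FLAG_BITS BITS_PER_LONG

/* Assumed to match the kernel typedef. */
typedef unsigned long vm_flags_t;

/* Bitmap wrapper sized to exactly one system word, as in the patch. */
typedef struct {
	unsigned long bits[BITS_TO_LONGS(NUM_VMA_FLAG_BITS)];
} vma_flags_t;

/* Hypothetical miniature of the flags union in struct vm_area_struct. */
struct vma_model {
	union {
		const vm_flags_t vm_flags;	/* legacy read-only view */
		vma_flags_t flags;		/* new bitmap view */
	};
};

/* Both views occupy the same storage, so the union adds no bytes. */
static_assert(sizeof(vma_flags_t) == sizeof(vm_flags_t),
	      "bitmap must be the same size as the legacy flags word");
static_assert(sizeof(struct vma_model) == sizeof(vm_flags_t),
	      "union must not grow the containing struct");

/* Runtime accessor so the result can also be inspected by a caller. */
static size_t vma_model_size(void)
{
	return sizeof(struct vma_model);
}
```

If NUM_VMA_FLAG_BITS were later raised above BITS_PER_LONG, both assertions
would fail at compile time, which is the point at which the struct sizes
would start to change.
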
> In order for the change to be iterative and to avoid impacting performance,
> we designate VM_xxx declared bitmap flag values as those which must exist
> in the first system word of the VMA flags bitmap.
>
> We therefore declare vma_flags_clear_all(), vma_flags_overwrite_word(),
> vma_flags_overwrite_word_once(), vma_flags_set_word() and
> vma_flags_clear_word() in order to allow us to update the existing
> vm_flags_*() functions to utilise these helpers.
>
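
To illustrate why vm_flags_init() clears the whole bitmap before overwriting
the first word, here is a small userland model of the word helpers. It is a
sketch under stated assumptions: the model_* names and MODEL_WORDS are
hypothetical, the bitmap is deliberately two words wide (the patch uses
exactly one), and the WRITE_ONCE() variant is left out.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical: two words, to make the "first word only" behaviour visible. */
#define MODEL_WORDS 2

typedef struct {
	unsigned long bits[MODEL_WORDS];
} model_flags_t;

/* Zero every word of the bitmap. */
static void model_clear_all(model_flags_t *f)
{
	memset(f->bits, 0, sizeof(f->bits));
}

/* Overwrite the first word only; later words are left untouched. */
static void model_overwrite_word(model_flags_t *f, unsigned long value)
{
	f->bits[0] = value;
}

/* Set the given bits in the first word only. */
static void model_set_word(model_flags_t *f, unsigned long value)
{
	f->bits[0] |= value;
}

/* Clear the given bits in the first word only. */
static void model_clear_word(model_flags_t *f, unsigned long value)
{
	f->bits[0] &= ~value;
}

/* Same pattern as the patched vm_flags_init(): clear all, then overwrite. */
static void model_init(model_flags_t *f, unsigned long value)
{
	model_clear_all(f);
	model_overwrite_word(f, value);
}
```

With a bitmap wider than one word, model_overwrite_word() alone would leave
stale bits in bits[1], whereas model_init() does not. With the single-word
bitmap in this patch the clear is redundant but keeps the helpers correct
once the bitmap grows.
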
> This is a stepping stone towards converting users to the VMA flags bitmap
> and behaves precisely as before.
>
> By doing this, we can eliminate the existing private vma->__vm_flags field
> in the vma->vm_flags union and replace it with a field of the newly
> introduced opaque type vma_flags_t, which we call flags, so we refer to the
> new bitmap field as vma->flags.
>
> We update vma_flag_[test, set]_atomic() to account for the change also.
>
> We additionally update the VMA userland test declarations to implement the
> same changes there.
>
> Finally, we update the rust code to reference vma->vm_flags on update
> rather than vma->__vm_flags which has been removed. This is safe for now,
> albeit it is implicitly performing a const cast.
>
> Once we introduce flag helpers we can improve this more.
>
> No functional change intended.
>
> Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
> ---
>  include/linux/mm.h               |  18 ++--
>  include/linux/mm_types.h         |  64 +++++++++++++-
>  rust/kernel/mm/virt.rs           |   2 +-
>  tools/testing/vma/vma_internal.h | 143 ++++++++++++++++++++++++++-----
>  4 files changed, 196 insertions(+), 31 deletions(-)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index ad000c472bd5..79345c44a350 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -919,7 +919,8 @@ static inline void vma_init(struct vm_area_struct *vma, struct mm_struct *mm)
>  static inline void vm_flags_init(struct vm_area_struct *vma,
>  				 vm_flags_t flags)
>  {
> -	ACCESS_PRIVATE(vma, __vm_flags) = flags;
> +	vma_flags_clear_all(&vma->flags);
> +	vma_flags_overwrite_word(&vma->flags, flags);
>  }
>
>  /*
> @@ -938,21 +939,26 @@ static inline void vm_flags_reset_once(struct vm_area_struct *vma,
>  				       vm_flags_t flags)
>  {
>  	vma_assert_write_locked(vma);
> -	WRITE_ONCE(ACCESS_PRIVATE(vma, __vm_flags), flags);
> +	/*
> +	 * The user should only be interested in avoiding reordering of
> +	 * assignment to the first word.
> +	 */
> +	vma_flags_clear_all(&vma->flags);
> +	vma_flags_overwrite_word_once(&vma->flags, flags);
>  }
>
>  static inline void vm_flags_set(struct vm_area_struct *vma,
>  				vm_flags_t flags)
>  {
>  	vma_start_write(vma);
> -	ACCESS_PRIVATE(vma, __vm_flags) |= flags;
> +	vma_flags_set_word(&vma->flags, flags);
>  }
>
>  static inline void vm_flags_clear(struct vm_area_struct *vma,
>  				  vm_flags_t flags)
>  {
>  	vma_start_write(vma);
> -	ACCESS_PRIVATE(vma, __vm_flags) &= ~flags;
> +	vma_flags_clear_word(&vma->flags, flags);
>  }
>
>  /*
> @@ -995,12 +1001,14 @@ static inline bool __vma_flag_atomic_valid(struct vm_area_struct *vma,
>  static inline void vma_flag_set_atomic(struct vm_area_struct *vma,
>  				       vma_flag_t bit)
>  {
> +	unsigned long *bitmap = ACCESS_PRIVATE(&vma->flags, __vma_flags);
> +
>  	/* mmap read lock/VMA read lock must be held. */
>  	if (!rwsem_is_locked(&vma->vm_mm->mmap_lock))
>  		vma_assert_locked(vma);
>
>  	if (__vma_flag_atomic_valid(vma, bit))
> -		set_bit((__force int)bit, &ACCESS_PRIVATE(vma, __vm_flags));
> +		set_bit((__force int)bit, bitmap);
>  }
>
>  /*
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 3550672e0f9e..b71625378ce3 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -848,6 +848,15 @@ struct mmap_action {
>  	bool hide_from_rmap_until_complete :1;
>  };
>
> +/*
> + * Opaque type representing current VMA (vm_area_struct) flag state. Must be
> + * accessed via vma_flags_xxx() helper functions.
> + */
> +#define NUM_VMA_FLAG_BITS BITS_PER_LONG
> +typedef struct {
> +	DECLARE_BITMAP(__vma_flags, NUM_VMA_FLAG_BITS);
> +} __private vma_flags_t;
> +
>  /*
>   * Describes a VMA that is about to be mmap()'ed. Drivers may choose to
>   * manipulate mutable fields which will cause those fields to be updated in the
> @@ -865,7 +874,10 @@ struct vm_area_desc {
>  	/* Mutable fields. Populated with initial state. */
>  	pgoff_t pgoff;
>  	struct file *vm_file;
> -	vm_flags_t vm_flags;
> +	union {
> +		vm_flags_t vm_flags;
> +		vma_flags_t vma_flags;
> +	};
>  	pgprot_t page_prot;
>
>  	/* Write-only fields. */
> @@ -910,10 +922,12 @@ struct vm_area_struct {
>  	/*
>  	 * Flags, see mm.h.
>  	 * To modify use vm_flags_{init|reset|set|clear|mod} functions.
> +	 * Preferably, use vma_flags_xxx() functions.
>  	 */
>  	union {
> +		/* Temporary while VMA flags are being converted. */
>  		const vm_flags_t vm_flags;
> -		vm_flags_t __private __vm_flags;
> +		vma_flags_t flags;
>  	};
>
>  #ifdef CONFIG_PER_VMA_LOCK
> @@ -994,6 +1008,52 @@ struct vm_area_struct {
>  #endif
>  } __randomize_layout;
>
> +/* Clears all bits in the VMA flags bitmap, non-atomically. */
> +static inline void vma_flags_clear_all(vma_flags_t *flags)
> +{
> +	bitmap_zero(ACCESS_PRIVATE(flags, __vma_flags), NUM_VMA_FLAG_BITS);
> +}
> +
> +/*
> + * Copy value to the first system word of VMA flags, non-atomically.
> + *
> + * IMPORTANT: This does not overwrite bytes past the first system word. The
> + * caller must account for this.
> + */
> +static inline void vma_flags_overwrite_word(vma_flags_t *flags, unsigned long value)
> +{
> +	*ACCESS_PRIVATE(flags, __vma_flags) = value;
> +}
> +
> +/*
> + * Copy value to the first system word of VMA flags ONCE, non-atomically.
> + *
> + * IMPORTANT: This does not overwrite bytes past the first system word. The
> + * caller must account for this.
> + */
> +static inline void vma_flags_overwrite_word_once(vma_flags_t *flags, unsigned long value)
> +{
> +	unsigned long *bitmap = ACCESS_PRIVATE(flags, __vma_flags);
> +
> +	WRITE_ONCE(*bitmap, value);
> +}
> +
> +/* Update the first system word of VMA flags setting bits, non-atomically. */
> +static inline void vma_flags_set_word(vma_flags_t *flags, unsigned long value)
> +{
> +	unsigned long *bitmap = ACCESS_PRIVATE(flags, __vma_flags);
> +
> +	*bitmap |= value;
> +}
> +
> +/* Update the first system word of VMA flags clearing bits, non-atomically. */
> +static inline void vma_flags_clear_word(vma_flags_t *flags, unsigned long value)
> +{
> +	unsigned long *bitmap = ACCESS_PRIVATE(flags, __vma_flags);
> +
> +	*bitmap &= ~value;
> +}
> +
>  #ifdef CONFIG_NUMA
>  #define vma_policy(vma) ((vma)->vm_policy)
>  #else
> diff --git a/rust/kernel/mm/virt.rs b/rust/kernel/mm/virt.rs
> index a1bfa4e19293..da21d65ccd20 100644
> --- a/rust/kernel/mm/virt.rs
> +++ b/rust/kernel/mm/virt.rs
> @@ -250,7 +250,7 @@ unsafe fn update_flags(&self, set: vm_flags_t, unset: vm_flags_t) {
>          // SAFETY: This is not a data race: the vma is undergoing initial setup, so it's not yet
>          // shared. Additionally, `VmaNew` is `!Sync`, so it cannot be used to write in parallel.
>          // The caller promises that this does not set the flags to an invalid value.
> -        unsafe { (*self.as_ptr()).__bindgen_anon_2.__vm_flags = flags };
> +        unsafe { (*self.as_ptr()).__bindgen_anon_2.vm_flags = flags };
>      }
>
>      /// Set the `VM_MIXEDMAP` flag on this vma.
> diff --git a/tools/testing/vma/vma_internal.h b/tools/testing/vma/vma_internal.h
> index 18659214e262..13ee825bdfcf 100644
> --- a/tools/testing/vma/vma_internal.h
> +++ b/tools/testing/vma/vma_internal.h
> @@ -528,6 +528,15 @@ typedef struct {
>  	__private DECLARE_BITMAP(__mm_flags, NUM_MM_FLAG_BITS);
>  } mm_flags_t;
>
> +/*
> + * Opaque type representing current VMA (vm_area_struct) flag state. Must be
> + * accessed via vma_flags_xxx() helper functions.
> + */
> +#define NUM_VMA_FLAG_BITS BITS_PER_LONG
> +typedef struct {
> +	DECLARE_BITMAP(__vma_flags, NUM_VMA_FLAG_BITS);
> +} __private vma_flags_t;
> +
>  struct mm_struct {
>  	struct maple_tree mm_mt;
>  	int map_count;			/* number of VMAs */
> @@ -612,7 +621,10 @@ struct vm_area_desc {
>  	/* Mutable fields. Populated with initial state. */
>  	pgoff_t pgoff;
>  	struct file *vm_file;
> -	vm_flags_t vm_flags;
> +	union {
> +		vm_flags_t vm_flags;
> +		vma_flags_t vma_flags;
> +	};
>  	pgprot_t page_prot;
>
>  	/* Write-only fields. */
> @@ -658,7 +670,7 @@ struct vm_area_struct {
>  	 */
>  	union {
>  		const vm_flags_t vm_flags;
> -		vm_flags_t __private __vm_flags;
> +		vma_flags_t flags;
>  	};
>
>  #ifdef CONFIG_PER_VMA_LOCK
> @@ -1372,26 +1384,6 @@ static inline bool may_expand_vm(struct mm_struct *mm, vm_flags_t flags,
>  	return true;
>  }
>
> -static inline void vm_flags_init(struct vm_area_struct *vma,
> -				 vm_flags_t flags)
> -{
> -	vma->__vm_flags = flags;
> -}
> -
> -static inline void vm_flags_set(struct vm_area_struct *vma,
> -				vm_flags_t flags)
> -{
> -	vma_start_write(vma);
> -	vma->__vm_flags |= flags;
> -}
> -
> -static inline void vm_flags_clear(struct vm_area_struct *vma,
> -				  vm_flags_t flags)
> -{
> -	vma_start_write(vma);
> -	vma->__vm_flags &= ~flags;
> -}
> -
>  static inline int shmem_zero_setup(struct vm_area_struct *vma)
>  {
>  	return 0;
> @@ -1548,13 +1540,118 @@ static inline void userfaultfd_unmap_complete(struct mm_struct *mm,
>  {
>  }
>
> -# define ACCESS_PRIVATE(p, member) ((p)->member)
> +#define ACCESS_PRIVATE(p, member) ((p)->member)
> +
> +#define bitmap_size(nbits)	(ALIGN(nbits, BITS_PER_LONG) / BITS_PER_BYTE)
> +
> +static __always_inline void bitmap_zero(unsigned long *dst, unsigned int nbits)
> +{
> +	unsigned int len = bitmap_size(nbits);
> +
> +	if (small_const_nbits(nbits))
> +		*dst = 0;
> +	else
> +		memset(dst, 0, len);
> +}
>
>  static inline bool mm_flags_test(int flag, const struct mm_struct *mm)
>  {
>  	return test_bit(flag, ACCESS_PRIVATE(&mm->flags, __mm_flags));
>  }
>
> +/* Clears all bits in the VMA flags bitmap, non-atomically. */
> +static inline void vma_flags_clear_all(vma_flags_t *flags)
> +{
> +	bitmap_zero(ACCESS_PRIVATE(flags, __vma_flags), NUM_VMA_FLAG_BITS);
> +}
> +
> +/*
> + * Copy value to the first system word of VMA flags, non-atomically.
> + *
> + * IMPORTANT: This does not overwrite bytes past the first system word. The
> + * caller must account for this.
> + */
> +static inline void vma_flags_overwrite_word(vma_flags_t *flags, unsigned long value)
> +{
> +	*ACCESS_PRIVATE(flags, __vma_flags) = value;
> +}
> +
> +/*
> + * Copy value to the first system word of VMA flags ONCE, non-atomically.
> + *
> + * IMPORTANT: This does not overwrite bytes past the first system word. The
> + * caller must account for this.
> + */
> +static inline void vma_flags_overwrite_word_once(vma_flags_t *flags, unsigned long value)
> +{
> +	unsigned long *bitmap = ACCESS_PRIVATE(flags, __vma_flags);
> +
> +	WRITE_ONCE(*bitmap, value);
> +}
> +
> +/* Update the first system word of VMA flags setting bits, non-atomically. */
> +static inline void vma_flags_set_word(vma_flags_t *flags, unsigned long value)
> +{
> +	unsigned long *bitmap = ACCESS_PRIVATE(flags, __vma_flags);
> +
> +	*bitmap |= value;
> +}
> +
> +/* Update the first system word of VMA flags clearing bits, non-atomically. */
> +static inline void vma_flags_clear_word(vma_flags_t *flags, unsigned long value)
> +{
> +	unsigned long *bitmap = ACCESS_PRIVATE(flags, __vma_flags);
> +
> +	*bitmap &= ~value;
> +}
> +
> +
> +/* Use when VMA is not part of the VMA tree and needs no locking */
> +static inline void vm_flags_init(struct vm_area_struct *vma,
> +				 vm_flags_t flags)
> +{
> +	vma_flags_clear_all(&vma->flags);
> +	vma_flags_overwrite_word(&vma->flags, flags);
> +}
> +
> +/*
> + * Use when VMA is part of the VMA tree and modifications need coordination
> + * Note: vm_flags_reset and vm_flags_reset_once do not lock the vma and
> + * it should be locked explicitly beforehand.
> + */
> +static inline void vm_flags_reset(struct vm_area_struct *vma,
> +				  vm_flags_t flags)
> +{
> +	vma_assert_write_locked(vma);
> +	vm_flags_init(vma, flags);
> +}
> +
> +static inline void vm_flags_reset_once(struct vm_area_struct *vma,
> +				       vm_flags_t flags)
> +{
> +	vma_assert_write_locked(vma);
> +	/*
> +	 * The user should only be interested in avoiding reordering of
> +	 * assignment to the first word.
> +	 */
> +	vma_flags_clear_all(&vma->flags);
> +	vma_flags_overwrite_word_once(&vma->flags, flags);
> +}
> +
> +static inline void vm_flags_set(struct vm_area_struct *vma,
> +				vm_flags_t flags)
> +{
> +	vma_start_write(vma);
> +	vma_flags_set_word(&vma->flags, flags);
> +}
> +
> +static inline void vm_flags_clear(struct vm_area_struct *vma,
> +				  vm_flags_t flags)
> +{
> +	vma_start_write(vma);
> +	vma_flags_clear_word(&vma->flags, flags);
> +}
> +
>  /*
>   * Denies creating a writable executable mapping or gaining executable permissions.
>   *
> --
> 2.51.0
>
