Message-ID: <CAJuCfpHOJqrUw_rH6=-ykWGJvpGw-4_YyPgvTfhf2_2vpthVvQ@mail.gmail.com>
Date: Wed, 14 Jan 2026 09:26:52 -0800
From: Suren Baghdasaryan <surenb@...gle.com>
To: "Liam R. Howlett" <Liam.Howlett@...cle.com>, Lorenzo Stoakes <lorenzo.stoakes@...cle.com>, 
	Andrew Morton <akpm@...ux-foundation.org>, Suren Baghdasaryan <surenb@...gle.com>, 
	Vlastimil Babka <vbabka@...e.cz>, Shakeel Butt <shakeel.butt@...ux.dev>, 
	David Hildenbrand <david@...nel.org>, Rik van Riel <riel@...riel.com>, Harry Yoo <harry.yoo@...cle.com>, 
	Jann Horn <jannh@...gle.com>, Mike Rapoport <rppt@...nel.org>, Michal Hocko <mhocko@...e.com>, 
	Pedro Falcato <pfalcato@...e.de>, Chris Li <chriscli@...gle.com>, 
	Barry Song <v-songbaohua@...o.com>, linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 8/8] mm/rmap: separate out fork-only logic on anon_vma_clone()

On Tue, Jan 6, 2026 at 11:27 AM Liam R. Howlett <Liam.Howlett@...cle.com> wrote:
>
> * Lorenzo Stoakes <lorenzo.stoakes@...cle.com> [260106 10:05]:
> > Specify which operation is being performed to anon_vma_clone(), which
> > allows us to do checks specific to each operation type, as well as to
> > separate out and make clear that the anon_vma reuse logic is absolutely
> > specific to fork only.
> >
> > This opens the door to further refactorings and refinements later as we
> > have more information to work with.
> >
> > Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
>
> A few minor things, but this looks correct.
>
> Reviewed-by: Liam R. Howlett <Liam.Howlett@...cle.com>

Reviewed-by: Suren Baghdasaryan <surenb@...gle.com>

>
> > ---
> >  mm/internal.h                    | 11 ++++-
> >  mm/rmap.c                        | 74 ++++++++++++++++++++++----------
> >  mm/vma.c                         |  6 +--
> >  tools/testing/vma/vma_internal.h | 11 ++++-
> >  4 files changed, 74 insertions(+), 28 deletions(-)
> >
> > diff --git a/mm/internal.h b/mm/internal.h
> > index 4ba784023a9f..8baa7bd2b8f7 100644
> > --- a/mm/internal.h
> > +++ b/mm/internal.h
> > @@ -244,7 +244,16 @@ static inline void anon_vma_unlock_read(struct anon_vma *anon_vma)
> >
> >  struct anon_vma *folio_get_anon_vma(const struct folio *folio);
> >
> > -int anon_vma_clone(struct vm_area_struct *dst, struct vm_area_struct *src);
> > +/* Operations which modify VMAs. */
> > +enum vma_operation {
> > +     VMA_OP_SPLIT,
> > +     VMA_OP_MERGE_UNFAULTED,
> > +     VMA_OP_REMAP,
> > +     VMA_OP_FORK,
> > +};
> > +
> > +int anon_vma_clone(struct vm_area_struct *dst, struct vm_area_struct *src,
> > +     enum vma_operation operation);
> >  int anon_vma_fork(struct vm_area_struct *vma, struct vm_area_struct *pvma);
> >  int  __anon_vma_prepare(struct vm_area_struct *vma);
> >  void unlink_anon_vmas(struct vm_area_struct *vma);
> > diff --git a/mm/rmap.c b/mm/rmap.c
> > index 8f4393546bce..336b27e00238 100644
> > --- a/mm/rmap.c
> > +++ b/mm/rmap.c
> > @@ -233,12 +233,13 @@ int __anon_vma_prepare(struct vm_area_struct *vma)
> >  }
> >
> >  static void check_anon_vma_clone(struct vm_area_struct *dst,
> > -                              struct vm_area_struct *src)
> > +                              struct vm_area_struct *src,
> > +                              enum vma_operation operation)
>
> You could save a line here by putting src and operation on the same line
> and tabbing only twice, but sure.  This is true in earlier patches as
> well.
>
> >  {
> >       /* The write lock must be held. */
> >       mmap_assert_write_locked(src->vm_mm);
> > -     /* If not a fork (implied by dst->anon_vma) then must be on same mm. */
> > -     VM_WARN_ON_ONCE(dst->anon_vma && dst->vm_mm != src->vm_mm);
> > +     /* If not a fork then must be on same mm. */
> > +     VM_WARN_ON_ONCE(operation != VMA_OP_FORK && dst->vm_mm != src->vm_mm);
> >
> >       /* If we have anything to do src->anon_vma must be provided. */
> >       VM_WARN_ON_ONCE(!src->anon_vma && !list_empty(&src->anon_vma_chain));
> > @@ -250,6 +251,40 @@ static void check_anon_vma_clone(struct vm_area_struct *dst,
> >        * must be the same across dst and src.
> >        */
> >       VM_WARN_ON_ONCE(dst->anon_vma && dst->anon_vma != src->anon_vma);
> > +     /*
> > +      * Essentially equivalent to above - if not a no-op, we should expect
> > +      * dst->anon_vma to be set for everything except a fork.
> > +      */
> > +     VM_WARN_ON_ONCE(operation != VMA_OP_FORK && src->anon_vma &&
> > +                     !dst->anon_vma);
> > +     /* For the anon_vma to be compatible, it can only be singular. */
> > +     VM_WARN_ON_ONCE(operation == VMA_OP_MERGE_UNFAULTED &&
> > +                     !list_is_singular(&src->anon_vma_chain));
> > +#ifdef CONFIG_PER_VMA_LOCK
> > +     /* Only merging an unfaulted VMA leaves the destination attached. */
> > +     VM_WARN_ON_ONCE(operation != VMA_OP_MERGE_UNFAULTED &&
> > +                     vma_is_attached(dst));
> > +#endif
> > +}
> > +
>
> The "try" prefix seems to imply that we can return something saying it
> didn't work out, but this returns void.  Naming is hard.
> reuse_anon_vma_if_necessary seems insane, so I don't really have
> anything better.
>
> > +static void try_to_reuse_anon_vma(struct vm_area_struct *dst,
> > +                               struct anon_vma *anon_vma)
> > +{
> > +     /* If already populated, nothing to do.*/
> > +     if (dst->anon_vma)
> > +             return;
>
> This is only used on VMA_OP_FORK, so how can it already be populated?
>
> I assume this is a later refinement?
>
> > +
> > +     /*
> > +      * We reuse an anon_vma if any linking VMAs were unmapped and it has
> > +      * only a single child at most.
> > +      */
> > +     if (anon_vma->num_active_vmas > 0)
> > +             return;
> > +     if (anon_vma->num_children > 1)
> > +             return;
> > +
> > +     dst->anon_vma = anon_vma;
> > +     anon_vma->num_active_vmas++;
> >  }
> >
> >  static void cleanup_partial_anon_vmas(struct vm_area_struct *vma);
> > @@ -259,6 +294,7 @@ static void cleanup_partial_anon_vmas(struct vm_area_struct *vma);
> >   * all of the anon_vma objects contained within @src anon_vma_chain's.
> >   * @dst: The destination VMA with an empty anon_vma_chain.
> >   * @src: The source VMA we wish to duplicate.
> > + * @operation: The type of operation which resulted in the clone.
> >   *
> >   * This is the heart of the VMA side of the anon_vma implementation - we invoke
> >   * this function whenever we need to set up a new VMA's anon_vma state.
> > @@ -281,17 +317,17 @@ static void cleanup_partial_anon_vmas(struct vm_area_struct *vma);
> >   *
> >   * Returns: 0 on success, -ENOMEM on failure.
> >   */
> > -int anon_vma_clone(struct vm_area_struct *dst, struct vm_area_struct *src)
> > +int anon_vma_clone(struct vm_area_struct *dst, struct vm_area_struct *src,
> > +                enum vma_operation operation)
> >  {
> >       struct anon_vma_chain *avc, *pavc;
> > +     struct anon_vma *active_anon_vma = src->anon_vma;
> >
> > -     check_anon_vma_clone(dst, src);
> > +     check_anon_vma_clone(dst, src, operation);
> >
> > -     if (!src->anon_vma)
> > +     if (!active_anon_vma)
> >               return 0;
> >
> > -     check_anon_vma_clone(dst, src);
> > -
> >       /*
> >        * Allocate AVCs. We don't need an anon_vma lock for this as we
> >        * are not updating the anon_vma rbtree nor are we changing
> > @@ -317,22 +353,14 @@ int anon_vma_clone(struct vm_area_struct *dst, struct vm_area_struct *src)
> >               struct anon_vma *anon_vma = avc->anon_vma;
> >
> >               anon_vma_interval_tree_insert(avc, &anon_vma->rb_root);
> > -
> > -             /*
> > -              * Reuse existing anon_vma if it has no vma and only one
> > -              * anon_vma child.
> > -              *
> > -              * Root anon_vma is never reused:
> > -              * it has self-parent reference and at least one child.
> > -              */
> > -             if (!dst->anon_vma && src->anon_vma &&
> > -                 anon_vma->num_children < 2 &&
> > -                 anon_vma->num_active_vmas == 0)
> > -                     dst->anon_vma = anon_vma;
> > +             if (operation == VMA_OP_FORK)
> > +                     try_to_reuse_anon_vma(dst, anon_vma);
> >       }
> > -     if (dst->anon_vma)
> > +
> > +     if (operation != VMA_OP_FORK)
> >               dst->anon_vma->num_active_vmas++;
> > -     anon_vma_unlock_write(src->anon_vma);
> > +
> > +     anon_vma_unlock_write(active_anon_vma);
> >       return 0;
> >
> >   enomem_failure:
> > @@ -362,7 +390,7 @@ int anon_vma_fork(struct vm_area_struct *vma, struct vm_area_struct *pvma)
> >        * First, attach the new VMA to the parent VMA's anon_vmas,
> >        * so rmap can find non-COWed pages in child processes.
> >        */
> > -     error = anon_vma_clone(vma, pvma);
> > +     error = anon_vma_clone(vma, pvma, VMA_OP_FORK);
> >       if (error)
> >               return error;
> >
> > diff --git a/mm/vma.c b/mm/vma.c
> > index 4294ecdc23a5..2a063d6568d9 100644
> > --- a/mm/vma.c
> > +++ b/mm/vma.c
> > @@ -528,7 +528,7 @@ __split_vma(struct vma_iterator *vmi, struct vm_area_struct *vma,
> >       if (err)
> >               goto out_free_vmi;
> >
> > -     err = anon_vma_clone(new, vma);
> > +     err = anon_vma_clone(new, vma, VMA_OP_SPLIT);
> >       if (err)
> >               goto out_free_mpol;
> >
> > @@ -626,7 +626,7 @@ static int dup_anon_vma(struct vm_area_struct *dst,
> >
> >               vma_assert_write_locked(dst);
> >               dst->anon_vma = src->anon_vma;
> > -             ret = anon_vma_clone(dst, src);
> > +             ret = anon_vma_clone(dst, src, VMA_OP_MERGE_UNFAULTED);
> >               if (ret)
> >                       return ret;
> >
> > @@ -1899,7 +1899,7 @@ struct vm_area_struct *copy_vma(struct vm_area_struct **vmap,
> >               vma_set_range(new_vma, addr, addr + len, pgoff);
> >               if (vma_dup_policy(vma, new_vma))
> >                       goto out_free_vma;
> > -             if (anon_vma_clone(new_vma, vma))
> > +             if (anon_vma_clone(new_vma, vma, VMA_OP_REMAP))
> >                       goto out_free_mempol;
> >               if (new_vma->vm_file)
> >                       get_file(new_vma->vm_file);
> > diff --git a/tools/testing/vma/vma_internal.h b/tools/testing/vma/vma_internal.h
> > index 93e5792306d9..7fa56dcc53a6 100644
> > --- a/tools/testing/vma/vma_internal.h
> > +++ b/tools/testing/vma/vma_internal.h
> > @@ -600,6 +600,14 @@ struct mmap_action {
> >       bool hide_from_rmap_until_complete :1;
> >  };
> >
> > +/* Operations which modify VMAs. */
> > +enum vma_operation {
> > +     VMA_OP_SPLIT,
> > +     VMA_OP_MERGE_UNFAULTED,
> > +     VMA_OP_REMAP,
> > +     VMA_OP_FORK,
> > +};
> > +
> >  /*
> >   * Describes a VMA that is about to be mmap()'ed. Drivers may choose to
> >   * manipulate mutable fields which will cause those fields to be updated in the
> > @@ -1157,7 +1165,8 @@ static inline int vma_dup_policy(struct vm_area_struct *src, struct vm_area_stru
> >       return 0;
> >  }
> >
> > -static inline int anon_vma_clone(struct vm_area_struct *dst, struct vm_area_struct *src)
> > +static inline int anon_vma_clone(struct vm_area_struct *dst, struct vm_area_struct *src,
> > +                              enum vma_operation operation)
> >  {
> >       /* For testing purposes. We indicate that an anon_vma has been cloned. */
> >       if (src->anon_vma != NULL) {
> > --
> > 2.52.0
> >
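
For anyone following the thread, here is a minimal user-space sketch of how
the fork-only reuse path in the quoted patch appears to behave. It is an
illustration only, not kernel code. The struct definitions are cut-down
stand-ins, clone_tail() is a hypothetical name, and the AVC list is modelled
as a plain array. It also assumes that the fork path enters with
dst->anon_vma unset, which the quoted hunks do not show. Under those
assumptions, the early return in try_to_reuse_anon_vma() is what stops later
loop iterations from overwriting an anon_vma picked on an earlier one.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors the enum added to mm/internal.h by the quoted patch. */
enum vma_operation {
	VMA_OP_SPLIT,
	VMA_OP_MERGE_UNFAULTED,
	VMA_OP_REMAP,
	VMA_OP_FORK,
};

/* Hypothetical cut-down stand-ins for the kernel structures. */
struct anon_vma {
	unsigned long num_children;
	unsigned long num_active_vmas;
};

struct vm_area_struct {
	struct anon_vma *anon_vma;
};

/* Same decision as the quoted helper: reuse only an idle anon_vma
 * that has at most one child. */
static void try_to_reuse_anon_vma(struct vm_area_struct *dst,
				  struct anon_vma *anon_vma)
{
	/* Already chosen on an earlier iteration, nothing to do. */
	if (dst->anon_vma)
		return;
	if (anon_vma->num_active_vmas > 0)
		return;
	if (anon_vma->num_children > 1)
		return;

	dst->anon_vma = anon_vma;
	anon_vma->num_active_vmas++;
}

/* Hypothetical model of the tail of anon_vma_clone(): the chain is an
 * array here rather than a list of AVCs, and no locking is modelled. */
static void clone_tail(struct vm_area_struct *dst, struct anon_vma **chain,
		       size_t n, enum vma_operation operation)
{
	for (size_t i = 0; i < n; i++) {
		if (operation == VMA_OP_FORK)
			try_to_reuse_anon_vma(dst, chain[i]);
	}

	if (operation != VMA_OP_FORK) {
		/* Stand-in for the VM_WARN_ON_ONCE() check: non-fork
		 * callers are expected to have set dst->anon_vma. */
		assert(dst->anon_vma);
		dst->anon_vma->num_active_vmas++;
	}
}
```

In this model a fork either reuses the first eligible anon_vma in the chain
or leaves dst->anon_vma unset, while split, merge and remap only bump the
count on the anon_vma the caller already supplied.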
