[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <srdmj5c2yb2qshyp2nkujk3tbcilkoaryzevchkv4rzeps63kh@rw3kkagts3o4>
Date: Thu, 17 Oct 2024 13:41:15 -0400
From: "Liam R. Howlett" <Liam.Howlett@...cle.com>
To: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Vlastimil Babka <vbabka@...e.cz>, Jann Horn <jannh@...gle.com>,
linux-mm@...ck.org, linux-kernel@...r.kernel.org,
Oliver Sang <oliver.sang@...el.com>
Subject: Re: [PATCH hotfix 6.12 1/2] mm/vma: add expand-only VMA merge mode
and optimise do_brk_flags()
* Lorenzo Stoakes <lorenzo.stoakes@...cle.com> [241017 10:31]:
> We know in advance that do_brk_flags() wants only to perform a VMA
> expansion (if the prior VMA is compatible), and that we assume no mergeable
> VMA follows it.
>
> These are the semantics of this function prior to the recent rewrite of the
> VMA merging logic, however we are now doing more work than necessary -
> positioning the VMA iterator at the prior VMA and performing tasks that are
> not required.
>
> Add a new field to the vmg struct to permit merge flags and add a new merge
> flag VMG_FLAG_JUST_EXPAND which implies this behaviour, and have
> do_brk_flags() use this.
Funny, I was thinking we could do this for relocate_vma_down() as well,
bu that's expanding in the wrong direction so we'd have to add
VMG_FLAG_JUST_EXPAND_DOWN, or the like. I'm not sure it's worth doing
since the expand down happens a lot less often.
>
> This fixes a reported performance regression in a brk() benchmarking suite.
>
> Reported-by: kernel test robot <oliver.sang@...el.com>
> Closes: https://lore.kernel.org/linux-mm/202409301043.629bea78-oliver.sang@intel.com
> Fixes: cacded5e42b9 ("mm: avoid using vma_merge() for new VMAs")
> Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@...cle.com>
> ---
> mm/mmap.c | 3 ++-
> mm/vma.c | 23 +++++++++++++++--------
> mm/vma.h | 14 ++++++++++++++
> 3 files changed, 31 insertions(+), 9 deletions(-)
>
> diff --git a/mm/mmap.c b/mm/mmap.c
> index dd4b35a25aeb..4a13d9f138f6 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -1744,7 +1744,8 @@ static int do_brk_flags(struct vma_iterator *vmi, struct vm_area_struct *vma,
> VMG_STATE(vmg, mm, vmi, addr, addr + len, flags, PHYS_PFN(addr));
>
> vmg.prev = vma;
> - vma_iter_next_range(vmi);
> + /* vmi is positioned at prev, which this mode expects. */
> + vmg.merge_flags = VMG_FLAG_JUST_EXPAND;
>
> if (vma_merge_new_range(&vmg))
> goto out;
> diff --git a/mm/vma.c b/mm/vma.c
> index 4737afcb064c..b21ffec33f8e 100644
> --- a/mm/vma.c
> +++ b/mm/vma.c
> @@ -917,6 +917,7 @@ struct vm_area_struct *vma_merge_new_range(struct vma_merge_struct *vmg)
> pgoff_t pgoff = vmg->pgoff;
> pgoff_t pglen = PHYS_PFN(end - start);
> bool can_merge_left, can_merge_right;
> + bool just_expand = vmg->merge_flags & VMG_FLAG_JUST_EXPAND;
>
> mmap_assert_write_locked(vmg->mm);
Could we detect the wrong location by ensuring that the vma iterator is
positioned at prev?
VM_WARN_ON(just_expand && prev && prev->vm_end != vma_iter_end(vmg->vmi);
or, since vma_iter_addr is used above already..
VM_WARN_ON(just_expand && prev && prev->start != vma_iter_addr(vmg->vmi);
Does it make sense to just expand without a prev? Should that be
checked separately?
Anyways, I think it's safer to keep these checks out of the regression
fix, but maybe better to add them later?
> VM_WARN_ON(vmg->vma);
> @@ -930,7 +931,7 @@ struct vm_area_struct *vma_merge_new_range(struct vma_merge_struct *vmg)
> return NULL;
>
> can_merge_left = can_vma_merge_left(vmg);
> - can_merge_right = can_vma_merge_right(vmg, can_merge_left);
> + can_merge_right = !just_expand && can_vma_merge_right(vmg, can_merge_left);
>
> /* If we can merge with the next VMA, adjust vmg accordingly. */
> if (can_merge_right) {
> @@ -953,7 +954,11 @@ struct vm_area_struct *vma_merge_new_range(struct vma_merge_struct *vmg)
> if (can_merge_right && !can_merge_remove_vma(next))
> vmg->end = end;
>
> - vma_prev(vmg->vmi); /* Equivalent to going to the previous range */
> + /* In expand-only case we are already positioned at prev. */
> + if (!just_expand) {
> + /* Equivalent to going to the previous range. */
> + vma_prev(vmg->vmi);
> + }
> }
>
> /*
> @@ -967,12 +972,14 @@ struct vm_area_struct *vma_merge_new_range(struct vma_merge_struct *vmg)
> }
>
> /* If expansion failed, reset state. Allows us to retry merge later. */
> - vmg->vma = NULL;
> - vmg->start = start;
> - vmg->end = end;
> - vmg->pgoff = pgoff;
> - if (vmg->vma == prev)
> - vma_iter_set(vmg->vmi, start);
> + if (!just_expand) {
> + vmg->vma = NULL;
> + vmg->start = start;
> + vmg->end = end;
> + vmg->pgoff = pgoff;
> + if (vmg->vma == prev)
> + vma_iter_set(vmg->vmi, start);
> + }
>
> return NULL;
> }
> diff --git a/mm/vma.h b/mm/vma.h
> index 819f994cf727..8c6ecc0dfbf6 100644
> --- a/mm/vma.h
> +++ b/mm/vma.h
> @@ -59,6 +59,17 @@ enum vma_merge_state {
> VMA_MERGE_SUCCESS,
> };
>
> +enum vma_merge_flags {
> + VMG_FLAG_DEFAULT = 0,
> + /*
> + * If we can expand, simply do so. We know there is nothing to merge to
> + * the right. Does not reset state upon failure to merge. The VMA
> + * iterator is assumed to be positioned at the previous VMA, rather than
> + * at the gap.
> + */
> + VMG_FLAG_JUST_EXPAND = 1 << 0,
> +};
> +
> /* Represents a VMA merge operation. */
> struct vma_merge_struct {
> struct mm_struct *mm;
> @@ -75,6 +86,7 @@ struct vma_merge_struct {
> struct mempolicy *policy;
> struct vm_userfaultfd_ctx uffd_ctx;
> struct anon_vma_name *anon_name;
> + enum vma_merge_flags merge_flags;
> enum vma_merge_state state;
> };
>
> @@ -99,6 +111,7 @@ static inline pgoff_t vma_pgoff_offset(struct vm_area_struct *vma,
> .flags = flags_, \
> .pgoff = pgoff_, \
> .state = VMA_MERGE_START, \
> + .merge_flags = VMG_FLAG_DEFAULT, \
> }
>
> #define VMG_VMA_STATE(name, vmi_, prev_, vma_, start_, end_) \
> @@ -118,6 +131,7 @@ static inline pgoff_t vma_pgoff_offset(struct vm_area_struct *vma,
> .uffd_ctx = vma_->vm_userfaultfd_ctx, \
> .anon_name = anon_vma_name(vma_), \
> .state = VMA_MERGE_START, \
> + .merge_flags = VMG_FLAG_DEFAULT, \
> }
>
> #ifdef CONFIG_DEBUG_VM_MAPLE_TREE
> --
> 2.46.2
Powered by blists - more mailing lists