[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <7ea1f225-0fd3-4d8e-8618-a275e0797345@lucifer.local>
Date: Wed, 23 Oct 2024 19:20:44 +0100
From: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
To: "Liam R. Howlett" <Liam.Howlett@...cle.com>,
Vlastimil Babka <vbabka@...e.cz>,
Andrew Morton <akpm@...ux-foundation.org>,
Jann Horn <jannh@...gle.com>, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, Linus Torvalds <torvalds@...ux-foundation.org>,
Peter Xu <peterx@...hat.com>
Subject: Re: [PATCH hotfix 6.12 8/8] mm: do not attempt second merge for
file-backed VMAs
On Wed, Oct 23, 2024 at 02:16:19PM -0400, Liam R. Howlett wrote:
> * Lorenzo Stoakes <lorenzo.stoakes@...cle.com> [241023 11:16]:
> > On Wed, Oct 23, 2024 at 05:01:29PM +0200, Vlastimil Babka wrote:
> > > On 10/22/24 22:40, Lorenzo Stoakes wrote:
> > > > Previously, we'd always try to merge a file-backed VMA if its flags were
> > > > changed by the driver.
> > > >
> > > > This however is rarely meaningful as typically the flags would be changed
> > > > to VM_PFNMAP or other VM_SPECIAL flags which are inherently unmergable.
> > > >
> > > > In cases where it is meaningful (for instance DAX) it is doubtful that this
> > >
> > > Hm if that's true, I'm imagining many piecemeal mmap()s of DAX that used to
> > > merge but now will create tons of VMA's, which doesn't sound great. Then it
> > > has also potentially breaking effects on mremap() which doesn't work accross
> > > multiple VMA's.
> >
> > I said this repeatedly to you and Liam but you both still seemed to want
> > this :))
> >
> > Anyway, yes. I mean you're pretty crazy if you are mapping a bunch of
> > adjacent DAX ranges that are all otherwise mergeable next to one another,
> > then on that basis assuming that you can mremap() the whole thing.
>
> Today, this will only work if there is a previous vma, otherwise we skip
> the attempt to merge. It's probably a safe bet that there is a previous
> vma, but it still means relying on this merging is broken today.
True, but I also have a cunning plan, based on what Vlasta was getting at
with another idea that was possibly not quite as cunning that means we
might be able to still do this but in a much less terrible way...
>
> >
> > >
> > > > optimisation is worth the effort and maintenance risk of having to unwind
> > > > state and perform a merge.
> > >
> > > What if we simply created a new vma but only then checked if the flags
> > > changed and we can merge it with neighbours (i.e. like the mprotect()
> > > merging case). Less efficient, but less tricky and with the same result
> > > hopefully?
> >
> > I'd probably rather just drop this idea rather than wade into something
> > entirely new, but let me look at whether we can leverage
> > vma_modify_flags().
> >
> > I have a feeling we can't because we already explicitly reset state in the
> > merge new VMA case, and we'd be introducing a new way in which state could
> > get mangled.
> >
> > But I'll take a look and see, otherwise we can just drop this for now and
> > potentially come back to it later, the key bit of the non-backport patches
> > are 5-7 anyway.
> >
> > >
> > > > Since we've observed bugs and resource leaks due to complexity in this
> > > > area, it is simply not acceptable to have a 'nice to have' optimisation
> > > > like this complicating an already very complicated code path, so let's
> > > > simply eliminate it.
> > > >
> > > > Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
> > > > ---
> > > > mm/vma.c | 39 ++-------------------------------------
> > > > 1 file changed, 2 insertions(+), 37 deletions(-)
> > > >
> > > > diff --git a/mm/vma.c b/mm/vma.c
> > > > index a271e2b406ab..fe1fe5099e78 100644
> > > > --- a/mm/vma.c
> > > > +++ b/mm/vma.c
> > > > @@ -2260,8 +2260,7 @@ static int __mmap_prepare(struct mmap_state *map)
> > > > return 0;
> > > > }
> > > >
> > > > -static int __mmap_new_file_vma(struct mmap_state *map, struct vm_area_struct *vma,
> > > > - struct vm_area_struct **mergep)
> > > > +static int __mmap_new_file_vma(struct mmap_state *map, struct vm_area_struct *vma)
> > > > {
> > > > struct vma_iterator *vmi = map->vmi;
> > > > struct vma_merge_struct *vmg = map->vmg;
> > > > @@ -2291,34 +2290,6 @@ static int __mmap_new_file_vma(struct mmap_state *map, struct vm_area_struct *vm
> > > > (vma->vm_flags & VM_MAYWRITE));
> > > >
> > > > vma_iter_config(vmi, vmg->start, vmg->end);
> > > > - /*
> > > > - * If flags changed after mmap_file(), we should try merge
> > > > - * vma again as we may succeed this time.
> > > > - */
> > > > - if (unlikely(map->flags != vma->vm_flags && vmg->prev)) {
> > > > - struct vm_area_struct *merge;
> > > > -
> > > > - vmg->flags = vma->vm_flags;
> > > > - /* If this fails, state is reset ready for a reattempt. */
> > > > - merge = vma_merge_new_range(vmg);
> > > > -
> > > > - if (merge) {
> > > > - /*
> > > > - * ->mmap() can change vma->vm_file and fput
> > > > - * the original file. So fput the vma->vm_file
> > > > - * here or we would add an extra fput for file
> > > > - * and cause general protection fault
> > > > - * ultimately.
> > > > - */
> > > > - fput(vma->vm_file);
> > > > - vm_area_free(vma);
> > > > - vma_iter_free(vmi);
> > > > - *mergep = merge;
> > > > - } else {
> > > > - vma_iter_config(vmi, vmg->start, vmg->end);
> > > > - }
> > > > - }
> > > > -
> > > > map->flags = vma->vm_flags;
> > > > return 0;
> > > > }
> > > > @@ -2341,7 +2312,6 @@ static int __mmap_new_vma(struct mmap_state *map, struct vm_area_struct **vmap)
> > > > {
> > > > struct vma_iterator *vmi = map->vmi;
> > > > struct vma_merge_struct *vmg = map->vmg;
> > > > - struct vm_area_struct *merge = NULL;
> > > > int error = 0;
> > > > struct vm_area_struct *vma;
> > > >
> > > > @@ -2365,7 +2335,7 @@ static int __mmap_new_vma(struct mmap_state *map, struct vm_area_struct **vmap)
> > > > }
> > > >
> > > > if (vmg->file)
> > > > - error = __mmap_new_file_vma(map, vma, &merge);
> > > > + error = __mmap_new_file_vma(map, vma);
> > > > else if (map->flags & VM_SHARED)
> > > > error = shmem_zero_setup(vma);
> > > > else
> > > > @@ -2374,9 +2344,6 @@ static int __mmap_new_vma(struct mmap_state *map, struct vm_area_struct **vmap)
> > > > if (error)
> > > > goto free_iter_vma;
> > > >
> > > > - if (merge)
> > > > - goto file_expanded;
> > > > -
> > > > #ifdef CONFIG_SPARC64
> > > > /* TODO: Fix SPARC ADI! */
> > > > WARN_ON_ONCE(!arch_validate_flags(map->flags));
> > > > @@ -2393,8 +2360,6 @@ static int __mmap_new_vma(struct mmap_state *map, struct vm_area_struct **vmap)
> > > > * call covers the non-merge case.
> > > > */
> > > > khugepaged_enter_vma(vma, map->flags);
> > > > -
> > > > -file_expanded:
> > > > ksm_add_vma(vma);
> > > >
> > > > *vmap = vma;
> > > > --
> > > > 2.47.0
> > >
Powered by blists - more mailing lists