[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <99922cca-ed8e-4996-8833-a89264783b28@suse.cz>
Date: Mon, 10 Mar 2025 16:11:59 +0100
From: Vlastimil Babka <vbabka@...e.cz>
To: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
Andrew Morton <akpm@...ux-foundation.org>
Cc: "Liam R . Howlett" <Liam.Howlett@...cle.com>, Jann Horn
<jannh@...gle.com>, linux-mm@...ck.org, linux-kernel@...r.kernel.org,
Harry Yoo <harry.yoo@...cle.com>, Yosry Ahmed <yosry.ahmed@...ux.dev>
Subject: Re: [PATCH v2 5/7] mm/mremap: complete refactor of move_vma()
On 3/6/25 11:34, Lorenzo Stoakes wrote:
> We invoke ksm_madvise() with an intentionally dummy flags field, so no
> need to pass around.
>
> Additionally, the code tries to be 'clever' with account_start,
> account_end, using these to both check that vma->vm_start != 0 and that we
> ought to account the newly split portion of VMA post-move, either before
> or after it.
>
> We need to do this because we intentionally removed VM_ACCOUNT on the VMA
> prior to unmapping, so we don't erroneously unaccount memory (we have
> already calculated the correct amount to account and accounted it, any
> subsequent subtraction will be incorrect).
>
> This patch significantly expands the comment (from 2002!) about
> 'concealing' the flag to make it abundantly clear what's going on, as well
> as adding and expanding a number of other comments also.
>
> We can remove account_start, account_end by instead tracking when we
> account (i.e. vma->vm_flags has the VM_ACCOUNT flag set, and this is not
> an MREMAP_DONTUNMAP operation), and figuring out when to reinstate the
> VM_ACCOUNT flag on prior/subsequent VMAs separately.
>
> We additionally break the function into logical pieces and attack the very
> confusing error handling logic (where, for instance, new_addr is set to
> err).
>
> After this change the code is considerably more readable and easy to
> manipulate.
>
> Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
Reviewed-by: Vlastimil Babka <vbabka@...e.cz>
Some nits below:
> +/*
> + * Unmap source VMA for VMA move, turning it from a copy to a move, being
> + * careful to ensure we do not underflow memory account while doing so if an
> + * accountable move.
> + *
> + * This is best effort, if we fail to unmap then we simply try
this looks like an unfinished sentence?
> @@ -1007,51 +1157,15 @@ static unsigned long move_vma(struct vma_remap_struct *vrm)
> */
> hiwater_vm = mm->hiwater_vm;
This...
> - /* Tell pfnmap has moved from this vma */
> - if (unlikely(vma->vm_flags & VM_PFNMAP))
> - untrack_pfn_clear(vma);
> -
> - if (unlikely(!err && (vrm->flags & MREMAP_DONTUNMAP))) {
> - /* We always clear VM_LOCKED[ONFAULT] on the old vma */
> - vm_flags_clear(vma, VM_LOCKED_MASK);
> -
> - /*
> - * anon_vma links of the old vma is no longer needed after its page
> - * table has been moved.
> - */
> - if (new_vma != vma && vma->vm_start == old_addr &&
> - vma->vm_end == (old_addr + old_len))
> - unlink_anon_vmas(vma);
> -
> - /* Because we won't unmap we don't need to touch locked_vm */
> - vrm_stat_account(vrm, new_len);
> - return new_addr;
> - }
> -
> - vrm_stat_account(vrm, new_len);
> -
> - vma_iter_init(&vmi, mm, old_addr);
> - if (do_vmi_munmap(&vmi, mm, old_addr, old_len, vrm->uf_unmap, false) < 0) {
> - /* OOM: unable to split vma, just get accounts right */
> - if (vm_flags & VM_ACCOUNT && !(vrm->flags & MREMAP_DONTUNMAP))
> - vm_acct_memory(old_len >> PAGE_SHIFT);
> - account_start = account_end = false;
> - }
> + vrm_stat_account(vrm, vrm->new_len);
> + if (unlikely(!err && (vrm->flags & MREMAP_DONTUNMAP)))
> + dontunmap_complete(vrm, new_vma);
> + else
> + unmap_source_vma(vrm);
>
> mm->hiwater_vm = hiwater_vm;
... and this AFAICS only applies to the unmap_source_vma() case. And from
the comment it seems only to the "undo destination vma" variant. BTW this
also means the unmap_source_vma() name is rather misleading?
So I think the whole handling of that hiwater_vm could move to
unmap_source_vma(). It should not matter if we take the snapshot of
hiwater_vm only after vrm_stat_account() bumps the total_vm. Nobody else can
be racing with us to permanently turn that total_vm to a hiwater_vm.
(this is probably a potential cleanup for a followup-patch anyway)
>
> - /* Restore VM_ACCOUNT if one or two pieces of vma left */
> - if (account_start) {
> - vma = vma_prev(&vmi);
> - vm_flags_set(vma, VM_ACCOUNT);
> - }
> -
> - if (account_end) {
> - vma = vma_next(&vmi);
> - vm_flags_set(vma, VM_ACCOUNT);
> - }
> -
> - return new_addr;
> + return err ? (unsigned long)err : vrm->new_addr;
> }
>
> /*
> --
> 2.48.1
Powered by blists - more mailing lists