Message-ID: <CABi2SkUBH+utreVMTd1qEBSvXTPM7Qc1GiwNKjVa9+EeZK8WWA@mail.gmail.com>
Date: Wed, 21 Aug 2024 09:15:52 -0700
From: Jeff Xu <jeffxu@...omium.org>
To: Pedro Falcato <pedro.falcato@...il.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>, "Liam R. Howlett" <Liam.Howlett@...cle.com>,
Vlastimil Babka <vbabka@...e.cz>, Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
Shuah Khan <shuah@...nel.org>, linux-mm@...ck.org, linux-kernel@...r.kernel.org,
linux-kselftest@...r.kernel.org, oliver.sang@...el.com,
torvalds@...ux-foundation.org, Michael Ellerman <mpe@...erman.id.au>,
Kees Cook <kees@...nel.org>
Subject: Re: [PATCH v3 2/7] mm/munmap: Replace can_modify_mm with can_modify_vma
On Fri, Aug 16, 2024 at 5:18 PM Pedro Falcato <pedro.falcato@...il.com> wrote:
>
> We were doing an extra mmap tree traversal just to check if the entire
> range is modifiable. This can be done when we iterate through the VMAs
> instead.
>
> Signed-off-by: Pedro Falcato <pedro.falcato@...il.com>
> ---
> mm/mmap.c | 11 +----------
> mm/vma.c | 19 ++++++++++++-------
> 2 files changed, 13 insertions(+), 17 deletions(-)
>
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 3af256bacef3..30ae4cb5cec9 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -1740,16 +1740,7 @@ int do_vma_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
> unsigned long start, unsigned long end, struct list_head *uf,
> bool unlock)
> {
> - struct mm_struct *mm = vma->vm_mm;
> -
> - /*
> - * Check if memory is sealed, prevent unmapping a sealed VMA.
> - * can_modify_mm assumes we have acquired the lock on MM.
> - */
> - if (unlikely(!can_modify_mm(mm, start, end)))
> - return -EPERM;
Another approach to improving perf would be to clone the vmi (since it
already points to the first VMA) and pass the cloned vmi/vma into the
can_modify_mm check; that removes the cost of re-finding the first
VMA.

can_modify_mm would then continue from the cloned vmi/vma to the end
of the address range. There is some perf cost in that extra walk, but
most address ranges in the real world fall within a single VMA, so in
practice the cost is the same as checking a single VMA in the ~99.9%
case.

This would preserve the nice property of the sealing feature: if any
VMA in the range is sealed, the entire address range is left
unmodified.
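
Something like the following rough sketch (can_modify_mm_vmi is a
hypothetical name, and I'm assuming the iterator state can be cloned
with a plain struct copy, which may need mas helpers in reality):

static bool can_modify_mm_vmi(struct vma_iterator *vmi,
			      struct vm_area_struct *vma,
			      unsigned long end)
{
	/* Clone so the caller's iterator position is untouched. */
	struct vma_iterator clone = *vmi;
	struct vm_area_struct *cur;

	/* Walk from the VMA we already found to the end of the range. */
	for (cur = vma; cur && cur->vm_start < end; cur = vma_next(&clone)) {
		/* One sealed VMA fails the whole range. */
		if (!can_modify_vma(cur))
			return false;
	}

	return true;
}

The callers that today do can_modify_mm(mm, start, end) before
do_vmi_align_munmap() could then call can_modify_mm_vmi(vmi, vma, end)
instead, skipping the second tree lookup for 'start'.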
> -
> - return do_vmi_align_munmap(vmi, vma, mm, start, end, uf, unlock);
> + return do_vmi_align_munmap(vmi, vma, vma->vm_mm, start, end, uf, unlock);
> }
>
> /*
> diff --git a/mm/vma.c b/mm/vma.c
> index 84965f2cd580..5850f7c0949b 100644
> --- a/mm/vma.c
> +++ b/mm/vma.c
> @@ -712,6 +712,12 @@ do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
> if (end < vma->vm_end && mm->map_count >= sysctl_max_map_count)
> goto map_count_exceeded;
>
> + /* Don't bother splitting the VMA if we can't unmap it anyway */
> + if (!can_modify_vma(vma)) {
> + error = -EPERM;
> + goto start_split_failed;
> + }
> +
> error = __split_vma(vmi, vma, start, 1);
> if (error)
> goto start_split_failed;
> @@ -723,6 +729,11 @@ do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
> */
> next = vma;
> do {
> + if (!can_modify_vma(next)) {
> + error = -EPERM;
> + goto modify_vma_failed;
> + }
> +
> /* Does it split the end? */
> if (next->vm_end > end) {
> error = __split_vma(vmi, next, end, 0);
> @@ -815,6 +826,7 @@ do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
> __mt_destroy(&mt_detach);
> return 0;
>
> +modify_vma_failed:
> clear_tree_failed:
> userfaultfd_error:
> munmap_gather_failed:
> @@ -860,13 +872,6 @@ int do_vmi_munmap(struct vma_iterator *vmi, struct mm_struct *mm,
> if (end == start)
> return -EINVAL;
>
> - /*
> - * Check if memory is sealed, prevent unmapping a sealed VMA.
> - * can_modify_mm assumes we have acquired the lock on MM.
> - */
> - if (unlikely(!can_modify_mm(mm, start, end)))
> - return -EPERM;
> -
> /* Find the first overlapping VMA */
> vma = vma_find(vmi, end);
> if (!vma) {
>
> --
> 2.46.0
>