[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <lm2brkhp7jfqgfazr5dlz2pxvz7k4fhpfie2gddkcijwmqf3j5@rqoy7efjnpvb>
Date: Tue, 2 Jul 2024 13:24:36 -0400
From: "Liam R. Howlett" <Liam.Howlett@...cle.com>
To: Lorenzo Stoakes <lstoakes@...il.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>, linux-fsdevel@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-mm@...ck.org,
Vlastimil Babka <vbabka@...e.cz>, Matthew Wilcox <willy@...radead.org>,
Alexander Viro <viro@...iv.linux.org.uk>,
Christian Brauner <brauner@...nel.org>, Jan Kara <jack@...e.cz>,
Eric Biederman <ebiederm@...ssion.com>, Kees Cook <kees@...nel.org>,
Suren Baghdasaryan <surenb@...gle.com>
Subject: Re: [RFC PATCH v2 3/7] mm: move vma_shrink(), vma_expand() to
internal header
* Lorenzo Stoakes <lstoakes@...il.com> [240628 10:35]:
> The vma_shrink() and vma_expand() functions are internal VMA manipulation
> functions which we ought to abstract for use outside of memory management
> code.
>
> To achieve this, we abstract the operation performed in fs/exec.c by
> shift_arg_pages() into a new relocate_vma() function implemented in
> mm/mmap.c, which enables us to also move move_page_tables() and
> vma_iter_prev_range() to internal.h.
>
> The purpose of doing this is to isolate key VMA manipulation functions in
> order that we can both abstract them and later render them easily testable.
>
> Signed-off-by: Lorenzo Stoakes <lstoakes@...il.com>
> ---
> fs/exec.c | 68 ++------------------------------------
> include/linux/mm.h | 17 +---------
> mm/internal.h | 18 +++++++++++
> mm/mmap.c | 81 ++++++++++++++++++++++++++++++++++++++++++++++
> 4 files changed, 102 insertions(+), 82 deletions(-)
>
> diff --git a/fs/exec.c b/fs/exec.c
> index 40073142288f..5cf53e20d8df 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -683,75 +683,11 @@ static int copy_strings_kernel(int argc, const char *const *argv,
> /*
> * During bprm_mm_init(), we create a temporary stack at STACK_TOP_MAX. Once
> * the binfmt code determines where the new stack should reside, we shift it to
> - * its final location. The process proceeds as follows:
> - *
> - * 1) Use shift to calculate the new vma endpoints.
> - * 2) Extend vma to cover both the old and new ranges. This ensures the
> - * arguments passed to subsequent functions are consistent.
> - * 3) Move vma's page tables to the new range.
> - * 4) Free up any cleared pgd range.
> - * 5) Shrink the vma to cover only the new range.
> + * its final location.
> */
> static int shift_arg_pages(struct vm_area_struct *vma, unsigned long shift)
> {
> - struct mm_struct *mm = vma->vm_mm;
> - unsigned long old_start = vma->vm_start;
> - unsigned long old_end = vma->vm_end;
> - unsigned long length = old_end - old_start;
> - unsigned long new_start = old_start - shift;
> - unsigned long new_end = old_end - shift;
> - VMA_ITERATOR(vmi, mm, new_start);
> - struct vm_area_struct *next;
> - struct mmu_gather tlb;
> -
> - BUG_ON(new_start > new_end);
> -
> - /*
> - * ensure there are no vmas between where we want to go
> - * and where we are
> - */
> - if (vma != vma_next(&vmi))
> - return -EFAULT;
> -
> - vma_iter_prev_range(&vmi);
> - /*
> - * cover the whole range: [new_start, old_end)
> - */
> - if (vma_expand(&vmi, vma, new_start, old_end, vma->vm_pgoff, NULL))
> - return -ENOMEM;
> -
> - /*
> - * move the page tables downwards, on failure we rely on
> - * process cleanup to remove whatever mess we made.
> - */
> - if (length != move_page_tables(vma, old_start,
> - vma, new_start, length, false, true))
> - return -ENOMEM;
> -
> - lru_add_drain();
> - tlb_gather_mmu(&tlb, mm);
> - next = vma_next(&vmi);
> - if (new_end > old_start) {
> - /*
> - * when the old and new regions overlap clear from new_end.
> - */
> - free_pgd_range(&tlb, new_end, old_end, new_end,
> - next ? next->vm_start : USER_PGTABLES_CEILING);
> - } else {
> - /*
> - * otherwise, clean from old_start; this is done to not touch
> - * the address space in [new_end, old_start) some architectures
> - * have constraints on va-space that make this illegal (IA64) -
> - * for the others its just a little faster.
> - */
> - free_pgd_range(&tlb, old_start, old_end, new_end,
> - next ? next->vm_start : USER_PGTABLES_CEILING);
> - }
> - tlb_finish_mmu(&tlb);
> -
> - vma_prev(&vmi);
> - /* Shrink the vma to just the new range */
> - return vma_shrink(&vmi, vma, new_start, new_end, vma->vm_pgoff);
> + return relocate_vma(vma, shift);
> }
The end result is a function that simply returns the results of your new
function. shift_arg_pages() is used once and mentioned in a single
comment in mm/mremap.c. I wonder if it's worth just dropping the
function entirely and just replacing the call to shift_arg_pages() to
relocate_vma()?
I'm happy either way, the compiler should do the Right Thing(tm) the way
it is written.
...
> +
> +/*
> + * Relocate a VMA downwards by shift bytes. There cannot be any VMAs between
> + * this VMA and its relocated range, which will now reside at [vma->vm_start -
> + * shift, vma->vm_end - shift).
> + *
> + * This function is almost certainly NOT what you want for anything other than
> + * early executable temporary stack relocation.
> + */
> +int relocate_vma(struct vm_area_struct *vma, unsigned long shift)
The name relocate_vma() implies it could be used in any direction, but
it can only shift downwards. This is also true for the
shift_arg_pages() as well and at least the comments state which way
things are going, and that the vma is also moving.
It might be worth stating the pages are also being relocated in the
comment.
Again, I'm happy enough to keep it this way but I wanted to point it
out.
...
Thanks,
Liam
Powered by blists - more mailing lists