Message-ID: <CAHbLzkpXNDJy+p96maDeYa7rrqHUG+Go6epdqeUog-0xabiGiw@mail.gmail.com>
Date: Tue, 23 Jan 2024 09:52:26 -0800
From: Yang Shi <shy828301@...il.com>
To: Ryan Roberts <ryan.roberts@....com>
Cc: Andrew Morton <akpm@...ux-foundation.org>, Rik van Riel <riel@...riel.com>, 
	Matthew Wilcox <willy@...radead.org>, linux-mm@...ck.org, linux-kernel@...r.kernel.org, 
	stable@...r.kernel.org
Subject: Re: [PATCH v1] mm: thp_get_unmapped_area must honour topdown preference

On Tue, Jan 23, 2024 at 9:14 AM Ryan Roberts <ryan.roberts@....com> wrote:
>
> The addition of commit efa7df3e3bb5 ("mm: align larger anonymous
> mappings on THP boundaries") caused the "virtual_address_range" mm
> selftest to start failing on arm64. Let's fix that regression.
>
> There were 2 visible problems when running the test: 1) it takes much
> longer to execute, and 2) the test fails. Both are related:
>
> The (first part of the) test allocates as many 1GB anonymous blocks as
> it can in the low 256TB of address space, passing NULL as the addr hint
> to mmap. Before the faulty patch, all allocations were abutted and
> contained in a single, merged VMA. However, after this patch, each
> allocation is in its own VMA, and there is a 2M gap between each VMA.
> This causes the 2 problems in the test: 1) mmap becomes MUCH slower
> because there are so many VMAs to check to find a new 1G gap. 2) mmap
> fails once it hits the VMA limit (/proc/sys/vm/max_map_count). Hitting
> this limit then causes a subsequent calloc() to fail, which causes the
> test to fail.
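
For anyone wanting to reproduce this without the selftest, here is a
rough userspace sketch of the allocation pattern described above (the
real virtual_address_range selftest does more than this; all constants
here are illustrative):

#include <stdio.h>
#include <sys/mman.h>

#define SZ_1G (1UL << 30)

int main(void)
{
	unsigned long count = 0;

	/*
	 * Map 1G anonymous blocks with a NULL hint until mmap() fails.
	 * Before efa7df3e3bb5 these all merged into a single VMA; with
	 * the THP alignment each block gets its own VMA plus a 2M hole,
	 * so the gap search slows down and max_map_count is hit long
	 * before the low 256TB is exhausted.
	 */
	while (mmap(NULL, SZ_1G, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0) != MAP_FAILED)
		count++;

	printf("mapped %lu 1G blocks\n", count);
	return 0;
}
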
>
> The problem is that arm64 (unlike x86) selects
> ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT. But __thp_get_unmapped_area()
> allocates len+2M then always aligns to the bottom of the discovered gap.
> That causes the 2M hole.
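
To spell the arithmetic out with made-up numbers (the address below is
hypothetical, and I'm assuming off = 0 since this is an anonymous
mapping):

#include <stdio.h>

int main(void)
{
	unsigned long size = 2UL << 20;		/* PMD size */
	unsigned long len  = 1UL << 30;		/* 1G request */
	unsigned long ret  = 0x7ffec0000000UL;	/* bottom of the len + 2M
						 * gap the top-down search
						 * found; 2M aligned here */

	/*
	 * The current code aligns the bottom, ret += (off - ret) & (size - 1),
	 * which is a no-op when ret is already aligned. The mapping then
	 * occupies [ret, ret + len) and the top 2M of the padded gap is
	 * left unused, so it can never abut the VMA directly above it.
	 */
	ret += (0 - ret) & (size - 1);

	printf("mapping [%#lx, %#lx), unused hole above: %lu MiB\n",
	       ret, ret + len, size >> 20);
	return 0;
}
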
>
> Fix this by detecting cases where we can still achieve the alignment
> goal when the mapping is moved to the top of the allocated area, if
> configured to prefer top-down allocation.
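
Continuing with the same made-up numbers: when the bottom of the padded
area is already aligned, handing back the top instead keeps the
alignment goal while letting the mapping abut (and merge with) the VMA
above it. A trivial check of that claim:

#include <assert.h>

int main(void)
{
	unsigned long size = 2UL << 20;		/* PMD size */
	unsigned long ret  = 0x7ffec0000000UL;	/* hypothetical, 2M-aligned
						 * bottom of the padded gap */

	/* If ret is a multiple of 2M, then ret + 2M is as well. */
	assert(((ret + size) & (size - 1)) == 0);
	return 0;
}
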
>
> While we are at it, fix thp_get_unmapped_area's use of pgoff, which
> should always be zero for anonymous mappings. Prior to the faulty
> change, while it was possible for user space to pass in pgoff!=0, the
> old mm->get_unmapped_area() handler would not use it.
> thp_get_unmapped_area() does use it, so let's explicitly zero it before
> calling the handler. This should also be the correct behavior for arches
> that define their own get_unmapped_area() handler.
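
For illustration only, a hypothetical call user space could make today:
the offset is meaningless for MAP_ANONYMOUS, but as you note it is
passed down to the handler as pgoff, where thp_get_unmapped_area()
would use it.

#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
	/*
	 * Non-zero offset with MAP_ANONYMOUS. Before this patch it reached
	 * thp_get_unmapped_area() as pgoff and could skew the returned
	 * address; with the patch, pgoff is zeroed for any !file mapping
	 * before the handler is called.
	 */
	void *p = mmap(NULL, 4UL << 20, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 1UL << 20);

	printf("%p\n", p);
	return 0;
}
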
>
> Fixes: efa7df3e3bb5 ("mm: align larger anonymous mappings on THP boundaries")
> Closes: https://lore.kernel.org/linux-mm/1e8f5ac7-54ce-433a-ae53-81522b2320e1@arm.com/
> Cc: stable@...r.kernel.org
> Signed-off-by: Ryan Roberts <ryan.roberts@....com>

Thanks for debugging this. Looks good to me. Reviewed-by: Yang Shi
<shy828301@...il.com>

> ---
>
> Applies on top of v6.8-rc1. Would be good to get this into the next -rc.

This may conflict with my fix ("mm: huge_memory: don't force huge page
alignment on 32 bit"), which is on mm-unstable now.

>
> Thanks,
> Ryan
>
>  mm/huge_memory.c | 10 ++++++++--
>  mm/mmap.c        |  6 ++++--
>  2 files changed, 12 insertions(+), 4 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 94ef5c02b459..8c66f88e71e9 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -809,7 +809,7 @@ static unsigned long __thp_get_unmapped_area(struct file *filp,
>  {
>         loff_t off_end = off + len;
>         loff_t off_align = round_up(off, size);
> -       unsigned long len_pad, ret;
> +       unsigned long len_pad, ret, off_sub;
>
>         if (off_end <= off_align || (off_end - off_align) < size)
>                 return 0;
> @@ -835,7 +835,13 @@ static unsigned long __thp_get_unmapped_area(struct file *filp,
>         if (ret == addr)
>                 return addr;
>
> -       ret += (off - ret) & (size - 1);
> +       off_sub = (off - ret) & (size - 1);
> +
> +       if (current->mm->get_unmapped_area == arch_get_unmapped_area_topdown &&
> +           !off_sub)
> +               return ret + size;
> +
> +       ret += off_sub;
>         return ret;
>  }
>
> diff --git a/mm/mmap.c b/mm/mmap.c
> index b78e83d351d2..d89770eaab6b 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -1825,15 +1825,17 @@ get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
>                 /*
>                  * mmap_region() will call shmem_zero_setup() to create a file,
>                  * so use shmem's get_unmapped_area in case it can be huge.
> -                * do_mmap() will clear pgoff, so match alignment.
>                  */
> -               pgoff = 0;
>                 get_area = shmem_get_unmapped_area;
>         } else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
>                 /* Ensures that larger anonymous mappings are THP aligned */
>                 get_area = thp_get_unmapped_area;
>         }
>
> +       /* Always treat pgoff as zero for anonymous memory. */
> +       if (!file)
> +               pgoff = 0;
> +
>         addr = get_area(file, addr, len, pgoff, flags);
>         if (IS_ERR_VALUE(addr))
>                 return addr;
> --
> 2.25.1
>
