[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Z9hycm5JEAcGFrd2@x1.local>
Date: Mon, 17 Mar 2025 15:05:22 -0400
From: Peter Xu <peterx@...hat.com>
To: Marty Kareem <MartyKareem@...look.com>
Cc: linux-mm@...ck.org, linux-kernel@...r.kernel.org,
akpm@...ux-foundation.org, shuah@...nel.org
Subject: Re: [PATCH RESEND] mm/selftest: Replace static BASE_PMD_ADDR with
dynamic address allocation
Hello, Marty,
On Thu, Mar 13, 2025 at 10:35:35PM -0400, Marty Kareem wrote:
> (RESEND: previous email accidentally sent in HTML format, resending in plain
> text)
Yes, plan text is better, but when you repost again please send the patch
directly instead of making it an attachment. See:
https://www.kernel.org/doc/html/v4.17/process/submitting-patches.html
You could leverage git format-patch or git send-patch.
>
> This patch addresses a longstanding TODO comment in the userfaultfd tests,
> '/linux/tools/testing/selftests/mm/uffd-common.c'
> ("/* Use a static addr is ugly */") by replacing hardcoded 1GB addresses
> with dynamic allocation. I'd appreciate your review.
>
> The Problem
> ------------
> The current static address approach:
> - Causes test failures when other mappings occupy the 1GB region
> - Prevents parallel test execution (critical for modern CI/CD systems)
> - Breaks on systems with unusual memory layouts
Yes, I believe this is a real selftest issue.
>
> The Solution
> ------------
> I implemented a find_suitable_area() helper that:
> - Asks the kernel for suggested addresses via mmap(NULL)
> - Automatically aligns for huge pages when needed
> - Uses MAP_FIXED_NOREPLACE where available (graceful fallback otherwise)
> - Adds guard pages between mappings to prevent VMA merging
>
> Validation
> ----------
> I did multiple tests on my implementation to make sure it worked like:
> - Multiple parallel test runs
> - Memory pressure scenarios
> - Edge cases (unusual alignments, sizes, etc.)
> - Race conditions and concurrent access
>
> Performance impact is minimal , about 1.2x overhead compared to the static
> approach in benchmarks.
>
> Why This Matters
> ----------------
> - Removes longstanding TODO from the codebase
> - Enables safe parallel testing
> - Makes tests portable to different environments and memory layouts
> - Improves overall test reliability
>
> This is my first PR for the Linux Kernel and I would be
> grateful for your feedback!
>
> Signed-off-by: MrMartyK <martykareem@...look.com>
> From 5295ee5a7532f1e75364cefa1091dfb49ad320d4 Mon Sep 17 00:00:00 2001
> From: MrMartyK <martykareem@...look.com>
If you want, you may fill this up with your real full name. But it's your
call.
> Date: Thu, 13 Mar 2025 20:17:43 -0400
> Subject: [PATCH] mm/selftest: Replace static BASE_PMD_ADDR with dynamic
> address allocation
>
> This commit replaces the static BASE_PMD_ADDR in userfaultfd tests with
> dynamic address discovery using the find_suitable_area() function. This
> addresses a TODO comment in the code which noted that using a static
> address was 'ugly'.
>
> The implementation:
> 1. Adds find_suitable_area() that obtains a good address hint from the OS
> 2. Updates shmem_allocate_area() to use dynamic allocation
> 3. Includes proper fallback mechanisms when preferred addresses unavailable
> 4. Works with both MAP_FIXED_NOREPLACE and MAP_FIXED
> 5. Maintains backward compatibility with existing tests
>
> This provides more robust operation when running tests in parallel or in
> environments with different memory layouts, while maintaining good
> performance (only ~1.2x overhead vs. the static approach).
>
> Additional updates:
> - Added improved code formatting and comments
> - Enhanced error handling in fallback cases
> - Fixed memory verification in integration tests
>
> Signed-off-by: MrMartyK <martykareem@...look.com>
> ---
> tools/testing/selftests/mm/uffd-common.c | 114 ++++++++++++++++++-----
> 1 file changed, 93 insertions(+), 21 deletions(-)
>
> diff --git a/tools/testing/selftests/mm/uffd-common.c b/tools/testing/selftests/mm/uffd-common.c
> index 717539eddf98..b91510da494e 100644
> --- a/tools/testing/selftests/mm/uffd-common.c
> +++ b/tools/testing/selftests/mm/uffd-common.c
> @@ -7,7 +7,7 @@
>
> #include "uffd-common.h"
>
> -#define BASE_PMD_ADDR ((void *)(1UL << 30))
> +// Removed static BASE_PMD_ADDR definition in favor of dynamic address allocation
>
> volatile bool test_uffdio_copy_eexist = true;
> unsigned long nr_cpus, nr_pages, nr_pages_per_cpu, page_size;
> @@ -122,6 +122,39 @@ static void shmem_release_pages(char *rel_area)
> err("madvise(MADV_REMOVE) failed");
> }
>
> +/**
> + * Find a suitable virtual address area of the requested size and alignment
> + *
> + * This function obtains a hint from the OS about where a good place to map
> + * memory might be, then aligns it according to the specified alignment
> + * requirements.
> + *
> + * @param size Size of the memory area needed
> + * @param alignment Alignment requirement (typically huge page size)
> + * @return A suitable address or NULL if none could be found
> + */
> +static void *find_suitable_area(size_t size, size_t alignment)
> +{
> + void *addr;
> + void *hint;
> + uintptr_t aligned_addr;
> +
> + /* First try to get a suggestion from the OS */
> + addr = mmap(NULL, size, PROT_NONE,
> + MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> + if (addr == MAP_FAILED)
> + return NULL;
> +
> + /* Remember the address and unmap it */
> + hint = addr;
> + munmap(addr, size);
Above PROT_NONE private map trick looks good. But right after munmap(),
another thread could take this range away instead using another concurrent
mmap(), because the kernel (after munmap() returned here) would think this
area free.
To make it not possible to happen, IIUC the general way to do this is
keeping the pointer without munmap() here, then mmap() with MAP_FIXED the
2nd time directly on top of it, causing atomic unmap of the PROT_NONE
range, replacing it with the new mmap() you really need. Before the 2nd
mmap(), nothing should be able to race taking that region because in the
kernel this range is still occupied.
With that, IIUC we also don't need MAP_FIXED_NOREPLACE, because such
behavior should exist forever, and it should work on both old/new kernels.
> +
> + /* Align the address to requested alignment (e.g., huge page size) */
> + aligned_addr = ((uintptr_t)hint + alignment - 1) & ~(alignment - 1);
> +
> + return (void *)aligned_addr;
> +}
> +
> static int shmem_allocate_area(void **alloc_area, bool is_src)
> {
> void *area_alias = NULL;
> @@ -129,35 +162,74 @@ static int shmem_allocate_area(void **alloc_area, bool is_src)
> unsigned long offset = is_src ? 0 : bytes;
> char *p = NULL, *p_alias = NULL;
> int mem_fd = uffd_mem_fd_create(bytes * 2, false);
> + int map_flags;
>
> - /* TODO: clean this up. Use a static addr is ugly */
> - p = BASE_PMD_ADDR;
> - if (!is_src)
> + /* Find a suitable address range with appropriate alignment */
> + p = find_suitable_area(bytes, hpage_size);
> + if (!p) {
> + /* Fallback strategy if finding area fails */
> + fprintf(stderr, "Warning: Could not find suitable address, letting kernel choose\n");
> + }
> +
> + /* If this is dst area, add offset to prevent overlap with src area */
> + if (!is_src && p) {
> + /* Use same spacing as original code to maintain compatibility */
> /* src map + alias + interleaved hpages */
> - p += 2 * (bytes + hpage_size);
> - p_alias = p;
> - p_alias += bytes;
> - p_alias += hpage_size; /* Prevent src/dst VMA merge */
> + p = (char *)p + 2 * (bytes + hpage_size);
> + }
>
> - *alloc_area = mmap(p, bytes, PROT_READ | PROT_WRITE, MAP_SHARED,
> - mem_fd, offset);
> + /* Select flags based on whether we have a preferred address */
> + map_flags = MAP_SHARED;
> + if (p) {
> +#ifdef MAP_FIXED_NOREPLACE
> + map_flags |= MAP_FIXED_NOREPLACE;
> +#else
> + /* Fallback to regular MAP_FIXED if necessary */
> + map_flags |= MAP_FIXED;
> +#endif
> + }
> +
> + /* Try to map at the suggested address, falling back if needed */
> + *alloc_area = mmap(p, bytes, PROT_READ | PROT_WRITE, map_flags, mem_fd, offset);
> +
> if (*alloc_area == MAP_FAILED) {
> - *alloc_area = NULL;
> - return -errno;
> + /* If fixed mapping failed, try again without it */
> + *alloc_area = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
> + MAP_SHARED, mem_fd, offset);
> + if (*alloc_area == MAP_FAILED) {
> + *alloc_area = NULL;
> + close(mem_fd);
> + return -errno;
> + }
> }
> - if (*alloc_area != p)
> - err("mmap of memfd failed at %p", p);
>
> - area_alias = mmap(p_alias, bytes, PROT_READ | PROT_WRITE, MAP_SHARED,
> - mem_fd, offset);
> + /* Calculate a good spot for the alias mapping with space to prevent merging */
> + p_alias = (char *)((uintptr_t)*alloc_area + bytes + hpage_size);
> +
> + /* Map the alias area */
> + map_flags = MAP_SHARED;
> +#ifdef MAP_FIXED_NOREPLACE
> + map_flags |= MAP_FIXED_NOREPLACE;
> +#else
> + map_flags |= MAP_FIXED;
> +#endif
> +
> + area_alias = mmap(p_alias, bytes, PROT_READ | PROT_WRITE,
> + map_flags, mem_fd, offset);
> +
> if (area_alias == MAP_FAILED) {
> - munmap(*alloc_area, bytes);
> - *alloc_area = NULL;
> - return -errno;
> + /* If fixed mapping failed, try anywhere */
> + area_alias = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
> + MAP_SHARED, mem_fd, offset);
> + if (area_alias == MAP_FAILED) {
> + munmap(*alloc_area, bytes);
> + *alloc_area = NULL;
> + close(mem_fd);
> + return -errno;
> + }
> }
> - if (area_alias != p_alias)
> - err("mmap of anonymous memory failed at %p", p_alias);
>
> + /* Store the alias mapping for later use */
> if (is_src)
> area_src_alias = area_alias;
> else
> --
> 2.43.0
>
--
Peter Xu
Powered by blists - more mailing lists