lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID:
 <LV3P220MB1815A1451D408BB789C4387BBAD92@LV3P220MB1815.NAMP220.PROD.OUTLOOK.COM>
Date: Tue, 18 Mar 2025 22:35:33 -0400
From: Marty Kareem <MartyKareem@...look.com>
To: Peter Xu <peterx@...hat.com>
Cc: linux-mm@...ck.org, linux-kernel@...r.kernel.org,
 akpm@...ux-foundation.org, shuah@...nel.org
Subject: Re: [PATCH v2] [PATCH RESEND] mm/selftest: Replace static
 BASE_PMD_ADDR with dynamic address allocation

Hi Peter,

Thank you for your detailed review and suggestions.

I've updated the patch to address the race condition you highlighted. In this v2 patch, the following changes have been made:

1. The PROT_NONE reservation is maintained until it can be atomically replaced.

2. I now use MAP_FIXED to atomically replace the reservation with the intended mapping.

3. The MAP_FIXED_NOREPLACE conditionals have been removed, as the atomic replacement works reliably across all kernel versions.

4. The overall implementation has been simplified while ensuring robustness.

These modifications ensure that the memory region remains reserved until it is atomically replaced, effectively eliminating the race window and improving test reliability—especially in parallel test environments. Benchmark results show only a minimal performance impact (approximately 1.3x overhead vs. static addressing), and all tests pass successfully.

One note: I'm currently having some issues with git send-email and my Outlook account, so I'm sending this patch through Thunderbird as a plain text attachment. I’m working on resolving the git send-email setup for future submissions.

Thanks again for your valuable feedback.

Best regards,

Marty Kareem

On 3/17/25 15:05, Peter Xu wrote:
> Hello, Marty,
>
> On Thu, Mar 13, 2025 at 10:35:35PM -0400, Marty Kareem wrote:
>> (RESEND: previous email accidentally sent in HTML format, resending in plain
>> text)
> Yes, plan text is better, but when you repost again please send the patch
> directly instead of making it an attachment.  See:
>
>    https://www.kernel.org/doc/html/v4.17/process/submitting-patches.html
>
> You could leverage git format-patch or git send-patch.
>
>> This patch addresses a longstanding TODO comment in the userfaultfd tests,
>> '/linux/tools/testing/selftests/mm/uffd-common.c'
>> ("/* Use a static addr is ugly */") by replacing hardcoded 1GB addresses
>> with dynamic allocation. I'd appreciate your review.
>>
>> The Problem
>> ------------
>> The current static address approach:
>> - Causes test failures when other mappings occupy the 1GB region
>> - Prevents parallel test execution (critical for modern CI/CD systems)
>> - Breaks on systems with unusual memory layouts
> Yes, I believe this is a real selftest issue.
>
>> The Solution
>> ------------
>> I implemented a find_suitable_area() helper that:
>> - Asks the kernel for suggested addresses via mmap(NULL)
>> - Automatically aligns for huge pages when needed
>> - Uses MAP_FIXED_NOREPLACE where available (graceful fallback otherwise)
>> - Adds guard pages between mappings to prevent VMA merging
>>
>> Validation
>> ----------
>> I did multiple tests on my implementation to make sure it worked like:
>> - Multiple parallel test runs
>> - Memory pressure scenarios
>> - Edge cases (unusual alignments, sizes, etc.)
>> - Race conditions and concurrent access
>>
>> Performance impact is minimal , about 1.2x overhead compared to the static
>> approach in benchmarks.
>>
>> Why This Matters
>> ----------------
>> - Removes longstanding TODO from the codebase
>> - Enables safe parallel testing
>> - Makes tests portable to different environments and memory layouts
>> - Improves overall test reliability
>>
>> This is my first PR for the Linux Kernel and I would be
>> grateful for your feedback!
>>
>> Signed-off-by: MrMartyK <martykareem@...look.com>
>>  From 5295ee5a7532f1e75364cefa1091dfb49ad320d4 Mon Sep 17 00:00:00 2001
>> From: MrMartyK <martykareem@...look.com>
> If you want, you may fill this up with your real full name.  But it's your
> call.
>
>> Date: Thu, 13 Mar 2025 20:17:43 -0400
>> Subject: [PATCH] mm/selftest: Replace static BASE_PMD_ADDR with dynamic
>>   address allocation
>>
>> This commit replaces the static BASE_PMD_ADDR in userfaultfd tests with
>> dynamic address discovery using the find_suitable_area() function. This
>> addresses a TODO comment in the code which noted that using a static
>> address was 'ugly'.
>>
>> The implementation:
>> 1. Adds find_suitable_area() that obtains a good address hint from the OS
>> 2. Updates shmem_allocate_area() to use dynamic allocation
>> 3. Includes proper fallback mechanisms when preferred addresses unavailable
>> 4. Works with both MAP_FIXED_NOREPLACE and MAP_FIXED
>> 5. Maintains backward compatibility with existing tests
>>
>> This provides more robust operation when running tests in parallel or in
>> environments with different memory layouts, while maintaining good
>> performance (only ~1.2x overhead vs. the static approach).
>>
>> Additional updates:
>> - Added improved code formatting and comments
>> - Enhanced error handling in fallback cases
>> - Fixed memory verification in integration tests
>>
>> Signed-off-by: MrMartyK <martykareem@...look.com>
>> ---
>>   tools/testing/selftests/mm/uffd-common.c | 114 ++++++++++++++++++-----
>>   1 file changed, 93 insertions(+), 21 deletions(-)
>>
>> diff --git a/tools/testing/selftests/mm/uffd-common.c b/tools/testing/selftests/mm/uffd-common.c
>> index 717539eddf98..b91510da494e 100644
>> --- a/tools/testing/selftests/mm/uffd-common.c
>> +++ b/tools/testing/selftests/mm/uffd-common.c
>> @@ -7,7 +7,7 @@
>>   
>>   #include "uffd-common.h"
>>   
>> -#define BASE_PMD_ADDR ((void *)(1UL << 30))
>> +// Removed static BASE_PMD_ADDR definition in favor of dynamic address allocation
>>   
>>   volatile bool test_uffdio_copy_eexist = true;
>>   unsigned long nr_cpus, nr_pages, nr_pages_per_cpu, page_size;
>> @@ -122,6 +122,39 @@ static void shmem_release_pages(char *rel_area)
>>   		err("madvise(MADV_REMOVE) failed");
>>   }
>>   
>> +/**
>> + * Find a suitable virtual address area of the requested size and alignment
>> + *
>> + * This function obtains a hint from the OS about where a good place to map
>> + * memory might be, then aligns it according to the specified alignment
>> + * requirements.
>> + *
>> + * @param size      Size of the memory area needed
>> + * @param alignment Alignment requirement (typically huge page size)
>> + * @return          A suitable address or NULL if none could be found
>> + */
>> +static void *find_suitable_area(size_t size, size_t alignment)
>> +{
>> +	void *addr;
>> +	void *hint;
>> +	uintptr_t aligned_addr;
>> +	
>> +	/* First try to get a suggestion from the OS */
>> +	addr = mmap(NULL, size, PROT_NONE,
>> +	            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
>> +	if (addr == MAP_FAILED)
>> +		return NULL;
>> +	
>> +	/* Remember the address and unmap it */
>> +	hint = addr;
>> +	munmap(addr, size);
> Above PROT_NONE private map trick looks good.  But right after munmap(),
> another thread could take this range away instead using another concurrent
> mmap(), because the kernel (after munmap() returned here) would think this
> area free.
>
> To make it not possible to happen, IIUC the general way to do this is
> keeping the pointer without munmap() here, then mmap() with MAP_FIXED the
> 2nd time directly on top of it, causing atomic unmap of the PROT_NONE
> range, replacing it with the new mmap() you really need.  Before the 2nd
> mmap(), nothing should be able to race taking that region because in the
> kernel this range is still occupied.
>
> With that, IIUC we also don't need MAP_FIXED_NOREPLACE, because such
> behavior should exist forever, and it should work on both old/new kernels.
>
>> +	
>> +	/* Align the address to requested alignment (e.g., huge page size) */
>> +	aligned_addr = ((uintptr_t)hint + alignment - 1) & ~(alignment - 1);
>> +	
>> +	return (void *)aligned_addr;
>> +}
>> +
>>   static int shmem_allocate_area(void **alloc_area, bool is_src)
>>   {
>>   	void *area_alias = NULL;
>> @@ -129,35 +162,74 @@ static int shmem_allocate_area(void **alloc_area, bool is_src)
>>   	unsigned long offset = is_src ? 0 : bytes;
>>   	char *p = NULL, *p_alias = NULL;
>>   	int mem_fd = uffd_mem_fd_create(bytes * 2, false);
>> +	int map_flags;
>>   
>> -	/* TODO: clean this up.  Use a static addr is ugly */
>> -	p = BASE_PMD_ADDR;
>> -	if (!is_src)
>> +	/* Find a suitable address range with appropriate alignment */
>> +	p = find_suitable_area(bytes, hpage_size);
>> +	if (!p) {
>> +		/* Fallback strategy if finding area fails */
>> +		fprintf(stderr, "Warning: Could not find suitable address, letting kernel choose\n");
>> +	}
>> +
>> +	/* If this is dst area, add offset to prevent overlap with src area */
>> +	if (!is_src && p) {
>> +		/* Use same spacing as original code to maintain compatibility */
>>   		/* src map + alias + interleaved hpages */
>> -		p += 2 * (bytes + hpage_size);
>> -	p_alias = p;
>> -	p_alias += bytes;
>> -	p_alias += hpage_size;  /* Prevent src/dst VMA merge */
>> +		p = (char *)p + 2 * (bytes + hpage_size);
>> +	}
>>   
>> -	*alloc_area = mmap(p, bytes, PROT_READ | PROT_WRITE, MAP_SHARED,
>> -			   mem_fd, offset);
>> +	/* Select flags based on whether we have a preferred address */
>> +	map_flags = MAP_SHARED;
>> +	if (p) {
>> +#ifdef MAP_FIXED_NOREPLACE
>> +		map_flags |= MAP_FIXED_NOREPLACE;
>> +#else
>> +		/* Fallback to regular MAP_FIXED if necessary */
>> +		map_flags |= MAP_FIXED;
>> +#endif
>> +	}
>> +
>> +	/* Try to map at the suggested address, falling back if needed */
>> +	*alloc_area = mmap(p, bytes, PROT_READ | PROT_WRITE, map_flags, mem_fd, offset);
>> +	
>>   	if (*alloc_area == MAP_FAILED) {
>> -		*alloc_area = NULL;
>> -		return -errno;
>> +		/* If fixed mapping failed, try again without it */
>> +		*alloc_area = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
>> +				  MAP_SHARED, mem_fd, offset);
>> +		if (*alloc_area == MAP_FAILED) {
>> +			*alloc_area = NULL;
>> +			close(mem_fd);
>> +			return -errno;
>> +		}
>>   	}
>> -	if (*alloc_area != p)
>> -		err("mmap of memfd failed at %p", p);
>>   
>> -	area_alias = mmap(p_alias, bytes, PROT_READ | PROT_WRITE, MAP_SHARED,
>> -			  mem_fd, offset);
>> +	/* Calculate a good spot for the alias mapping with space to prevent merging */
>> +	p_alias = (char *)((uintptr_t)*alloc_area + bytes + hpage_size);
>> +
>> +	/* Map the alias area */
>> +	map_flags = MAP_SHARED;
>> +#ifdef MAP_FIXED_NOREPLACE
>> +	map_flags |= MAP_FIXED_NOREPLACE;
>> +#else
>> +	map_flags |= MAP_FIXED;
>> +#endif
>> +
>> +	area_alias = mmap(p_alias, bytes, PROT_READ | PROT_WRITE,
>> +			  map_flags, mem_fd, offset);
>> +
>>   	if (area_alias == MAP_FAILED) {
>> -		munmap(*alloc_area, bytes);
>> -		*alloc_area = NULL;
>> -		return -errno;
>> +		/* If fixed mapping failed, try anywhere */
>> +		area_alias = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
>> +				  MAP_SHARED, mem_fd, offset);
>> +		if (area_alias == MAP_FAILED) {
>> +			munmap(*alloc_area, bytes);
>> +			*alloc_area = NULL;
>> +			close(mem_fd);
>> +			return -errno;
>> +		}
>>   	}
>> -	if (area_alias != p_alias)
>> -		err("mmap of anonymous memory failed at %p", p_alias);
>>   
>> +	/* Store the alias mapping for later use */
>>   	if (is_src)
>>   		area_src_alias = area_alias;
>>   	else
>> -- 
>> 2.43.0
>>
>
View attachment "v2-0001-mm-selftest-Fix-race-condition-in-userfaultfd-dyn.patch" of type "text/x-patch" (8952 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ