Message-ID: <05edee1e-04f1-4f19-816f-db03c182a201@redhat.com>
Date: Wed, 8 Jan 2025 14:36:57 +0100
From: David Hildenbrand <david@...hat.com>
To: Thomas Weißschuh <thomas.weissschuh@...utronix.de>,
Dev Jain <dev.jain@....com>
Cc: Andrew Morton <akpm@...ux-foundation.org>, Shuah Khan <shuah@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>, linux-mm@...ck.org,
linux-kselftest@...r.kernel.org, linux-kernel@...r.kernel.org,
stable@...r.kernel.org, Ryan Roberts <ryan.roberts@....com>
Subject: Re: [PATCH 1/3] selftests/mm: virtual_address_range: Fix error when
CommitLimit < 1GiB
On 08.01.25 09:05, Thomas Weißschuh wrote:
> On Wed, Jan 08, 2025 at 11:46:19AM +0530, Dev Jain wrote:
>>
>> On 07/01/25 8:44 pm, Thomas Weißschuh wrote:
>>> If not enough physical memory is available the kernel may fail mmap();
>>> see __vm_enough_memory() and vm_commit_limit().
>>> In that case the logic in validate_complete_va_space() does not make
>>> sense and will even incorrectly fail.
>>> Instead skip the test if no mmap() succeeded.
>>>
>>> Fixes: 010409649885 ("selftests/mm: confirm VA exhaustion without reliance on correctness of mmap()")
>>> Cc: stable@...r.kernel.org
CC stable on tests is ... odd.
>>> Signed-off-by: Thomas Weißschuh <thomas.weissschuh@...utronix.de>
>>>
>>> ---
>>> The logic in __vm_enough_memory() seems weird.
>>> It describes itself as "Check that a process has enough memory to
>>> allocate a new virtual mapping", however it never checks the current
>>> memory usage of the process.
>>> So it only disallows large mappings. But many small mappings taking the
>>> same amount of memory are allowed; and then even automatically merged
>>> into one big mapping.
>>> ---
>>> tools/testing/selftests/mm/virtual_address_range.c | 6 ++++++
>>> 1 file changed, 6 insertions(+)
>>>
>>> diff --git a/tools/testing/selftests/mm/virtual_address_range.c b/tools/testing/selftests/mm/virtual_address_range.c
>>> index 2a2b69e91950a37999f606847c9c8328d79890c2..d7bf8094d8bcd4bc96e2db4dc3fcb41968def859 100644
>>> --- a/tools/testing/selftests/mm/virtual_address_range.c
>>> +++ b/tools/testing/selftests/mm/virtual_address_range.c
>>> @@ -178,6 +178,12 @@ int main(int argc, char *argv[])
>>> validate_addr(ptr[i], 0);
>>> }
>>> lchunks = i;
>>> +
>>> + if (!lchunks) {
>>> + ksft_test_result_skip("Not enough memory for a single chunk\n");
>>> + ksft_finished();
>>> + }
>>> +
>>> hptr = (char **) calloc(NR_CHUNKS_HIGH, sizeof(char *));
>>> if (hptr == NULL) {
>>> ksft_test_result_skip("Memory constraint not fulfilled\n");
>>>
>>
>> I do not know about __vm_enough_memory(), but I am going by your description:
>> You say that the kernel may fail mmap() when enough physical memory is not
>> there, but it may happen that we have already done 100 mmap()'s, and then
>> the kernel fails mmap(), so if (!lchunks) won't be able to handle this case.
>> Basically, lchunks == 0 is not a complete indicator of kernel failing mmap().
>
> __vm_enough_memory() only checks the size of each single mmap() on its
> own. It does not actually check the current memory or address space
> usage of the process.
> This seems a bit weird, as indicated in my after-the-fold explanation.
>
>> The basic assumption of the test is that any process should be able to exhaust
>> its virtual address space, and running the test under memory pressure and the
>> kernel violating this behaviour defeats the point of the test I think?
>
> The assumption is correct, as soon as one mapping succeeds the others
> will also succeed, until the actual address space is exhausted.
>
> Looking at it again, __vm_enough_memory() is only called for writable
> mappings, so it would be possible to use only readable mappings in the
> test. The test will still fail with OOM, as the many PTEs need more than
> 1GiB of physical memory anyways, but at least that produces a usable
> error message.
> However I'm not sure if this would violate other test assumptions.
>
Note that with MAP_NORESERVE, most setups we care about will allow
mapping as much as you want, but OOM will fire on access.
So one could require that /proc/sys/vm/overcommit_memory is set up
properly and use MAP_NORESERVE.
Reading from anonymous memory will populate the shared zeropage. To
mitigate OOM from "too many page tables", one could simply unmap the
pieces as they are verified (or MAP_FIXED over them, to free page tables).
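
A rough sketch of that idea, not a proposed patch: the helper names
read_overcommit_mode() and map_verify_unmap() are made up for illustration
and are not part of virtual_address_range.c. It assumes Linux, and that the
strict overcommit mode (2) is the one in which MAP_NORESERVE has no effect.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: return the value of vm.overcommit_memory
 * (0 = heuristic, 1 = always, 2 = never), or -1 if it cannot be read.
 */
static int read_overcommit_mode(void)
{
	FILE *f = fopen("/proc/sys/vm/overcommit_memory", "r");
	int mode = -1;

	if (!f)
		return -1;
	if (fscanf(f, "%d", &mode) != 1)
		mode = -1;
	fclose(f);
	return mode;
}

/*
 * Hypothetical helper: map one chunk with MAP_NORESERVE, read one byte
 * per page (anonymous memory reads back as zero), then unmap the chunk
 * again so that its page tables do not accumulate.
 *
 * Returns 0 on success, -1 if mmap() failed, 1 if a page was not zero.
 */
static int map_verify_unmap(size_t size)
{
	size_t page = (size_t)sysconf(_SC_PAGESIZE);
	size_t off;
	int ret = 0;
	char *p;

	p = mmap(NULL, size, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
	if (p == MAP_FAILED)
		return -1;

	/* Read-only access: no write, so no private page is allocated. */
	for (off = 0; off < size; off += page) {
		if (((volatile char *)p)[off] != 0) {
			ret = 1;
			break;
		}
	}

	/* Drop the chunk as soon as it has been verified. */
	munmap(p, size);
	return ret;
}
```

A test could skip when read_overcommit_mode() returns 2 and otherwise call
map_verify_unmap() per chunk. Whether unmapping early is compatible with
what validate_complete_va_space() expects to find is a separate question.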
--
Cheers,
David / dhildenb