Message-ID: <618798d5-71b2-43d6-8f5c-78d911c5dd43@redhat.com>
Date: Wed, 8 Jan 2025 17:46:37 +0100
From: David Hildenbrand <david@...hat.com>
To: Thomas Weißschuh <thomas.weissschuh@...utronix.de>
Cc: Dev Jain <dev.jain@....com>, Andrew Morton <akpm@...ux-foundation.org>,
Shuah Khan <shuah@...nel.org>, Thomas Gleixner <tglx@...utronix.de>,
linux-mm@...ck.org, linux-kselftest@...r.kernel.org,
linux-kernel@...r.kernel.org, stable@...r.kernel.org,
Ryan Roberts <ryan.roberts@....com>
Subject: Re: [PATCH 1/3] selftests/mm: virtual_address_range: Fix error when
CommitLimit < 1GiB
On 08.01.25 17:13, Thomas Weißschuh wrote:
> On Wed, Jan 08, 2025 at 02:36:57PM +0100, David Hildenbrand wrote:
>> On 08.01.25 09:05, Thomas Weißschuh wrote:
>>> On Wed, Jan 08, 2025 at 11:46:19AM +0530, Dev Jain wrote:
>>>>
>>>> On 07/01/25 8:44 pm, Thomas Weißschuh wrote:
>>>>> If not enough physical memory is available the kernel may fail mmap();
>>>>> see __vm_enough_memory() and vm_commit_limit().
>>>>> In that case the logic in validate_complete_va_space() does not make
>>>>> sense and will even incorrectly fail.
>>>>> Instead skip the test if no mmap() succeeded.
>>>>>
>>>>> Fixes: 010409649885 ("selftests/mm: confirm VA exhaustion without reliance on correctness of mmap()")
>>>>> Cc: stable@...r.kernel.org
>>
>> CC stable on tests is ... odd.
>
> I thought it was fairly common, but it isn't.
> Will drop it.
As it's not really a "kernel BUG", it's rather uncommon.
>>
>> Note that with MAP_NORESERVE, most setups we care about will allow mapping as
>> much as you want, but on access OOM will fire.
>
> Thanks for the hint.
>
>> So one could require that /proc/sys/vm/overcommit_memory is set up properly
>> and use MAP_NORESERVE.
>
> Isn't the check for lchunks == 0 essentially exactly this?
I assume paired with MAP_NORESERVE?
Maybe, but it could be better to have something that says "if
overcommit_memory is not set up properly I will SKIP this test, but
otherwise I expect this to work and will FAIL if it doesn't".
Or would you expect to run into lchunks == 0 even if overcommit_memory
is set up properly and MAP_NORESERVE is used? (So little memory that
we cannot even create all the VMAs?)
>
>> Reading from anonymous memory will populate the shared zeropage. To mitigate
>> OOM from "too many page tables", one could simply unmap the pieces as they
>> are verified (or MAP_FIXED over them, to free page tables).
>
> The code has to figure out if a verified region was created by mmap(),
> otherwise an munmap() could crash the process.
> As the entries from /proc/self/maps may have been merged and (I assume)
Yes, and a partial unmap (at chunk granularity?) would split them again.
> the ordering of mappings is not guaranteed, some bespoke logic to establish
> the link will be needed.
My thinking was that you simply process one /proc/self/maps entry in
chunks. After processing a chunk, you munmap() it.
So you would process + munmap chunk by chunk.
>
> Is it fine to rely on CONFIG_ANON_VMA_NAME?
> That would make it much easier to implement.
Can you elaborate how you would do it?
>
> Using MAP_NORESERVE and eager munmap()s, the testcase works nicely even
> in very low physical memory conditions.
Cool.
--
Cheers,
David / dhildenb