Message-ID: <4585175b-303c-476e-ad3f-09838383364e@arm.com>
Date: Mon, 23 Jun 2025 10:25:04 +0530
From: Dev Jain <dev.jain@....com>
To: Donet Tom <donettom@...ux.ibm.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
 Aboorva Devarajan <aboorvad@...ux.ibm.com>, akpm@...ux-foundation.org,
 Liam.Howlett@...cle.com, shuah@...nel.org, pfalcato@...e.de,
 david@...hat.com, ziy@...dia.com, baolin.wang@...ux.alibaba.com,
 npache@...hat.com, ryan.roberts@....com, baohua@...nel.org,
 linux-mm@...ck.org, linux-kselftest@...r.kernel.org,
 linux-kernel@...r.kernel.org, ritesh.list@...il.com
Subject: Re: [PATCH 1/6] mm/selftests: Fix virtual_address_range test issues.


On 23/06/25 10:23 am, Dev Jain wrote:
>
> On 21/06/25 11:25 pm, Donet Tom wrote:
>> On Fri, Jun 20, 2025 at 08:15:25PM +0530, Dev Jain wrote:
>>> On 19/06/25 1:53 pm, Donet Tom wrote:
>>>> On Wed, Jun 18, 2025 at 08:13:54PM +0530, Dev Jain wrote:
>>>>> On 18/06/25 8:05 pm, Lorenzo Stoakes wrote:
>>>>>> On Wed, Jun 18, 2025 at 07:47:18PM +0530, Dev Jain wrote:
>>>>>>> On 18/06/25 7:37 pm, Lorenzo Stoakes wrote:
>>>>>>>> On Wed, Jun 18, 2025 at 07:28:16PM +0530, Dev Jain wrote:
>>>>>>>>> On 18/06/25 5:27 pm, Lorenzo Stoakes wrote:
>>>>>>>>>> On Wed, Jun 18, 2025 at 05:15:50PM +0530, Dev Jain wrote:
>>>>>>>>>> Are you accounting for sys.max_map_count? If not, then you'll be hitting that first.
>>>>>>>>> run_vmtests.sh will run the test in overcommit mode so that won't be an issue.
>>>>>>>> Umm, what? You mean overcommit all mode, and that has no bearing on the max mapping count check.
>>>>>>>>
>>>>>>>> In do_mmap():
>>>>>>>>
>>>>>>>>     /* Too many mappings? */
>>>>>>>>     if (mm->map_count > sysctl_max_map_count)
>>>>>>>>         return -ENOMEM;
>>>>>>>>
>>>>>>>>
>>>>>>>> As well as numerous other checks in mm/vma.c.
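As a rough standalone illustration (not from this thread or the selftest), the limit surfaces to user space as ENOMEM once enough non-mergeable private mappings exist, whatever the overcommit mode; alternating protections below keeps adjacent VMAs from merging:

#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
	long page = sysconf(_SC_PAGESIZE);
	unsigned long count = 0;

	for (;;) {
		/* Alternate protections so adjacent VMAs cannot merge. */
		int prot = (count & 1) ? PROT_READ : PROT_NONE;
		void *p = mmap(NULL, page, prot,
			       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

		if (p == MAP_FAILED) {
			/* Expected to trip near vm.max_map_count. */
			printf("mmap failed after %lu mappings: %s\n",
			       count, strerror(errno));
			return 0;
		}
		count++;
	}
}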
>>>>>>> Ah sorry, didn't look at the code properly, just assumed that
>>>>>>> overcommit_always meant overriding this.
>>>>>> No problem! It's hard to be aware of everything in mm :)
>>>>>>
>>>>>>>> I'm not sure why an overcommit toggle is even necessary when you could
>>>>>>>> use MAP_NORESERVE or simply map PROT_NONE to avoid the OVERCOMMIT_GUESS
>>>>>>>> limits?
>>>>>>>>
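For reference, a minimal sketch (an assumption for illustration, not the selftest's code) of the two alternatives mentioned above; both create large anonymous mappings without being charged against the commit limit:

#include <stdio.h>
#include <sys/mman.h>

/* Illustrative chunk size only; the selftest uses its own MAP_CHUNK_SIZE. */
#define CHUNK_SIZE	(16UL << 30)

int main(void)
{
	/* MAP_NORESERVE: no swap/commit space is reserved for the mapping. */
	void *a = mmap(NULL, CHUNK_SIZE, PROT_READ,
		       MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);

	/* PROT_NONE mappings are not charged against the commit limit. */
	void *b = mmap(NULL, CHUNK_SIZE, PROT_NONE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	printf("MAP_NORESERVE: %p, PROT_NONE: %p\n",
	       a == MAP_FAILED ? NULL : a,
	       b == MAP_FAILED ? NULL : b);
	return 0;
}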
>>>>>>>> I'm pretty confused as to what this test is really achieving honestly.
>>>>>>>> This isn't a useful way of asserting mmap() behaviour as far as I can tell.
>>>>>>> Well, seems like a useful way to me at least : ) Not sure if you are in
>>>>>>> the mood to discuss that but if you'd like me to explain from start to
>>>>>>> end what the test is doing, I can do that : )
>>>>>>>
>>>>>> I just don't have time right now, I guess I'll have to come back to it
>>>>>> later... it's not the end of the world for it to be iffy in my view as
>>>>>> long as it passes, but it might just not be of great value.
>>>>>>
>>>>>> Philosophically I'd rather we didn't assert internal implementation
>>>>>> details like where we place mappings in userland memory. At no point do
>>>>>> we promise to not leave larger gaps if we feel like it :)
>>>>> You have a fair point. Anyhow a debate for another day.
>>>>>
>>>>>> I'm guessing, reading more, the _real_ test here is some mathematical
>>>>>> assertion about layout from HIGH_ADDR_SHIFT -> end of address space when
>>>>>> using hints.
>>>>>>
>>>>>> But again I'm not sure that achieves much, and again it also asserts
>>>>>> internal implementation details.
>>>>>>
>>>>>> Correct behaviour of this kind of thing probably belongs better in the
>>>>>> userland VMA testing, I'd say.
>>>>>>
>>>>>> Sorry, I don't mean to do down work you've done before, just giving an
>>>>>> honest technical appraisal!
>>>>> Nah, it will be rather hilarious to see it all go down the drain xD
>>>>>
>>>>>> Anyway don't let this block work to fix the test if it's failing. We can
>>>>>> revisit this later.
>>>>> Sure. @Aboorva and Donet, I still believe that the correct approach is to
>>>>> elide the gap check at the crossing boundary. What do you think?
>>>>>
>>>> One problem I am seeing with this approach is that, since the hint address
>>>> is generated randomly, the VMAs are also created at random addresses based
>>>> on the hint. So, for the VMAs created at high addresses, we cannot guarantee
>>>> that the gaps between them will be aligned to MAP_CHUNK_SIZE.
>>>>
>>>> High address VMAs
>>>> -----------------
>>>> 1000000000000-1000040000000 r--p 00000000 00:00 0
>>>> 2000000000000-2000040000000 r--p 00000000 00:00 0
>>>> 4000000000000-4000040000000 r--p 00000000 00:00 0
>>>> 8000000000000-8000040000000 r--p 00000000 00:00 0
>>>> e80009d260000-fffff9d260000 r--p 00000000 00:00 0
>>>>
>>>> I have a different approach to solve this issue.
>>> It is really weird that such a large amount of VA space
>>> is left between the two VMAs yet mmap is failing.
>>>
>>>
>>>
>>> Can you please do the following:
>>> set /proc/sys/vm/max_map_count to the highest value possible.
>>> If running without run_vmtests.sh, set /proc/sys/vm/overcommit_memory to 1.
>>> In validate_complete_va_space:
>>>
>>> if (start_addr >= HIGH_ADDR_MARK && found == false) {
>>>     found = true;
>>>     continue;
>>> }
>>
>> Thanks Dev for the suggestion. I set max_map_count and set overcommit_memory
>> to 1, added this code change as well, and then tried. Still, the test is failing.
>>
>>> where found is initialized to false. This will skip the check
>>> for the boundary.
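A self-contained approximation of that suggestion (assuming validate_complete_va_space() walks /proc/self/maps in address order; this is a sketch, not the selftest itself):

#include <stdbool.h>
#include <stdio.h>

#define HIGH_ADDR_MARK	(1UL << 47)	/* 128TB; adjust per arch */

int main(void)
{
	FILE *maps = fopen("/proc/self/maps", "r");
	unsigned long start, end, prev_end = 0;
	bool found = false;
	char line[1024];

	if (!maps)
		return 1;

	while (fgets(line, sizeof(line), maps)) {
		if (sscanf(line, "%lx-%lx", &start, &end) != 2)
			continue;

		/* Elide the gap check once, at the low->high crossing. */
		if (start >= HIGH_ADDR_MARK && !found) {
			found = true;
			prev_end = end;
			continue;
		}

		if (prev_end && start > prev_end)
			printf("gap: %lx-%lx (%lu bytes)\n",
			       prev_end, start, start - prev_end);
		prev_end = end;
	}
	fclose(maps);
	return 0;
}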
>>>
>>> After this can you tell whether the test is still failing.
>>>
>>> Also, can you give me the complete output of /proc/pid/maps
>>> after putting a sleep at the end of the test.
>>>
>>
>> On powerpc, DEFAULT_MAP_WINDOW is 128TB and the total address space
>> size is 4PB. With a hint it can map up to 4PB. Since the hint address
>> is random in this test, random high VMAs are getting created. IIUC
>> this is expected.
>>
>>
>> 10000000-10010000 r-xp 00000000 fd:05 134226638 /home/donet/linux/tools/testing/selftests/mm/virtual_address_range
>> 10010000-10020000 r--p 00000000 fd:05 134226638 /home/donet/linux/tools/testing/selftests/mm/virtual_address_range
>> 10020000-10030000 rw-p 00010000 fd:05 134226638 /home/donet/linux/tools/testing/selftests/mm/virtual_address_range
>> 30000000-10030000000 r--p 00000000 00:00 0 [anon:virtual_address_range]
>> 10030770000-100307a0000 rw-p 00000000 00:00 0 [heap]
>> 1004f000000-7fff8f000000 r--p 00000000 00:00 0 [anon:virtual_address_range]
>> 7fff8faf0000-7fff8fe00000 rw-p 00000000 00:00 0
>> 7fff8fe00000-7fff90030000 r-xp 00000000 fd:00 792355 /usr/lib64/libc.so.6
>> 7fff90030000-7fff90040000 r--p 00230000 fd:00 792355 /usr/lib64/libc.so.6
>> 7fff90040000-7fff90050000 rw-p 00240000 fd:00 792355 /usr/lib64/libc.so.6
>> 7fff90050000-7fff90130000 r-xp 00000000 fd:00 792358 /usr/lib64/libm.so.6
>> 7fff90130000-7fff90140000 r--p 000d0000 fd:00 792358 /usr/lib64/libm.so.6
>> 7fff90140000-7fff90150000 rw-p 000e0000 fd:00 792358 /usr/lib64/libm.so.6
>> 7fff90160000-7fff901a0000 r--p 00000000 00:00 0 [vvar]
>> 7fff901a0000-7fff901b0000 r-xp 00000000 00:00 0 [vdso]
>> 7fff901b0000-7fff90200000 r-xp 00000000 fd:00 792351 /usr/lib64/ld64.so.2
>> 7fff90200000-7fff90210000 r--p 00040000 fd:00 792351 /usr/lib64/ld64.so.2
>> 7fff90210000-7fff90220000 rw-p 00050000 fd:00 792351 /usr/lib64/ld64.so.2
>> 7fffc9770000-7fffc9880000 rw-p 00000000 00:00 0 [stack]
>> 1000000000000-1000040000000 r--p 00000000 00:00 0 [anon:virtual_address_range]
>> 2000000000000-2000040000000 r--p 00000000 00:00 0 [anon:virtual_address_range]
>> 4000000000000-4000040000000 r--p 00000000 00:00 0 [anon:virtual_address_range]
>> 8000000000000-8000040000000 r--p 00000000 00:00 0 [anon:virtual_address_range]
>> eb95410220000-fffff90220000 r--p 00000000 00:00 0 [anon:virtual_address_range]
>>
>>
>>
>>
>> If I give the hint addresses serially from 128TB, then the address
>> space is contiguous, the gap is also MAP_SIZE, and the test passes.
>>
>> 10000000-10010000 r-xp 00000000 fd:05 134226638 /home/donet/linux/tools/testing/selftests/mm/virtual_address_range
>> 10010000-10020000 r--p 00000000 fd:05 134226638 /home/donet/linux/tools/testing/selftests/mm/virtual_address_range
>> 10020000-10030000 rw-p 00010000 fd:05 134226638 /home/donet/linux/tools/testing/selftests/mm/virtual_address_range
>> 33000000-10033000000 r--p 00000000 00:00 0 [anon:virtual_address_range]
>> 10033380000-100333b0000 rw-p 00000000 00:00 0 [heap]
>> 1006f0f0000-10071000000 rw-p 00000000 00:00 0
>> 10071000000-7fffb1000000 r--p 00000000 00:00 0 [anon:virtual_address_range]
>> 7fffb15d0000-7fffb1800000 r-xp 00000000 fd:00 792355 /usr/lib64/libc.so.6
>> 7fffb1800000-7fffb1810000 r--p 00230000 fd:00 792355 /usr/lib64/libc.so.6
>> 7fffb1810000-7fffb1820000 rw-p 00240000 fd:00 792355 /usr/lib64/libc.so.6
>> 7fffb1820000-7fffb1900000 r-xp 00000000 fd:00 792358 /usr/lib64/libm.so.6
>> 7fffb1900000-7fffb1910000 r--p 000d0000 fd:00 792358 /usr/lib64/libm.so.6
>> 7fffb1910000-7fffb1920000 rw-p 000e0000 fd:00 792358 /usr/lib64/libm.so.6
>> 7fffb1930000-7fffb1970000 r--p 00000000 00:00 0 [vvar]
>> 7fffb1970000-7fffb1980000 r-xp 00000000 00:00 0 [vdso]
>> 7fffb1980000-7fffb19d0000 r-xp 00000000 fd:00 792351 /usr/lib64/ld64.so.2
>> 7fffb19d0000-7fffb19e0000 r--p 00040000 fd:00 792351 /usr/lib64/ld64.so.2
>> 7fffb19e0000-7fffb19f0000 rw-p 00050000 fd:00 792351 /usr/lib64/ld64.so.2
>> 7fffc5470000-7fffc5580000 rw-p 00000000 00:00 0 [stack]
>> 800000000000-2aab000000000 r--p 00000000 00:00 0 [anon:virtual_address_range]
>>
>>
>
> Thank you for this output. I can't wrap my head around why this behaviour
> changes when you generate the hint sequentially. The mmap() syscall is
> supposed to do the following (irrespective of high VA space or not) - if
> the allocation at the hint addr succeeds, then all is well; otherwise, do
> a top-down search for a large enough gap. I am not aware of the nuances in
> powerpc but I really am suspecting a bug in the powerpc mmap code. Can you
> try to do some tracing - which function eventually fails to find the empty
> gap?
>
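A minimal standalone example (not from the selftest) of the behaviour described above; comparing the returned address with the hint shows whether the hint was honoured or the top-down fallback was used:

#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
	/* Example values only; the selftest uses MAP_CHUNK_SIZE and its own hints. */
	void *hint = (void *)(1UL << 48);
	size_t len = 1UL << 30;

	void *p = mmap(hint, len, PROT_READ,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (p == MAP_FAILED)
		perror("mmap");
	else if (p == hint)
		printf("mapping placed at the hint %p\n", p);
	else
		printf("hint not honoured, fallback placed it at %p\n", p);
	return 0;
}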
> Through my limited code tracing - we should end up in slice_find_area_topdown,
> then we ask the generic code to find the gap using vm_unmapped_area. So I
> suspect something is happening in between, probably in slice_scan_available().


Also, is the system you are testing on using a radix or hash MMU?


>
>>
>>>> From 0 to 128TB, we map memory directly without using any hint. For the
>>>> range above 256TB up to 512TB, we perform the mapping using hint addresses.
>>>> In the current test, we use random hint addresses, but I have modified it
>>>> to generate hint addresses linearly starting from 128TB.
>>>>
>>>> With this change:
>>>>
>>>> The 0–128TB range is mapped without hints and verified accordingly.
>>>>
>>>> The 128TB–512TB range is mapped using linear hint addresses and then verified.
>>>>
>>>> Below are the VMAs obtained with this approach:
>>>>
>>>> 10000000-10010000 r-xp 00000000 fd:05 135019531
>>>> 10010000-10020000 r--p 00000000 fd:05 135019531
>>>> 10020000-10030000 rw-p 00010000 fd:05 135019531
>>>> 20000000-10020000000 r--p 00000000 00:00 0
>>>> 10020800000-10020830000 rw-p 00000000 00:00 0
>>>> 1004bcf0000-1004c000000 rw-p 00000000 00:00 0
>>>> 1004c000000-7fff8c000000 r--p 00000000 00:00 0
>>>> 7fff8c130000-7fff8c360000 r-xp 00000000 fd:00 792355
>>>> 7fff8c360000-7fff8c370000 r--p 00230000 fd:00 792355
>>>> 7fff8c370000-7fff8c380000 rw-p 00240000 fd:00 792355
>>>> 7fff8c380000-7fff8c460000 r-xp 00000000 fd:00 792358
>>>> 7fff8c460000-7fff8c470000 r--p 000d0000 fd:00 792358
>>>> 7fff8c470000-7fff8c480000 rw-p 000e0000 fd:00 792358
>>>> 7fff8c490000-7fff8c4d0000 r--p 00000000 00:00 0
>>>> 7fff8c4d0000-7fff8c4e0000 r-xp 00000000 00:00 0
>>>> 7fff8c4e0000-7fff8c530000 r-xp 00000000 fd:00 792351
>>>> 7fff8c530000-7fff8c540000 r--p 00040000 fd:00 792351
>>>> 7fff8c540000-7fff8c550000 rw-p 00050000 fd:00 792351
>>>> 7fff8d000000-7fffcd000000 r--p 00000000 00:00 0
>>>> 7fffe9c80000-7fffe9d90000 rw-p 00000000 00:00 0
>>>> 800000000000-2000000000000 r--p 00000000 00:00 0    -> High Address (128TB to 512TB)
>>>>
>>>> diff --git a/tools/testing/selftests/mm/virtual_address_range.c b/tools/testing/selftests/mm/virtual_address_range.c
>>>> index 4c4c35eac15e..0be008cba4b0 100644
>>>> --- a/tools/testing/selftests/mm/virtual_address_range.c
>>>> +++ b/tools/testing/selftests/mm/virtual_address_range.c
>>>> @@ -56,21 +56,21 @@
>>>>    #ifdef __aarch64__
>>>>    #define HIGH_ADDR_MARK  ADDR_MARK_256TB
>>>> -#define HIGH_ADDR_SHIFT 49
>>>> +#define HIGH_ADDR_SHIFT 48
>>>>    #define NR_CHUNKS_LOW   NR_CHUNKS_256TB
>>>>    #define NR_CHUNKS_HIGH  NR_CHUNKS_3840TB
>>>>    #else
>>>>    #define HIGH_ADDR_MARK  ADDR_MARK_128TB
>>>> -#define HIGH_ADDR_SHIFT 48
>>>> +#define HIGH_ADDR_SHIFT 47
>>>>    #define NR_CHUNKS_LOW   NR_CHUNKS_128TB
>>>>    #define NR_CHUNKS_HIGH  NR_CHUNKS_384TB
>>>>    #endif
>>>> -static char *hint_addr(void)
>>>> +static char *hint_addr(int hint)
>>>>    {
>>>> -       int bits = HIGH_ADDR_SHIFT + rand() % (63 - HIGH_ADDR_SHIFT);
>>>> +       unsigned long addr = ((1UL << HIGH_ADDR_SHIFT) + (hint * MAP_CHUNK_SIZE));
>>>> -       return (char *) (1UL << bits);
>>>> +       return (char *) (addr);
>>>>    }
>>>>    static void validate_addr(char *ptr, int high_addr)
>>>> @@ -217,7 +217,7 @@ int main(int argc, char *argv[])
>>>>           }
>>>>           for (i = 0; i < NR_CHUNKS_HIGH; i++) {
>>>> -               hint = hint_addr();
>>>> +               hint = hint_addr(i);
>>>>                   hptr[i] = mmap(hint, MAP_CHUNK_SIZE, PROT_READ,
>>>>                                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
>>>>
>>>>
>>>>
>>>> Can we fix it this way?
>
