lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <111d2351-3fb7-4011-af07-78b40874d956@linux.ibm.com>
Date: Thu, 7 Aug 2025 14:56:28 +0530
From: Donet Tom <donettom@...ux.ibm.com>
To: Wei Yang <richard.weiyang@...il.com>
Cc: Aboorva Devarajan <aboorvad@...ux.ibm.com>, akpm@...ux-foundation.org,
        Liam.Howlett@...cle.com, lorenzo.stoakes@...cle.com, shuah@...nel.org,
        pfalcato@...e.de, david@...hat.com, ziy@...dia.com,
        baolin.wang@...ux.alibaba.com, npache@...hat.com, ryan.roberts@....com,
        dev.jain@....com, baohua@...nel.org, linux-mm@...ck.org,
        linux-kselftest@...r.kernel.org, linux-kernel@...r.kernel.org,
        ritesh.list@...il.com
Subject: Re: [PATCH v3 3/7] selftest/mm: Fix ksm_funtional_test failures


On 8/6/25 8:24 PM, Wei Yang wrote:
> On Wed, Aug 06, 2025 at 06:30:37PM +0530, Donet Tom wrote:
> [...]
>>> Child process inherit the ksm_merging_pages from parent, which is reasonable
>>> to me. But I am confused why ksm_unmerge() would just reset ksm_merging_pages
>>> for parent and leave ksm_merging_pages in child process unchanged.
>>>
>>> ksm_unmerge() writes to /sys/kernel/mm/ksm/run, which is a system wide sysfs
>>> interface. I expect it applies to both parent and child.
>> I am not very familiar with the KSM code, but from what I understand:
>>
>> The ksm_merging_pages counter is maintained per mm_struct. When
>> we write to /sys/kernel/mm/ksm/run, unmerging is triggered, and the
>> counters are updated for all mm_structs present in the ksm_mm_slot list.
>>
>> A mm_struct gets added to this list  when MADV_MERGEABLE is called.
>> In the case of the child process, since MADV_MERGEABLE has not been
>> invoked yet, its mm_struct is not part of the list. As a result,
>> its ksm_merging_pages counter is not reset.
>>
> Would this flag be inherited during fork? VM_MERGEABLE is saved in related vma
> I don't see it would be dropped during fork. Maybe missed.
>
>>>> value remained unchanged. That’s why get_my_merging_page() in the child was
>>>> returning a non-zero value.
>>>>
>>> I guess you mean the get_my_merging_page() in __mmap_and_merge_range() return
>>> a non-zero value. But there is ksm_unmerge() before it. Why this ksm_unmerge()
>>> couldn't reset the value, but a ksm_unmerge() in parent could.
>>>
>>>> Initially, I fixed the issue by calling ksm_unmerge() before the fork(), and
>>>> that
>>>> resolved the problem. Later, I decided it would be cleaner to move the
>>>> ksm_unmerge() call to the test cleanup phase.
>>>>
>>> Also all the tests before test_prctl_fork(), except test_prctl(), calls
>>>
>>>     ksft_test_result(!range_maps_duplicates());
>>>
>>> If the previous tests succeed, it means there is no duplicate pages. This
>>> means ksm_merging_pages should be 0 before test_prctl_fork() if other tests
>>> pass. And the child process would inherit a 0 ksm_merging_pages. (A quick test
>>> proves it.)
>>
>> If I understand correctly, all the tests are calling MADV_UNMERGEABLE,
>> which internally calls break_ksm() in the kernel. This function replaces the
>> KSM page with an exclusive anonymous page. However, the
>> ksm_merging_pages counters are not updated at this point.
>>
>> The function range_maps_duplicates(map, size) checks whether the pages
>> have been unmerged. Since break_ksm() does perform the unmerge, this
>> function returns false, and the test passes.
>>
>> The ksm_merging_pages update happens later via the ksm_scan_thread().
>> That’s why we observe that ksm_merging_pages values are not reset
>> immediately after the test finishes.
>>
> Not familiar with ksm internal. But the ksm_merging_pages counter still has
> non-zero value when all merged pages are unmerged makes me feel odd.
>
>> If we add a sleep(1) after the MADV_UNMERGEABLE call, we can see that
>> the ksm_merging_pages values are reset after the sleep.
>>
>> Once the test completes successfully, we can call ksm_unmerge(), which
>> will immediately reset the ksm_merging_pages value. This way, in the fork
>> test, the child process will also see the correct value.
>>> So which part of the story I missed?
>>>
>> So, during the cleanup phase after a successful test, we can call
>> ksm_unmerge() to reset the counter. Do you see any issue with
>> this approach?
>>
> It looks there is no issue with an extra ksm_unmerge().
>
> But one more question. Why an extra ksm_unmerge() could help.
>
> Here is what we have during test:
>
>
>    test_prot_none()
>        !range_maps_duplicates()
>        ksm_unmerge()                  1) <--- newly add
>    test_prctl_fork()
>        >--- in child
>        __mmap_and_merge_range()
>            ksm_unmerge()              2) <--- already have
>
> As you mentioned above ksm_unmerge() would immediately reset
> ksm_merging_pages, why ksm_unmerge() at 2) still leave ksm_merging_pages
> non-zero? And the one at 1) could help.


 From the debugging, what I understood is:

When we perform fork(), MADV_MERGEABLE, or PR_SET_MEMORY_MERGE, the
mm_struct of the process gets added to the ksm_mm_slot list. As a
result, both the parent and child processes’ mm_struct structures
will be present in ksm_mm_slot.

When KSM merges the pages, it creates a ksm_rmap_item for each page,
and the ksm_merging_pages counter is incremented accordingly.

Since the parent process did the merge, its mm_struct is present in
ksm_mm_slot, and ksm_rmap_item entries are created for all the merged
pages.

When a process is forked, the child’s mm_struct is also added to
ksm_mm_slot, and it inherits the ksm_merging_pages count. However,
no ksm_rmap_item entries are created for the child process because it
did not do any merge.

When ksm_unmerge() is called, it iterates over all processes in
ksm_mm_slot. In our case, both the parent and child are present. It
first processes the parent, which has ksm_rmap_item entries, so it
unmerges the pages and resets the ksm_merging_pages counter.

For the child, since it did not perform any actual merging, it does not
have any ksm_rmap_item entries. Therefore, there are no pages to unmerge,
and the counter remains unchanged.

So, only processes that performed KSM merging will have their counters
updated during ksm_unmerge(). The child process, having not initiated any
merging, retains the inherited counter value without any update.

So from a testing point of view, I think it is better to reset the
counters as part of the cleanup code to ensure that the next tests do
not get incorrect values.

The question I have is: is it correct to keep the inherited 
|ksm_merging_page|
value in the child or Should we reset it to 0 during |ksm_fork()|?

>
> Or there is still some timing issue like sleep(1) you did?
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ