linux-kernel - Re: [PATCH v2 1/1] mm: skip mlocked THPs that are underused early in deferred_split

Message-ID: <2c3d74d7-2e18-4bcd-bbd2-b1b4a65862c3@linux.dev>
Date: Mon, 8 Sep 2025 21:00:39 +0800
From: Lance Yang <lance.yang@...ux.dev>
To: Kiryl Shutsemau <kirill@...temov.name>,
 David Hildenbrand <david@...hat.com>
Cc: akpm@...ux-foundation.org, Liam.Howlett@...cle.com, baohua@...nel.org,
 baolin.wang@...ux.alibaba.com, dev.jain@....com,
 linux-kernel@...r.kernel.org, linux-mm@...ck.org,
 lorenzo.stoakes@...cle.com, npache@...hat.com, ryan.roberts@....com,
 usamaarif642@...il.com, ziy@...dia.com
Subject: Re: [PATCH v2 1/1] mm: skip mlocked THPs that are underused early in
 deferred_split_scan()



On 2025/9/8 20:45, Kiryl Shutsemau wrote:
> On Mon, Sep 08, 2025 at 02:04:05PM +0200, David Hildenbrand wrote:
>> On 08.09.25 13:44, Kiryl Shutsemau wrote:
>>> On Mon, Sep 08, 2025 at 01:32:05PM +0200, David Hildenbrand wrote:
>>>> On 08.09.25 12:38, Kiryl Shutsemau wrote:
>>>>> On Mon, Sep 08, 2025 at 05:07:41PM +0800, Lance Yang wrote:
>>>>>> From: Lance Yang <lance.yang@...ux.dev>
>>>>>>
>>>>>> When we stumble over a fully-mapped mlocked THP in the deferred shrinker,
>>>>>> it does not make sense to try to detect whether it is underused, because
>>>>>> try_to_map_unused_to_zeropage(), called while splitting the folio, will not
>>>>>> actually replace any zeroed pages by the shared zeropage.
>>>>>
>>>>> It makes me think, does KSM follow the same logic as
>>>>> try_to_map_unused_to_zeropage()?
>>>>>
>>>>> I cannot immediately find what prevents KSM from replacing a zeroed
>>>>> mlocked folio with ZERO_PAGE().
>>>>>
>>>>> Hm?
>>>>
>>>> I assume if you're using mlock and at the same time enable KSM for a
>>>> process/VMA, you're doing something wrong.
>>>>
>>>> In contrast, THP is supposed to be transparent (yeah, I know ...).
>>>
>>> Yeah, I guess it is user error.
>>>
>>> Maybe we should make ksm_compatible() return false for VM_LOCKED?
>>> KSM breaks the mlock() contract.
>>
>> I was thinking the same and falsely remembered that we would already be
>> checking for that.
>>
>>>
>>> But it can be risky if someone already relies on this broken behaviour.
>>
>> Could be.
>>
>> Staring at QEMU, we have the following parameters:
>>
>> 	mem-merge=on|off
>>
>> 	Enables or disables memory merge support. This feature, when
>> 	supported by the host, de-duplicates identical memory pages
>> 	among VMs instances (enabled by default).
>>
>> And
>>
>> 	-overcommit mem-lock=on|off|on-fault
>>
>> 	"Run qemu with hints about host resource overcommit. The default
>> 	 is to assume that host overcommits all resources."
>>
>>
>> Now, I would assume that anybody who sets "-overcommit mem-lock=on" either
>>
>> (a) Has KSM disabled on that machine.
>>
>> (b) Sets mem-merge=off
>>
>> as well. But QEMU would allow for configuring it.


That's really interesting ;)

I guess it's likely that people are already relying on that behavior,
even if it's flawed.

> 
> ksm_madvise(MADV_MERGEABLE) succeeds on !vma_ksm_compatible(), so it
> wouldn't be functional breakage, but may result in an unexpected increase
> in memory consumption.

Hmm... it's hard to argue that nothing is broken.

An application losing the memory savings from KSM might not fit in
memory at all and could be taken down by the OOM killer ;p
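
For reference, here is a minimal userspace sketch of the combination being
discussed: mlock() an anonymous region and then ask for KSM on it with
madvise(MADV_MERGEABLE). The struct and function names are hypothetical and
only for illustration. The sketch does not assume any particular outcome, it
just records what the running kernel returns for each call.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Outcome of the probe. All names here are hypothetical. */
struct mlock_ksm_probe {
	int mapped;      /* 1 if the anonymous mapping was created */
	int mlock_ok;    /* 1 if mlock() succeeded */
	int mlock_err;   /* errno from mlock(), 0 on success */
	int madvise_ok;  /* 1 if madvise(MADV_MERGEABLE) succeeded */
	int madvise_err; /* errno from madvise(), 0 on success */
};

/* mlock() a private anonymous region, then request KSM merging on it. */
static struct mlock_ksm_probe probe_mlock_then_ksm(size_t len)
{
	struct mlock_ksm_probe r = {0};
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (p == MAP_FAILED)
		return r;
	r.mapped = 1;

	/* May fail with ENOMEM or EPERM depending on RLIMIT_MEMLOCK. */
	if (mlock(p, len) == 0)
		r.mlock_ok = 1;
	else
		r.mlock_err = errno;

#ifdef MADV_MERGEABLE
	/* Attempted regardless of the mlock() result; just record it. */
	if (madvise(p, len, MADV_MERGEABLE) == 0)
		r.madvise_ok = 1;
	else
		r.madvise_err = errno;
#else
	/* Headers without MADV_MERGEABLE: report it as unsupported. */
	r.madvise_err = ENOSYS;
#endif

	munmap(p, len);
	return r;
}
```

Calling probe_mlock_then_ksm(sysconf(_SC_PAGESIZE)) and printing the fields
shows whether a given kernel accepts both requests on the same range. EINVAL
from madvise() usually just means the kernel was built without CONFIG_KSM.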

> 
>> Interestingly, mm_populate()->populate_vma_page_range() wants to break COW.
>> [*]
>>
>> But if the app later calls fork(), we still allow for cow-sharing pages with
>> the child. (another case of "don't do it", like KSM I guess)
> 
> CoW has a bunch of these "don't do it". :P
> 
>> [*] it doesn't do it for mappings that start out R/O. I think we might end
>> up with shared zero pages in that case, but not sure if it is worth fixing.
>>
>> -- 
>> Cheers
>>
>> David / dhildenb
>>
>