Message-ID: <2c3d74d7-2e18-4bcd-bbd2-b1b4a65862c3@linux.dev>
Date: Mon, 8 Sep 2025 21:00:39 +0800
From: Lance Yang <lance.yang@...ux.dev>
To: Kiryl Shutsemau <kirill@...temov.name>,
 David Hildenbrand <david@...hat.com>
Cc: akpm@...ux-foundation.org, Liam.Howlett@...cle.com, baohua@...nel.org,
 baolin.wang@...ux.alibaba.com, dev.jain@....com,
 linux-kernel@...r.kernel.org, linux-mm@...ck.org,
 lorenzo.stoakes@...cle.com, npache@...hat.com, ryan.roberts@....com,
 usamaarif642@...il.com, ziy@...dia.com
Subject: Re: [PATCH v2 1/1] mm: skip mlocked THPs that are underused early in
 deferred_split_scan()



On 2025/9/8 20:45, Kiryl Shutsemau wrote:
> On Mon, Sep 08, 2025 at 02:04:05PM +0200, David Hildenbrand wrote:
>> On 08.09.25 13:44, Kiryl Shutsemau wrote:
>>> On Mon, Sep 08, 2025 at 01:32:05PM +0200, David Hildenbrand wrote:
>>>> On 08.09.25 12:38, Kiryl Shutsemau wrote:
>>>>> On Mon, Sep 08, 2025 at 05:07:41PM +0800, Lance Yang wrote:
>>>>>> From: Lance Yang <lance.yang@...ux.dev>
>>>>>>
>>>>>> When we stumble over a fully-mapped mlocked THP in the deferred shrinker,
>>>>>> it does not make sense to try to detect whether it is underused, because
>>>>>> try_to_map_unused_to_zeropage(), called while splitting the folio, will not
>>>>>> actually replace any zeroed pages by the shared zeropage.
>>>>>
>>>>> It makes me think, does KSM follow the same logic as
>>>>> try_to_map_unused_to_zeropage()?
>>>>>
>>>>> I cannot immediately find what prevents KSM from replacing a zeroed
>>>>> mlocked folio with ZERO_PAGE().
>>>>>
>>>>> Hm?
>>>>
>>>> I assume if you're using mlock and at the same time enable KSM for a
>>>> process/VMA, you're doing something wrong.
>>>>
>>>> In contrast, THP is supposed to be transparent (yeah, I know ...).
>>>
>>> Yeah, I guess it is user error.
>>>
>>> Maybe we should make vma_ksm_compatible() return false for VM_LOCKED?
>>> KSM breaks the mlock() contract.
>>
>> I was thinking the same and falsely remembered that we would already be
>> checking for that.
>>
>>>
>>> But it can be risky if someone already relies on this broken behaviour.
>>
>> Could be.
>>
>> Staring at QEMU, we have the following parameters:
>>
>> 	mem-merge=on|off
>>
>> 	Enables or disables memory merge support. This feature, when
>> 	supported by the host, de-duplicates identical memory pages
>> 	among VMs instances (enabled by default).
>>
>> And
>>
>> 	-overcommit mem-lock=on|off|on-fault
>>
>> 	"Run qemu with hints about host resource overcommit. The default
>> 	 is to assume that host overcommits all resources."
>>
>>
>> Now, I would assume that anybody who sets "-overcommit mem-lock=on" either
>>
>> (a) Has KSM disabled on that machine.
>>
>> (b) Sets mem-merge=off
>>
>> as well. But QEMU would allow for configuring it.


That's really interesting ;)

I guess it's likely that people are already relying on that behavior,
even if it's flawed.

> 
> ksm_madvise(MADV_MERGEABLE) succeeds on !vma_ksm_compatible(), so it
> wouldn't be functional breakage, but may result in an unexpected
> increase in memory consumption.

Hmm... it's hard to argue that nothing is broken.

An application losing the memory savings from KSM might not fit in
memory at all and could be taken down by the OOM killer ;p
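
For reference, the combination being discussed looks roughly like this from
userspace. This is only a minimal sketch: lock_then_merge() and struct
lock_merge_result are made-up names for illustration, and the errno notes in
the comments come from the mlock(2) and madvise(2) man pages, not from
anything verified against this patch.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper type: 0 on success, otherwise the errno of that step. */
struct lock_merge_result {
	int mmap_err;
	int mlock_err;
	int madvise_err;
};

/*
 * Map `len` bytes of anonymous memory, fill it with zeroes, mlock() it and
 * then mark the same range MADV_MERGEABLE. The range is unmapped again
 * before returning; only the per-step error codes are reported.
 */
static struct lock_merge_result lock_then_merge(size_t len)
{
	struct lock_merge_result r = { 0, 0, 0 };
	void *p;

	assert(len > 0);

	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED) {
		r.mmap_err = errno;
		return r;
	}

	memset(p, 0, len);	/* fault the pages in; contents stay zero */

	if (mlock(p, len) != 0)
		r.mlock_err = errno;	/* e.g. ENOMEM or EPERM, see RLIMIT_MEMLOCK */

	if (madvise(p, len, MADV_MERGEABLE) != 0)
		r.madvise_err = errno;	/* EINVAL if built without CONFIG_KSM */

	munmap(p, len);
	return r;
}
```

Calling lock_then_merge(4096) and seeing madvise_err == 0 only says the
advice was accepted. As noted above, ksm_madvise() also succeeds for VMAs
that fail vma_ksm_compatible(), so the return value alone does not tell
the application whether its pages will actually be merged.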

> 
>> Interestingly, mm_populate()->populate_vma_page_range() wants to break COW.
>> [*]
>>
>> But if the app later calls fork(), we still allow for cow-sharing pages with
>> the child. (another case of "don't do it", like KSM I guess)
> 
> CoW has a bunch of these "don't do it". :P
> 
>> [*] it doesn't do it for mappings that start out R/O. I think we might end
>> up with shared zeropages in that case, but not sure if worth fixing.
>>
>> -- 
>> Cheers
>>
>> David / dhildenb
>>
> 

