Message-ID: <dfbaa342-632d-4911-a0c5-f1ffe32f9e57@redhat.com>
Date: Sat, 16 Aug 2025 08:40:37 +0200
From: David Hildenbrand <david@...hat.com>
To: Vernon Yang <vernon2gm@...il.com>
Cc: akpm@...ux-foundation.org, lorenzo.stoakes@...cle.com, ziy@...dia.com,
 baolin.wang@...ux.alibaba.com, Liam.Howlett@...cle.com, npache@...hat.com,
 ryan.roberts@....com, dev.jain@....com, baohua@...nel.org,
 glider@...gle.com, elver@...gle.com, dvyukov@...gle.com, vbabka@...e.cz,
 rppt@...nel.org, surenb@...gle.com, mhocko@...e.com, muchun.song@...ux.dev,
 osalvador@...e.de, shuah@...nel.org, richardcochran@...il.com,
 linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH 6/7] mm: memory: add mTHP support for wp

On 15.08.25 17:20, Vernon Yang wrote:
> On Thu, Aug 14, 2025 at 01:58:34PM +0200, David Hildenbrand wrote:
>> On 14.08.25 13:38, Vernon Yang wrote:
>>> Currently pagefaults on anonymous pages support mthp, and hardware
>>> features (such as arm64 contpte) can be used to store multiple ptes in
>>> one TLB entry, reducing the probability of TLB misses. However, when the
>>> process is forked and the cow is triggered again, the above optimization
>>> effect is lost, and only 4KB is allocated at a time.
>>>
>>> Therefore, make pagefault write-protect copy support mthp to maintain the
>>> optimization effect of TLB and improve the efficiency of cow pagefault.
>>>
>>> vm-scalability usemem shows a great improvement,
>>> test using: usemem -n 32 --prealloc --prefault 249062617
>>> (result unit is KB/s, bigger is better)
>>>
>>> |    size     | w/o patch | w/ patch  |  delta  |
>>> |-------------|-----------|-----------|---------|
>>> | baseline 4K | 723041.63 | 717643.21 | -0.75%  |
>>> | mthp 16K    | 732871.14 | 799513.18 | +9.09%  |
>>> | mthp 32K    | 746060.91 | 836261.83 | +12.09% |
>>> | mthp 64K    | 747333.18 | 855570.43 | +14.48% |
>>
>> You're missing two of the most important metrics: COW latency and memory
>> waste.
> 
> OK, I will add the above two tests later.
> 
>>
>> Just imagine what happens if you have PMD-sized THP.
>>
>> I would suggest you explore why Redis used to recommend to disable THPs
>> (hint: tail latency due to COW of way-too-large chunks before we do what we
>> do today).
> 
> Thanks for the suggestion, I'm not very familiar with Redis indeed. Currently,
> this series supports small granularity sizes, such as 16KB, and I will also
> test redis-benchmark later to see the severity of tail latency.
> 
>>
>> So staring at usemem micro-benchmark results is a bit misleading.
>>
>> As discussed in the past, I would actually suggest to
>>
>> a) Let khugepaged deal with fixing this up later, keeping CoW path
>>     simpler and faster.
>> b) If we really really have to do this during fault time, limit it to
>>     some order (might even have to be configurable).
> 
> That is a good approach; a knob similar to shmem_enabled could be added later if needed.
> 
>>
>> I really think we should keep CoW latency low and instead let khugepaged fix
>> that up later. (Nico is working on mTHP collapse support)
>>
>> [are you handling having a mixture of PageAnonExclusive within a folio
>> properly? Only staring at R/O PTEs is usually insufficient to determine
>> whether you can COW or whether you must reuse].
> 
> There is no extra processing on PageAnonExclusive here, only judging by R/O PTEs,
> thank you for pointing it out, and I will look into how to properly handle
> this situation later.

Yes, but as I said: I much prefer to let khugepaged handle that. I am 
not convinced the complexity here is warranted.

Nico's patches should soon be in shape to collapse mTHP (see the list).

-- 
Cheers

David / dhildenb
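
To make the copy-cost and memory-waste concern raised in this thread concrete, here is a minimal back-of-the-envelope sketch in Python. It assumes a 4 KiB base page size and a worst case in which a write touching a single base page causes the whole mTHP-sized range to be copied. The function name and this worst-case model are illustrative assumptions and are not taken from the patch series.

```python
BASE_PAGE_KIB = 4  # assumed base page size (4 KiB)

def cow_worst_case(mthp_kib, base_kib=BASE_PAGE_KIB):
    """Return (pages_copied, wasted_kib) for one write fault.

    Hypothetical worst case: only one base page is actually written,
    but the whole mTHP-sized range is copied at fault time.
    """
    pages_copied = mthp_kib // base_kib
    wasted_kib = mthp_kib - base_kib  # copied, but not needed by this write
    return pages_copied, wasted_kib

# 2048 KiB corresponds to a PMD-sized THP with 4 KiB base pages.
for size_kib in (4, 16, 32, 64, 2048):
    pages, wasted = cow_worst_case(size_kib)
    print(f"{size_kib:>5} KiB: copies {pages:>3} pages, "
          f"up to {wasted} KiB copied needlessly")
```

Under these assumptions the work done per fault grows linearly with the mTHP size, which is why throughput numbers alone do not capture the tail-latency and memory-waste side of the trade-off.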

