lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date: Fri, 15 Mar 2024 13:18:24 +0100
From: David Hildenbrand <david@...hat.com>
To: Lance Yang <ioworker0@...il.com>
Cc: akpm@...ux-foundation.org, mhocko@...e.com, zokeefe@...gle.com,
 shy828301@...il.com, xiehuan09@...il.com, songmuchun@...edance.com,
 minchan@...nel.org, linux-mm@...ck.org, linux-kernel@...r.kernel.org,
 libang.li@...group.com
Subject: Re: [PATCH 1/1] mm/khugepaged: reduce process visible downtime by
 pre-zeroing hugepage

On 14.03.24 15:19, Lance Yang wrote:
> Another thought suggested by Bang Li is that we record which pte is none
> in hpage_collapse_scan_pmd. Then, before acquiring the mmap_lock (write mode),
> we will pre-zero pages as needed.

Here is my point of view: we cannot optimize the common case where we 
have mostly !pte_none() in a similar way.

So why do we care about the less common case? Is the process visible 
downtime reduction for that less common case really noticable?

Or is it rather something that looks good in a micro-benchmark, but 
won't really make any difference in actual applications (again, where 
the common case will still result the same downtime).

I'm not against this, I'm rather wonder "do we really care". I'd like to 
hear other opinions.

>>>>>
>>>>> So my question is: do we really care about it that much that we care to
>>>>> optimize?
>>>>
>>>> IMO, although it may not be our main concern, reducing the impact of
>>>> khugepaged on the process remains crucial. I think that users also prefer
>>>> minimal interference from khugepaged.
>>>
>>> The problem I am having with this is that for the *common* case where we
>>> have a small number of pte_none(), we cannot really optimize because we
>>> have to perform the copy.
>>>
>>> So this feels like we're rather optimizing a corner case, and I am not
>>> so sure if that is really worth it.
>>>
>>> Other thoughts?
>>
>> Another thought is to introduce khugepaged/alloc_zeroed_hpage for THP
>> sysfs settings. This would enable users to decide whether to avoid unnecessary
>> copies when nr_ptes_none > 0.

Hm, new toggles for that, not sure ... I much rather prefer something 
without any new toggles, especially for this case.

-- 
Cheers,

David / dhildenb


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ