Message-ID: <871f2a76-8ccb-4870-8a87-417371feb0b0@kernel.org>
Date: Wed, 21 Jan 2026 13:41:33 +0100
From: "David Hildenbrand (Red Hat)" <david@...nel.org>
To: Gregory Price <gourry@...rry.net>, Li Zhe <lizhe.67@...edance.com>
Cc: david.laight.linux@...il.com, akpm@...ux-foundation.org,
 ankur.a.arora@...cle.com, dan.j.williams@...el.com, dave@...olabs.net,
 fvdl@...gle.com, joao.m.martins@...cle.com, jonathan.cameron@...wei.com,
 linux-cxl@...r.kernel.org, linux-kernel@...r.kernel.org, linux-mm@...ck.org,
 mhocko@...e.com, mjguzik@...il.com, muchun.song@...ux.dev,
 osalvador@...e.de, raghavendra.kt@....com, wangzhou1@...ilicon.com,
 zhanjie9@...ilicon.com
Subject: Re: [PATCH v2 0/8] Introduce a huge-page pre-zeroing mechanism

On 1/20/26 19:18, Gregory Price wrote:
> On Tue, Jan 20, 2026 at 06:39:48PM +0800, Li Zhe wrote:
>> On Tue, 20 Jan 2026 09:47:44 +0000, david.laight.linux@...il.com wrote:
>>
>>> On Tue, 20 Jan 2026 14:27:06 +0800
>>> "Li Zhe" <lizhe.67@...edance.com> wrote:
>>>
>>>
>>> Am I missing something?
>>> If userspace does:
>>> $ program_a; program_b
>>> and pages used by program_a are zeroed when it exits you get the delay
>>> for zeroing all the pages it used before program_b starts.
>>> OTOH if the zeroing is deferred program_b only needs to zero the pages
>>> it needs to start (and there may be some lurking).
>>
>> Under the init_on_free approach, improving the speed of zeroing may
>> indeed prove necessary.
>>
>> However, I believe we should first reach consensus on adopting
>> “init_on_free” as the solution to slow application startup before
>> turning to performance tuning.
>>
> 
> His point was that init_on_free may not actually reduce delays for serial
> applications, and can even introduce additional delays.
> 
> Example
> -------
> program_a:  alloc_hugepages(10);
>              exit();
> 
> program_b:  alloc_hugepages(5);
>              exit();
> 
> /* Run programs in serial */
> sh:  program_a && program_b
> 
> in zero_on_alloc():
> 	program_a eats zero(10) cost on startup
> 	program_b eats zero(5) cost on startup
> 	Overall zero(15) cost to start program_b
> 
> in zero_on_free()
> 	program_a eats zero(10) cost on startup
> 	program_a eats zero(10) cost on exit
> 	program_b eats zero(0) cost on startup
> 	Overall zero(20) cost to start program_b
> 
> zero_on_free is worse by zero(5)
> -------
> 
> This is a trivial example, but it's unclear whether zero_on_free actually
> provides a benefit.  You have to know ahead of time what the runtime
> behavior, pre-zeroed count, and allocation pattern (0->10->5->...) would
> be to determine whether there's an actual reduction in startup time.
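
(Side note on the arithmetic in the example above: here is a minimal
userspace sketch of it, counting the page zeroings that land on the path to
program_b's start. The function names and page counts are purely
illustrative, not kernel interfaces.)

/* Toy model of the example above: "cost" is the number of page zeroings
 * performed before program_b can run, assuming a strictly serial
 * "program_a && program_b" and a cold (not pre-zeroed) pool. */
#include <stdio.h>

static unsigned int cost_zero_on_alloc(unsigned int a, unsigned int b)
{
	return a + b;		/* each program zeroes its own pages at alloc */
}

static unsigned int cost_zero_on_free(unsigned int a, unsigned int b)
{
	/* program_a zeroes at alloc (cold pool) and again on exit;
	 * program_b only zeroes whatever the pool cannot cover. */
	return 2 * a + (b > a ? b - a : 0);
}

int main(void)
{
	printf("zero_on_alloc: zero(%u)\n", cost_zero_on_alloc(10, 5)); /* zero(15) */
	printf("zero_on_free:  zero(%u)\n", cost_zero_on_free(10, 5));  /* zero(20) */
	return 0;
}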

For VMs with hugetlb, people usually have some spare pages lying around.
VM startup time is more important for cloud providers than VM shutdown time.

I'm sure there are examples where it is the other way around, but having 
mixed workloads on the system is likely not the highest priority right now.

> 
> But trivially, starting from the base case of no pages being
> zeroed, you're just injecting an additional zero(X) cost if program_a()
> consumes more hugepages than program_b().


And whatever you do,

program_a()
program_b()

will have to zero the pages.

No asynchronous mechanism will really help.

> 
> Long way of saying the shift from alloc to free seems heuristic-y and
> you need stronger analysis / better data to show this change is actually
> beneficial in the general case.

I think the principle of "the allocator already contains zeroed pages" 
is quite universal and simple.

Whether you actually zero the pages when the last reference is gone (like 
we do in the buddy), or have that happen from some asynchronous context, 
is rather an internal optimization.
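
A minimal userspace sketch of that principle, for illustration only (names
like "pool", "zeroed" and the helpers below are made up, not hugetlb or
buddy internals): the free list keeps pre-zeroed pages, and whether the
clearing happens right when a page is freed or later from some asynchronous
context is hidden behind the same allocation path.

#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_SZ 4096

struct page {
	struct page *next;
	bool zeroed;			/* cleared while sitting in the pool */
	unsigned char data[PAGE_SZ];
};

static struct page *pool;		/* free list of spare pages */

/* Allocation side: only pay for the memset() if the pool did not already. */
static struct page *alloc_page_zeroed(void)
{
	struct page *p = pool;

	if (p) {
		pool = p->next;
	} else {
		p = calloc(1, sizeof(*p));	/* fresh page, already zero */
		if (!p)
			return NULL;
		p->zeroed = true;
	}
	if (!p->zeroed)
		memset(p->data, 0, PAGE_SZ);	/* cost hits the allocating task */
	p->zeroed = false;			/* caller is about to dirty it */
	return p;
}

/* Variant 1: zero synchronously when the last reference is dropped. */
static void free_page_sync(struct page *p)
{
	memset(p->data, 0, PAGE_SZ);		/* cost hits the freeing task */
	p->zeroed = true;
	p->next = pool;
	pool = p;
}

/* Variant 2: return the page dirty and let something clear it later. */
static void free_page_deferred(struct page *p)
{
	p->zeroed = false;
	p->next = pool;
	pool = p;
}

/* Would run from some asynchronous context (e.g. a worker) in a real setup. */
static void zero_pool_pages(void)
{
	for (struct page *p = pool; p; p = p->next) {
		if (!p->zeroed) {
			memset(p->data, 0, PAGE_SZ);
			p->zeroed = true;
		}
	}
}

int main(void)
{
	struct page *p = alloc_page_zeroed();

	p->data[0] = 1;			/* dirty the page */
	free_page_deferred(p);		/* or free_page_sync(p) */
	zero_pool_pages();		/* deferred clearing catches up */

	p = alloc_page_zeroed();	/* memset skipped: page was pre-zeroed */
	printf("reallocated page is %s\n", p->data[0] ? "dirty" : "zeroed");
	free_page_sync(p);
	return 0;
}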

-- 
Cheers

David
