Message-Id: <20260121080348.36253-1-lizhe.67@bytedance.com>
Date: Wed, 21 Jan 2026 16:03:48 +0800
From: "Li Zhe" <lizhe.67@...edance.com>
To: <gourry@...rry.net>
Cc: <akpm@...ux-foundation.org>, <ankur.a.arora@...cle.com>, 
	<dan.j.williams@...el.com>, <dave@...olabs.net>, 
	<david.laight.linux@...il.com>, <david@...nel.org>, <fvdl@...gle.com>, 
	<joao.m.martins@...cle.com>, <jonathan.cameron@...wei.com>, 
	<linux-cxl@...r.kernel.org>, <linux-kernel@...r.kernel.org>, 
	<linux-mm@...ck.org>, <lizhe.67@...edance.com>, <mhocko@...e.com>, 
	<mjguzik@...il.com>, <muchun.song@...ux.dev>, <osalvador@...e.de>, 
	<raghavendra.kt@....com>, <wangzhou1@...ilicon.com>, 
	<zhanjie9@...ilicon.com>
Subject: Re: [PATCH v2 0/8] Introduce a huge-page pre-zeroing mechanism

On Tue, 20 Jan 2026 13:18:19 -0500, gourry@...rry.net wrote:

> On Tue, Jan 20, 2026 at 06:39:48PM +0800, Li Zhe wrote:
> > On Tue, 20 Jan 2026 09:47:44 +0000, david.laight.linux@...il.com wrote:
> > 
> > > On Tue, 20 Jan 2026 14:27:06 +0800
> > > "Li Zhe" <lizhe.67@...edance.com> wrote:
> > > 
> > > > In light of the preceding discussion, we appear to have reached the
> > > > following understanding:
> > > > 
> > > > (1) At present we prefer to mitigate slow application startup (e.g.,
> > > > VM creation) by zeroing huge pages at the moment they are freed
> > > > (init_on_free). The principal benefit is that user space gains the
> > > > performance improvement without deploying any additional user space
> > > > daemon.
> > > 
> > > Am I missing something?
> > > If userspace does:
> > > $ program_a; program_b
> > > and pages used by program_a are zeroed when it exits you get the delay
> > > for zeroing all the pages it used before program_b starts.
> > > OTOH if the zeroing is deferred program_b only needs to zero the pages
> > > it needs to start (and there may be some lurking).
> > 
> > Under the init_on_free approach, improving the speed of zeroing may
> > indeed prove necessary.
> > 
> > However, I believe we should first reach consensus on adopting
> > "init_on_free" as the solution to slow application startup before
> > turning to performance tuning.
> > 
> 
> His point was init_on_free may not actually reduce any delays on serial
> applications, and can actually introduce additional delays.
> 
> Example
> -------
> program_a:  alloc_hugepages(10);
>             exit();
> 
> program_b:  alloc_hugepages(5);
>             exit();
> 
> /* Run programs in serial */
> sh:  program_a && program_b
> 
> in zero_on_alloc():
> 	program_a eats zero(10) cost on startup
> 	program_b eats zero(5) cost on startup
> 	Overall zero(15) cost to start program_b
> 
> in zero_on_free()
> 	program_a eats zero(10) cost on startup
> 	program_a eats zero(10) cost on exit
> 	program_b eats zero(0) cost on startup
> 	Overall zero(20) cost to start program_b
> 
> zero_on_free is worse by zero(5)
> -------
> 
> This is a trivial example, but it's unclear zero_on_free actually
> provides a benefit.  You have to know ahead of time what the runtime
> behavior, pre-zeroed count, and allocation pattern (0->10->5->...) would
> be to determine whether there's an actual reduction in startup time.
> 
> But just trivially, starting from the base case of no pages being
> zeroed, you're just injecting an additional zero(X) cost if program_a()
> consumes more hugepages than program_b().
> 
> Long way of saying the shift from alloc to free seems heuristic-y and
> you need stronger analysis / better data to show this change is actually
> beneficial in the general case.

I understand your concern. At some point some process must pay the
cost of zeroing, and the optimal strategy is inevitably
workload-dependent.
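
To make the accounting in the example above concrete, here is a rough
sketch of the cost model. It is not part of the patchset: the function
name and parameters are hypothetical, and it assumes programs run
strictly in serial, that zeroing on exit is synchronous, and that
cost is simply the number of huge pages zeroed.

```python
def serial_zeroing_cost(demands, zero_on_free, prezeroed=0):
    """Count huge pages zeroed by the time the last program has started.

    demands      -- huge pages allocated by each program, in run order
    zero_on_free -- True: pages are zeroed when a program exits
                    False: pages are zeroed only at allocation
    prezeroed    -- pages already zeroed before the first program runs
    """
    total = 0
    for i, need in enumerate(demands):
        # Pages taken from the pre-zeroed pool cost nothing at startup;
        # the remainder must be zeroed at allocation time.
        from_pool = min(need, prezeroed)
        prezeroed -= from_pool
        total += need - from_pool

        is_last = i == len(demands) - 1
        if zero_on_free and not is_last:
            # The program exits and all of its pages are zeroed before
            # the next program can start.
            total += need
            prezeroed += need
    return total


# program_a allocates 10 huge pages, program_b allocates 5.
print(serial_zeroing_cost([10, 5], zero_on_free=False))  # 15
print(serial_zeroing_cost([10, 5], zero_on_free=True))   # 20
```

Under these assumptions zero-on-free costs more whenever an earlier
program uses more huge pages than the next one needs, and breaks even
otherwise (e.g. demands of 5 then 10 give 15 in both modes).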

Our "zero-on-free for huge pages" draws on the existing kernel
init_on_free mechanism. Of course, it may prove sub-optimal in certain
scenarios.

Consistent with "provide tools, not policy", perhaps the decision is
better left to user space. And that is exactly what this patchset
does. Requiring a user space daemon to decide when to zero pages
certainly adds complexity, but it also gives administrators a single,
flexible knob that can be tuned for any workload.

Thanks,
Zhe
