Message-Id: <20260113064155.29900-1-lizhe.67@bytedance.com>
Date: Tue, 13 Jan 2026 14:41:54 +0800
From: "Li Zhe" <lizhe.67@...edance.com>
To: <ankur.a.arora@...cle.com>
Cc: <akpm@...ux-foundation.org>, <david@...nel.org>, <fvdl@...gle.com>, 
	<linux-kernel@...r.kernel.org>, <linux-mm@...ck.org>, 
	<lizhe.67@...edance.com>, <muchun.song@...ux.dev>, <osalvador@...e.de>
Subject: Re: [PATCH v2 0/8] Introduce a huge-page pre-zeroing mechanism

On Mon, 12 Jan 2026 14:01:29 -0800, ankur.a.arora@...cle.com wrote:

> > In user space, we can use system calls such as epoll and write to zero
> > huge folios as they become available, and sleep when none are ready. The
> > following pseudocode illustrates this approach. The pseudocode spawns
> > eight threads (each running thread_fun()) that wait for huge pages on
> > node 0 to become eligible for zeroing; whenever such pages are available,
> > the threads clear them in parallel.
> >
> >   static void thread_fun(void)
> >   {
> >   	epoll_create();
> >   	epoll_ctl();
> >   	while (1) {
> >   		val = read("/sys/devices/system/node/node0/hugepages/hugepages-1048576kB/zeroable_hugepages");
> >   		if (val > 0)
> >   			system("echo max > /sys/devices/system/node/node0/hugepages/hugepages-1048576kB/zeroable_hugepages");
> >   		epoll_wait();
> >   	}
> >   }
> 
> Given that zeroable_hugepages is per node, anybody who writes to
> it would need to know how much the aggregate demand would be.
> 
> Seems to me that the only value that might make sense would be "max".
> And at that point this approach seems a little bit like init_on_free.

Yes, writing “max” suffices for the vast majority of workloads.

However, once multiple mutually independent application processes each
need huge pages, the ability to specify an exact value becomes
essential, because the CPU time each process spends on zeroing can
then be charged to its own cgroup. If we consider “max” sufficient
for now, we can implement support for that parameter alone and
extend it later when necessary.
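
To make the two modes concrete, here is a rough user-space sketch of how a
process might request either an exact number of pages or "max". The helper
name request_zeroing() is hypothetical, and the assumption that the file
accepts a decimal page count in addition to "max" reflects the interface
proposed in this series, not anything in mainline. The point is only that
the write() is issued by the requesting process itself, so the zeroing
work it triggers would be accounted to that process's cgroup.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: ask for huge pages to be zeroed by writing to a
 * zeroable_hugepages-style file at @path.
 *
 * count < 0  -> write "max" (zero everything currently eligible)
 * count >= 0 -> write the exact number of pages (assumed to be accepted
 *               by the proposed interface)
 *
 * Returns 0 on success, -1 on failure.
 */
static int request_zeroing(const char *path, long count)
{
	char buf[32];
	int len, fd;
	ssize_t ret;

	if (count < 0)
		len = snprintf(buf, sizeof(buf), "max");
	else
		len = snprintf(buf, sizeof(buf), "%ld", count);
	assert(len > 0 && (size_t)len < sizeof(buf));

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;

	/* The write runs in the caller's context. */
	ret = write(fd, buf, (size_t)len);
	close(fd);

	return ret == (ssize_t)len ? 0 : -1;
}
```

A process that needs, say, four 1 GiB pages would call
request_zeroing(path, 4) with the node's zeroable_hugepages path, while a
system-wide helper could simply pass -1 to request "max".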

Although “max” resembles init_on_free at first glance, it leaves the
decision of “when and on which CPU to zero” entirely to user space,
thereby eliminating the concern previously raised.

Thanks,
Zhe
