Message-ID: <d9fbc00d-182b-4463-b718-73c2f0eef3a0@kernel.org>
Date: Wed, 14 Jan 2026 13:33:58 +0100
From: "David Hildenbrand (Red Hat)" <david@...nel.org>
To: Mateusz Guzik <mjguzik@...il.com>
Cc: Li Zhe <lizhe.67@...edance.com>, akpm@...ux-foundation.org,
ankur.a.arora@...cle.com, fvdl@...gle.com, joao.m.martins@...cle.com,
linux-kernel@...r.kernel.org, linux-mm@...ck.org, mhocko@...e.com,
muchun.song@...ux.dev, osalvador@...e.de, raghavendra.kt@....com
Subject: Re: [PATCH v2 0/8] Introduce a huge-page pre-zeroing mechanism
On 1/14/26 13:11, Mateusz Guzik wrote:
> On Wed, Jan 14, 2026 at 12:55 PM David Hildenbrand (Red Hat)
> <david@...nel.org> wrote:
>> You said "I wonder if implementing hugepage pre-zeroing directly within
>> the kernel would be a simpler and more direct way to accelerate VM
>> creation".
>>
>> And I agree. But to make that fly (no user space polling interface), I
>> was wondering whether we could do it like "init_on_free" and let whoever
>> frees a hugetlb folio just reinitialize it with 0.
>>
>> No kernel thread, no user space thread involved.
>>
>
> I don't see how this is supposed to address the stated problem of
> zeroing being incredibly expensive.
The price of zeroing has to be paid somewhere.

Currently it's done at allocation time; we could move it to freeing time.
That would make application startup faster and application shutdown slower.

And we're aware that application shutdown can be expensive, which is why,
e.g., QEMU implements an async shutdown operation where the MM gets
torn down from another process.
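
As a rough illustration, zeroing at free time could look like the
sketch below (just a sketch, not written against the actual hugetlb
internals; folio_set_hugetlb_prezeroed() is a made-up marker):

static void hugetlb_zero_on_free(struct folio *folio)
{
        unsigned long i, nr = folio_nr_pages(folio);

        /* The freeing task pays the clearing cost, base page by base page. */
        for (i = 0; i < nr; i++)
                clear_highpage(folio_page(folio, i));

        /* Hypothetical marker so the allocation path can skip zeroing. */
        folio_set_hugetlb_prezeroed(folio);
}

The allocation/fault path would then test the marker and skip clearing
the folio, similar to how init_on_free lets the page allocator skip
zeroing for __GFP_ZERO allocations.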
>
> With machinery to pre-zero, and depending on the available CPU time,
> the number of pages eligible for allocation but not yet zeroed, and
> the frequency of VM startups/teardowns, there is some amount of real
> time which won't be spent waiting on said zeroing because it was
> already done.
>
> Any approach which keeps the overhead with the program allocating the
> page can't take advantage of it, even if said overhead is paid at the
> end of its life.
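
For concreteness, the machinery you describe would be something like a
background worker along these lines (purely illustrative; the
hugetlb_prezero_list/hugetlb_prezero_wq names and the reuse of
hugetlb_zero_on_free() from above are made up, not what the series
actually implements):

static int hugetlb_prezero_worker(void *unused)
{
        struct folio *folio;

        while (!kthread_should_stop()) {
                /* Illustrative list of freed-but-not-yet-zeroed folios. */
                spin_lock_irq(&hugetlb_lock);
                folio = list_first_entry_or_null(&hugetlb_prezero_list,
                                                 struct folio, lru);
                if (folio)
                        list_del_init(&folio->lru);
                spin_unlock_irq(&hugetlb_lock);

                if (!folio) {
                        /* Nothing dirty left; sleep until a free arrives. */
                        wait_event_interruptible(hugetlb_prezero_wq,
                                !list_empty(&hugetlb_prezero_list) ||
                                kthread_should_stop());
                        continue;
                }

                /* Spend otherwise-idle CPU time so allocation won't wait. */
                hugetlb_zero_on_free(folio);
                /* ... then move the clean folio back to the free pool. */
        }
        return 0;
}

That indeed hides the latency as long as the worker keeps up.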
Let's look again at what the main use case of this change is, as stated:
"... there are some use cases where a large number of hugetlb
pages are touched when an application (such as a VM backed by these
pages) starts. For 256 1G pages and 40ms per page, this would take
10 seconds, a noticeable delay."
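
As a quick sanity check on those numbers: 256 pages * 40 ms/page =
10240 ms, so the ~10 seconds quoted above holds up.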
--
Cheers
David