Message-ID: <CAPTztWbceW0dbCPVMw_maer8o_o851Jf-omOBCQkwwA9qQP2qg@mail.gmail.com>
Date: Wed, 4 Dec 2024 09:01:25 -0800
From: Frank van der Linden <fvdl@...gle.com>
To: Ankur Arora <ankur.a.arora@...cle.com>
Cc: Mateusz Guzik <mjguzik@...il.com>, linux-mm@...ck.org, akpm@...ux-foundation.org, 
	Muchun Song <muchun.song@...ux.dev>, Miaohe Lin <linmiaohe@...wei.com>, 
	Oscar Salvador <osalvador@...e.de>, David Hildenbrand <david@...hat.com>, Peter Xu <peterx@...hat.com>, 
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm/hugetlb: optionally pre-zero hugetlb pages

On Tue, Dec 3, 2024 at 4:05 PM Ankur Arora <ankur.a.arora@...cle.com> wrote:
>
>
> Mateusz Guzik <mjguzik@...il.com> writes:
>
> > On Mon, Dec 02, 2024 at 08:20:58PM +0000, Frank van der Linden wrote:
> >> Fresh hugetlb pages are zeroed out when they are faulted in,
> >> just like with all other page types. This can take up a good
> >> amount of time for larger page sizes (e.g. around 40
> >> milliseconds for a 1G page on a recent AMD-based system).
> >>
> >> This normally isn't a problem, since hugetlb pages are typically
> >> mapped by the application for a long time, and the initial
> >> delay when touching them isn't much of an issue.
> >>
> >> However, there are some use cases where a large number of hugetlb
> >> pages are touched when an application (such as a VM backed by these
> >> pages) starts. For 256 1G pages and 40ms per page, this would take
> >> 10 seconds, a noticeable delay.
> >
> > The current huge page zeroing code is not that great to begin with.
>
> Yeah definitely suboptimal. The current huge page zeroing code is
> both slow and it trashes the cache while zeroing.
>
> > There was a patchset posted some time ago to remedy at least some of it:
> > https://lore.kernel.org/all/20230830184958.2333078-1-ankur.a.arora@oracle.com/
> >
> > but it apparently fell through the cracks.
>
> As Joao mentioned that got side tracked due to the preempt-lazy stuff.
> Now that lazy is in, I plan to follow up on the zeroing work.
>
> > Any games with "background zeroing" are notoriously crappy and I would
> > argue one should exhaust other avenues before going there -- at the end
> > of the day the cost of zeroing will have to get paid.
>
> Yeah and the background zeroing has dual cost: the cost in CPU time plus
> the indirect cost to other processes due to the trashing of L3 etc.

I'm not sure what you mean here - any caching side effects of zeroing
happen regardless of who does it, right? It doesn't matter if it's a
kthread or the calling thread.

If you're concerned about the caching side effects in general, using
non-temporal instructions helps (e.g. movnti on x86). See the link I
mentioned for a patch that was sent years ago:
https://lore.kernel.org/all/20180725023728.44630-1-cannonmatthews@google.com/
Using movnti on x86 definitely helps performance (up to 50% in my
experiments), which is great, but it still leaves a considerable delay
for the use case I mentioned.
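
As a rough userspace illustration of the non-temporal idea (this is not
the kernel patch linked above, just a minimal sketch), zeroing with
movnti could look something like the following. zero_nontemporal() is a
hypothetical helper name. The sketch assumes an 8-byte-aligned buffer,
uses the SSE2 _mm_stream_si64 intrinsic (which emits movnti) on x86-64,
and simply falls back to memset() on other architectures.

```c
#include <stddef.h>
#include <string.h>
#if defined(__x86_64__)
#include <emmintrin.h>  /* _mm_stream_si64 (movnti) and _mm_sfence */
#endif

/*
 * Hypothetical helper, for illustration only: zero 'len' bytes at 'buf'
 * using non-temporal 64-bit stores so the zeroed data is not pulled
 * into the cache hierarchy. 'buf' is assumed to be 8-byte aligned.
 */
static void zero_nontemporal(void *buf, size_t len)
{
#if defined(__x86_64__)
    long long *p = (long long *)buf;
    size_t words = len / sizeof(*p);
    size_t i;

    /* One movnti per 8 bytes; these stores bypass the caches. */
    for (i = 0; i < words; i++)
        _mm_stream_si64(&p[i], 0);

    /* Non-temporal stores are weakly ordered; fence before returning. */
    _mm_sfence();

    /* Zero any tail shorter than 8 bytes with ordinary stores. */
    memset((unsigned char *)buf + words * sizeof(*p), 0,
           len % sizeof(*p));
#else
    /* Assumption: no non-temporal store path here, use plain memset. */
    memset(buf, 0, len);
#endif
}
```

The sfence at the end matters if another CPU is going to read the
buffer right away, since non-temporal stores are not ordered against
later regular stores. Note that this only avoids the cache pollution;
the memory bandwidth cost of writing the zeroes is still there.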

- Frank
