linux-kernel - Re: [RFC 0/2] mm: introduce THP deferred setting

Message-ID: <9cf237df1a7bb21bba1a464787938eba8f372658.camel@surriel.com>
Date: Tue, 27 Aug 2024 21:18:58 -0400
From: Rik van Riel <riel@...riel.com>
To: Johannes Weiner <hannes@...xchg.org>, Usama Arif <usamaarif642@...il.com>
Cc: Nico Pache <npache@...hat.com>, linux-mm@...ck.org, 
 linux-kernel@...r.kernel.org, linux-doc@...r.kernel.org, Andrew Morton
 <akpm@...ux-foundation.org>, David Hildenbrand <david@...hat.com>, Matthew
 Wilcox <willy@...radead.org>, Barry Song <baohua@...nel.org>, Ryan Roberts
 <ryan.roberts@....com>,  Baolin Wang <baolin.wang@...ux.alibaba.com>, Lance
 Yang <ioworker0@...il.com>, Peter Xu <peterx@...hat.com>,  Rafael Aquini
 <aquini@...hat.com>, Andrea Arcangeli <aarcange@...hat.com>, Jonathan
 Corbet <corbet@....net>,  "Kirill A . Shutemov"
 <kirill.shutemov@...ux.intel.com>, Zi Yan <ziy@...dia.com>
Subject: Re: [RFC 0/2] mm: introduce THP deferred setting

On Tue, 2024-08-27 at 13:09 +0200, Johannes Weiner wrote:
> 
> I agree with this. The defer mode is an improvement over the upstream
> status quo, no doubt. However, both defer mode and the shrinker solve
> the issue of memory waste under pressure, while the shrinker permits
> more desirable behavior when memory is abundant.
> 
> So my take is that the shrinker is the way to go, and I don't see a
> bonafide usecase for defer mode that the shrinker couldn't cover.
> 
> 
I would like to take one step back and think about what some
real-world workloads might want as a tunable for THP.

Workload owners are going to have a real problem trying to figure
out what the best value of max_ptes_none should be for their
workloads.

However, giving workload owners the ability to say "this workload
should not waste more than 1GB of memory on zero pages inside THPs",
or 500MB, or 4GB or whatever, would then allow the kernel to
automatically adjust the max_ptes_none threshold.

Once a workload is close to, or exceeding, the maximum amount of
THP zero page overhead, we could both shrink THPs and disable
direct THP allocation at page fault time for that workload.
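
To make the idea above concrete, here is a rough userspace sketch of
such a policy, not kernel code. Everything in it is an assumption made
for illustration: the function name thp_policy, the linear scaling of
max_ptes_none with the remaining budget, and the 90% watermark are all
hypothetical. The only real value is 511, the usual khugepaged
max_ptes_none default for 2MB THPs with 4KB base pages.

```python
# Illustrative only: maps a per-workload "zero page waste" budget to
# THP knobs. Names, scaling and watermark are hypothetical.

HPAGE_PMD_NR = 512  # PTEs per 2MB THP with 4KB base pages (assumed)


def thp_policy(budget_bytes, wasted_bytes, high_watermark=0.9):
    """Return (max_ptes_none, allow_fault_thp, run_shrinker).

    budget_bytes: the owner's limit on zero-filled memory inside THPs.
    wasted_bytes: current estimate of that zero-filled memory.
    """
    if budget_bytes <= 0:
        # No budget at all: never collapse or fault in sparse THPs.
        return 0, False, wasted_bytes > 0

    used = wasted_bytes / budget_bytes
    headroom = max(0.0, 1.0 - used)

    # Tighten max_ptes_none as the budget is consumed (linear is an
    # arbitrary choice made for this sketch).
    max_ptes_none = int((HPAGE_PMD_NR - 1) * headroom)

    # Close to or over the limit: shrink THPs and stop allocating
    # them directly at page fault time.
    near_limit = used >= high_watermark
    return max_ptes_none, not near_limit, near_limit


if __name__ == "__main__":
    gib = 1 << 30
    for wasted in (0, gib // 2, gib):
        print(wasted, thp_policy(gib, wasted))
```

The point of the sketch is only that the workload owner sets one
number in bytes, and the threshold, the shrinker and the fault-time
behavior all follow from it.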

If we want to give workload owners a predictable, easy to work
with tunable, we probably want both the shrinker and the deferred
allocation.

-- 
All Rights Reversed.