Message-ID: <f691d2e0-5919-4581-8a24-1b543d798ae4@redhat.com>
Date: Sat, 10 May 2025 00:42:59 +0200
From: David Hildenbrand <david@...hat.com>
To: Johannes Weiner <hannes@...xchg.org>, Yafang Shao <laoar.shao@...il.com>
Cc: Usama Arif <usamaarif642@...il.com>, Zi Yan <ziy@...dia.com>,
Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
shakeel.butt@...ux.dev, riel@...riel.com, baolin.wang@...ux.alibaba.com,
lorenzo.stoakes@...cle.com, Liam.Howlett@...cle.com, npache@...hat.com,
ryan.roberts@....com, linux-kernel@...r.kernel.org, kernel-team@...a.com
Subject: Re: [PATCH 0/1] prctl: allow overriding system THP policy to always
>>>> - madvise
>>>> The sysadmin gently encourages the use of THP, but it is only
>>>> enabled when explicitly requested by the application.
>
> And this "user mode" or "manual mode", where applications self-manage
> which parts of userspace they want to enroll.
>
> Both madvise() and unprivileged prctl() should work here as well,
> IMO. There is no policy or security difference between them, it's just
> about granularity and usability.
>
>>>> - never
>>>> The sysadmin discourages the use of THP, and "its use is only permitted
>>>> with explicit approval" .
>
> This one I don't quite agree with, and IMO conflicts with what David
> is saying as well.
Yeah ... "never" does not mean "sometimes" in my reality :)
>
>>> "never" so far means "no thps, no exceptions". We've had serious THP
>>> issues in the past, where our workaround until we sorted out the issue
>>> for affected customers was to force-disable THPs on that system during boot.
>>
>> Right, that reflects the current behavior. What we aim to enhance is
>> by adding the requirement that "its use is only permitted with
>> explicit approval."
>
> I think you're conflating a safety issue with a security issue.
>
> David is saying there can be cases where the kernel is broken, and
> "never" is a production escape hatch to disable the feature until a
> kernel upgrade for the fix is possible. In such a case, it doesn't
> make sense to override this decision based on any sort of workload
> policy, privileged or not.
>
> The way I understand you is that you want enrollment (and/or
> self-management) only for blessed applications. Because you don't
> generally trust workloads in the wild enough to switch the global
> default away from "never", given the semantics of always/madvise.
Assuming "never" means "never" and "always" means "always" (crazy,
right? :) ), could we make use of "madvise" mode, which essentially
means "VM_HUGEPAGE" takes control?
We'd need
a) A way to enable THP for a process. Changing the default/vma settings
to VM_HUGEPAGE as discussed using a prctl could work.
b) A way to ignore VM_HUGEPAGE for processes. Maybe the existing prctl
to force-disable THPs could work?
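
For reference, a minimal userspace sketch of the interfaces that exist
today, under stated assumptions. The helper names (thp_disable_self,
thp_is_disabled_self, thp_enable_range) are made up for illustration.
For (b) it uses the existing PR_SET_THP_DISABLE prctl. For (a) there is
no prctl yet (that is only what is being discussed here), so the sketch
falls back to the per-range madvise(MADV_HUGEPAGE) opt-in, which is what
sets VM_HUGEPAGE on a mapping.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/prctl.h>

/*
 * (b) Force-disable THP for the calling process using the existing prctl.
 * Returns 0 on success, -errno on failure.
 */
int thp_disable_self(void)
{
	return prctl(PR_SET_THP_DISABLE, 1, 0, 0, 0) == 0 ? 0 : -errno;
}

/*
 * Query the per-process THP-disable flag.
 * Returns 1 if set, 0 if clear, -errno on failure.
 */
int thp_is_disabled_self(void)
{
	int ret = prctl(PR_GET_THP_DISABLE, 0, 0, 0, 0);

	return ret >= 0 ? ret : -errno;
}

/*
 * Today's opt-in for (a): request THP for one address range only.
 * A process-wide default would need the prctl proposed in this thread.
 * Returns 0 on success, -errno on failure (e.g. -EINVAL without THP support).
 */
int thp_enable_range(void *addr, size_t len)
{
	return madvise(addr, len, MADV_HUGEPAGE) == 0 ? 0 : -errno;
}
```

A wrapper could call thp_disable_self() before exec'ing a workload, since
the flag is inherited across fork() and preserved across execve(), while
an application that wants THP in "madvise" mode would call
thp_enable_range() on the mappings it cares about.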
--
Cheers,
David / dhildenb