Message-ID: <EF16AFD9-DDBB-4FB0-BF70-B7282159EDB1@nvidia.com>
Date: Thu, 15 May 2025 14:42:02 -0400
From: Zi Yan <ziy@...dia.com>
To: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
Cc: "Liam R. Howlett" <Liam.Howlett@...cle.com>,
David Hildenbrand <david@...hat.com>, Usama Arif <usamaarif642@...il.com>,
Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
hannes@...xchg.org, shakeel.butt@...ux.dev, riel@...riel.com,
laoar.shao@...il.com, baolin.wang@...ux.alibaba.com, npache@...hat.com,
ryan.roberts@....com, linux-kernel@...r.kernel.org,
linux-doc@...r.kernel.org, kernel-team@...a.com
Subject: Re: [PATCH 1/6] prctl: introduce PR_THP_POLICY_DEFAULT_HUGE for the
process
On 15 May 2025, at 14:21, Lorenzo Stoakes wrote:
> On Thu, May 15, 2025 at 02:09:56PM -0400, Liam R. Howlett wrote:
>> * David Hildenbrand <david@...hat.com> [250515 13:30]:
>>>>>
>>>>
>>>> Did we document all this? :)
>>>>
>>>> It'd be good to be super explicit about these sorts of 'dependency chains'.
>>>>
>>>
>>> Documentation/admin-guide/mm/transhuge.rst has under "Global THP controls"
>>> quite some stuff about all that, yes.
>>>
>>> The whole document needs an overhaul, to clarify on the whole terminology,
>>> make it consistent, and better explain how the pagecache behaves etc. On my
>>> todo list, but I'm afraid it will be a bit of work to get it right / please
>>> most people.
>>
>> Yes, the whole thing is making me grumpy (more than my default state).
>> The more I think about it, the more I don't like the prctl approach
>> either...
>
> prctl() feels like it's literally never, ever the right choice.
>
> It feels like we shove all the dark stuff we want to put under the rug
> there.
>
> Reading the man page is genuinely frightening. there's stuff about VMAs _I
> wasn't aware of_.
>
> It's also never really the _right time_ to do it - it's not process
> inception is it? It's when the process has started, now you suddenly fiddle
> with it.
>
> Then relying on mm flags being propagated over fork/exec is just, it's a
> hack really.
>
>>
>> I more than dislike flags2... I hate it.
>
> Yeah, to be clear - I will NACK any series that tries to add flags2 unless
> a VERY VERY good justification is given. It's horrid. And frankly this
> feature doesn't warrant something as horrible.
>
> But making mm->flags 64-bit on 32-bit kernels (which are in effect
> deprecated in my view) would fix this.
>
>>
>> but no prctl, no cgroups, no bpf.. what is left? A new policy groups
>> thing? No, not that either, please.
BPF might be OK, as long as we provide the right functions for BPF to manipulate
system-, process-, MM-, and VMA-level knobs. My only objection to Yafang's patch[1] is
that it adds a VMA parameter to the global hugepage checking functions.
My take on the BPF approach is that it does not add new APIs, so we can change it
at any time, assuming people are willing to accept that the functions instrumented
by BPF can go away at any time and the corresponding BPF programs are not
guaranteed to keep working. It allows us to explore various huge page policies
without the burden of maintaining APIs. Eventually, once we have learned enough,
huge page policies can become transparent.
[1] https://lore.kernel.org/linux-mm/20250429024139.34365-1-laoar.shao@gmail.com/
--
Best Regards,
Yan, Zi