lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <9433c2d6-200c-4320-80f3-840ca5e66f64@redhat.com>
Date: Thu, 22 May 2025 15:05:30 +0200
From: David Hildenbrand <david@...hat.com>
To: Shakeel Butt <shakeel.butt@...ux.dev>,
 Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
 "Liam R . Howlett" <Liam.Howlett@...cle.com>,
 Vlastimil Babka <vbabka@...e.cz>, Jann Horn <jannh@...gle.com>,
 Arnd Bergmann <arnd@...db.de>, Christian Brauner <brauner@...nel.org>,
 linux-mm@...ck.org, linux-arch@...r.kernel.org,
 linux-kernel@...r.kernel.org, SeongJae Park <sj@...nel.org>,
 Usama Arif <usamaarif642@...il.com>
Subject: Re: [RFC PATCH 0/5] add process_madvise() flags to modify behaviour

On 21.05.25 19:39, Shakeel Butt wrote:
> On Wed, May 21, 2025 at 05:49:15PM +0100, Lorenzo Stoakes wrote:
> [...]
>>>
>>> Please let's first get consensus on this before starting the work.
>>
>> With respect Shakeel, I'll work on whatever I want, whenever I want.
> 
> I fail to understand why you would respond like that.

Relax guys ... :) Really nothing to be fighting about.

Lorenzo has a lot of energy to play with things, to see how it would 
look. I wish I would have that much energy, but I have no idea where it 
went ... (well, okay, I have a suspicion) :P

At the same time, I hope (and assume :) ) that Lorenzo will get Usama 
involved in the development once we know what we want.


To summarize my current view:

1) ebpf: most people are are not a fan of that, and I agree, at least
    for this purpose. If we were talking about making better *placement*
    decisions using epbf, it would be a different story.

2) prctl(): the unloved child, and I can understand why. Maybe now is
    the right time to stop adding new MM things that feel weird in there.
    Maybe we should already have done that with the KSM toggle (guess who
    was involved in that ;) ).

3) process_madvise(): I think it's an interesting extension, but
    probably we should just have something that applies to the whole
    address space naturally. At least my take for now.

4) new syscall: worth exploring how it would look. I'm especially
    interested in flag options (e.g., SET_DEFAULT_EXEC) and how we could
    make them only apply to selected controls.


An API prototype of 4), not necessarily with the code yet, might be 
valuable.

In general, the "always/madvise/never" policies are really horrible. We 
should instead be prioritizing who gets THPs -- and only disable them 
for selected workloads.

Because splitting THPs up because a process is not allowed to use them, 
thereby increasing memory fragmentation, is really absolutely suboptimal.

But we don't have anything better right now.

So I would hope that we can at least turn the "always/VM_HUGEPAGE" into 
a "prioritize for largest (m)THPs possible" in a distant future.

If only changing the semantics of VM_NOHUGEPAGE to mean "deprioritize 
for THPs" couldn't break userfaultfd ... :( But maybe that can be worked 
around in the future somehow (e.g., when we detect userfaultfd usage, 
not sure, ...).

-- 
Cheers,

David / dhildenb


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ