Message-ID: <955fd396-10b1-48cb-977d-74f3e158b1cd@redhat.com>
Date: Mon, 26 May 2025 14:57:17 +0200
From: David Hildenbrand <david@...hat.com>
To: Shakeel Butt <shakeel.butt@...ux.dev>
Cc: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
 Andrew Morton <akpm@...ux-foundation.org>,
 "Liam R . Howlett" <Liam.Howlett@...cle.com>,
 Vlastimil Babka <vbabka@...e.cz>, Jann Horn <jannh@...gle.com>,
 Arnd Bergmann <arnd@...db.de>, Christian Brauner <brauner@...nel.org>,
 linux-mm@...ck.org, linux-arch@...r.kernel.org,
 linux-kernel@...r.kernel.org, SeongJae Park <sj@...nel.org>,
 Usama Arif <usamaarif642@...il.com>
Subject: Re: [RFC PATCH 0/5] add process_madvise() flags to modify behaviour

>>
>> To summarize my current view:
>>
>> 1) ebpf: most people are not a fan of that, and I agree, at least
>>     for this purpose. If we were talking about making better *placement*
>>     decisions using ebpf, it would be a different story.
> 
> By placement decisions, do you mean placement between memory
> tiers/nodes or something else?

More like: which size to place, but it could be extended to other 
policies, maybe.

Assume we have a page fault and have to decide which folio size to place.

For a process that we really want to use THPs (VM_HUGEPAGE?), we could
use the largest free folio possible.

For a process that we don't want to spend valuable THPs on (VM_HUGEPAGE
not set?), we could use the smallest free folio possible.

Such a policy might be encoded in an ebpf program, I assume.

The hints (prioritize regions/processes, deprioritize
regions/processes), such as VM_HUGEPAGE, would be inputs into such a program.
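
A minimal userspace sketch of the size selection described above, not
kernel code. It assumes a hypothetical array free_count[] holding the
number of free folios per order, and a boolean prefer_thp standing in for
"VM_HUGEPAGE is set". The names pick_folio_order and MAX_FOLIO_ORDER are
made up for illustration; a real policy would live in the fault path or in
an ebpf program.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: order 9 is the largest folio order tracked in this model. */
#define MAX_FOLIO_ORDER 9

/*
 * Pick the folio order to use for a page fault.
 *
 * free_count[order] is the number of free folios of that order
 * (hypothetical input, not a kernel data structure).
 * prefer_thp models "VM_HUGEPAGE set": take the largest free order.
 * Otherwise take the smallest free order, so large folios stay
 * available for processes that should get them.
 *
 * Returns the chosen order, or -1 if no folio is free.
 */
int pick_folio_order(const size_t free_count[MAX_FOLIO_ORDER + 1],
		     bool prefer_thp)
{
	assert(free_count != NULL);

	if (prefer_thp) {
		/* Scan from the largest order down. */
		for (int order = MAX_FOLIO_ORDER; order >= 0; order--)
			if (free_count[order] > 0)
				return order;
	} else {
		/* Scan from the smallest order up. */
		for (int order = 0; order <= MAX_FOLIO_ORDER; order++)
			if (free_count[order] > 0)
				return order;
	}
	return -1;
}
```

With free folios only at orders 2 and 9, a prioritized process would get
order 9 and a deprioritized one order 2.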
> 
>>
>> 2) prctl(): the unloved child, and I can understand why. Maybe now is
>>     the right time to stop adding new MM things that feel weird in there.
>>     Maybe we should already have done that with the KSM toggle (guess who
>>     was involved in that ;) ).
> 
> At the moment systemd is the user I know of and I think it would be very
> easy to migrate it to whatever new thing we decide here.

Agreed.

> 
>>
>> 3) process_madvise(): I think it's an interesting extension, but
>>     probably we should just have something that applies to the whole
>>     address space naturally. At least my take for now.
>>
>> 4) new syscall: worth exploring how it would look. I'm especially
>>     interested in flag options (e.g., SET_DEFAULT_EXEC) and how we could
>>     make them only apply to selected controls.
> 
> Was there any previous discussion on SET_DEFAULT_EXEC? This is the
> first time I am hearing about it.

I think it evolved in the discussion here from PMADV_SET_FORK_EXEC_DEFAULT.

> 
> Overall I agree with your assessment and thus I was requesting to at
> least discuss the new syscall option as well.

Yes.

I am still not sure if having a new "process" [1] mode would be a
reasonable alternative to setting the VM_HUGEPAGE/VM_NOHUGEPAGE default.
Assuming we would have a "process" mode, we could (a) set the policy
per-process using the new syscall we discuss here, and have options to
(b) set the policy to use for the exec child and (c) maybe seal the
policy (depending on who is allowed to set the policy in the first
place).
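
To make (a), (b) and (c) concrete, here is a small userspace model of how
such per-process state could behave. Everything in it is hypothetical:
the enum values, struct proc_thp_state and the proc_thp_*() helpers do
not exist in the kernel and are not a proposed syscall interface. In
particular, letting the exec child inherit the seal is an assumption made
only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical policy values; none of these names exist in the kernel. */
enum thp_policy { THP_POLICY_SYSTEM, THP_POLICY_ALWAYS, THP_POLICY_NEVER };

struct proc_thp_state {
	enum thp_policy policy;      /* (a) policy of the running process */
	enum thp_policy exec_policy; /* (b) policy handed to the exec child */
	bool sealed;                 /* (c) once set, changes are refused */
};

/* Change both policies; returns 0 on success, -1 if the state is sealed. */
int proc_thp_set(struct proc_thp_state *st, enum thp_policy policy,
		 enum thp_policy exec_policy)
{
	assert(st != NULL);
	if (st->sealed)
		return -1;
	st->policy = policy;
	st->exec_policy = exec_policy;
	return 0;
}

/* Seal the state so later proc_thp_set() calls fail. */
void proc_thp_seal(struct proc_thp_state *st)
{
	assert(st != NULL);
	st->sealed = true;
}

/*
 * Model exec: the child starts with the parent's exec_policy.
 * Assumption for this sketch only: the seal is inherited as well.
 */
struct proc_thp_state proc_thp_exec(const struct proc_thp_state *parent)
{
	assert(parent != NULL);
	struct proc_thp_state child = {
		.policy = parent->exec_policy,
		.exec_policy = parent->exec_policy,
		.sealed = parent->sealed,
	};
	return child;
}
```

The model only tracks the process-wide policy. Per-VMA hints such as
VM_HUGEPAGE/VM_NOHUGEPAGE are deliberately left out, since a process-wide
policy would not overwrite them.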

On the + side, we don't lose hints/instructions from the app 
(VM_HUGEPAGE/VM_NOHUGEPAGE) when changing the policy on an already 
running process.

The problem I see with the "process" policy is that people might want 
different "default" policies for processes, which means that we will 
have to add yet another toggle.


How I hate THP toggles. :)

[1] 
https://lore.kernel.org/all/CALOAHbB-KQ4+z-Lupv7RcxArfjX7qtWcrboMDdT4LdpoTXOMyw@mail.gmail.com/

-- 
Cheers,

David / dhildenb

