linux-kernel - Re: [RFC PATCH 4/5] mm/madvise: add PMADV_SET_FORK_EXEC_DEFAULT process


Message-ID: <da1281bb-e49a-40f9-ac11-f976358e618e@lucifer.local>
Date: Tue, 20 May 2025 11:21:33 +0100
From: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
To: Pedro Falcato <pfalcato@...e.de>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
        "Liam R . Howlett" <Liam.Howlett@...cle.com>,
        David Hildenbrand <david@...hat.com>, Vlastimil Babka <vbabka@...e.cz>,
        Jann Horn <jannh@...gle.com>, Arnd Bergmann <arnd@...db.de>,
        Christian Brauner <brauner@...nel.org>, linux-mm@...ck.org,
        linux-arch@...r.kernel.org, linux-kernel@...r.kernel.org,
        SeongJae Park <sj@...nel.org>, Usama Arif <usamaarif642@...il.com>
Subject: Re: [RFC PATCH 4/5] mm/madvise: add PMADV_SET_FORK_EXEC_DEFAULT
 process_madvise() flag

On Tue, May 20, 2025 at 09:38:50AM +0100, Pedro Falcato wrote:
> On Mon, May 19, 2025 at 09:52:41PM +0100, Lorenzo Stoakes wrote:
> > It's useful in certain cases to be able to default-enable an madvise() flag
> > for all newly mapped VMAs, and for that to survive fork/exec.
> >
> > The natural place to specify something like this is in an madvise()
> > invocation, and thus providing this functionality as a flag to
> > process_madvise() makes sense.
> >
> > We intentionally limit this only to flags that we know should function
> > correctly without issue, and to be conservative about this, so we initially
> > limit ourselves only to MADV_HUGEPAGE, MADV_NOHUGEPAGE, that is - setting
> > the VM_HUGEPAGE, VM_NOHUGEPAGE VMA flags.
> >
> > We implement this functionality by using the mm_struct->def_flags field.
>
> This seems super specific. How about this:
>
> - PMADV_FUTURE (mirrors MCL_FUTURE). This only applies the flag to future VMAs in the current process.
> - PMADV_INHERIT_FORK. This makes it so the flag is propagated to child processes (does not imply PMADV_FUTURE)
> - PMADV_INHERIT_EXEC. This makes it so the flag is propagated through the execve boundary
>   (and this is where we'd filter for 'safe' flags, at least through the secureexec boundary). Does not imply
>   FUTURE nor INHERIT_FORK.

I don't know how we could implement separate current-process, fork, exec and
fork/exec modes. mm->def_flags is propagated across fork and exec automatically.
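
To illustrate the point, here is a toy user-space model (not kernel code; the
struct, function names, bit values and the inheritance mask are all made up for
illustration). It assumes a single per-mm field of default flags that is OR'd
into every new mapping and carried into the new mm through one masked copy, so
there is no natural place to tell "fork only" apart from "exec only":

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit values, for illustration only. */
#define TOY_VM_HUGEPAGE    (1u << 0)
#define TOY_VM_NOHUGEPAGE  (1u << 1)
/* Assumption: only these bits are allowed to survive into a new mm. */
#define TOY_INHERIT_MASK   (TOY_VM_HUGEPAGE | TOY_VM_NOHUGEPAGE)

struct toy_mm {
	uint32_t def_flags;	/* OR'd into every newly created mapping */
};

/* Replace the current default with one of the permitted flags (or 0). */
static void toy_set_default(struct toy_mm *mm, uint32_t flag)
{
	assert((flag & ~TOY_INHERIT_MASK) == 0);
	mm->def_flags = (mm->def_flags & ~TOY_INHERIT_MASK) | flag;
}

/* A new mapping starts from the requested flags plus the mm-wide defaults. */
static uint32_t toy_new_vma_flags(const struct toy_mm *mm, uint32_t requested)
{
	return requested | mm->def_flags;
}

/*
 * Model of building a new mm from an old one. The same masked copy is
 * used whichever way the new mm comes about, so there is one field and
 * one rule rather than a separate policy per mode.
 */
static struct toy_mm toy_mm_inherit(const struct toy_mm *parent)
{
	struct toy_mm child = { .def_flags = parent->def_flags & TOY_INHERIT_MASK };

	return child;
}
```

Supporting independent fork-only or exec-only behaviour in a model like this
would need either extra state next to def_flags or a separate copy rule per
path, which is the additional mechanism discussed below.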

And again on the security stuff, I think the correct answer is to require sys
admin capability to be able to use this option _at all_. This simplifies
everything.

To have this kind of thing we'd have to add a whole new mechanism, literally
just for this, and I'd really rather not generate brand new mm_struct flags for
every possible mode (in fact that would probably make the whole thing
intractable), or add a new field there for this.

The idea is that we get the advantages of an improved madvise interface, while
also providing the interface Usama wants without having to add some hideous
prctl() whose logic is disconnected from the rest of madvise(), while being, in
effect, a 'default madvise() for new mappings'.

So while this is specific to the case, nothing prevents us from adding more
functionality in future if we want.

We could also potentially:

- add PMADV_SET_DEFAULT (I'm iffy about PMADV_FUTURE... but whichever we go with)
- add PMADV_INHERIT_FORK
- add PMADV_INHERIT_EXEC

And only support PMADV_SET_DEFAULT | PMADV_INHERIT_FORK | PMADV_INHERIT_EXEC for
now.

Then we could have the security semantics you specify (require CAP_SYS_ADMIN on
PMADV_INHERIT_EXEC) but have that propagate to the only supported case.
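
A minimal sketch of that check, as a user-space model rather than kernel code.
The flag values are invented (only the names come from this thread), and the
capability test is reduced to a boolean parameter as a stand-in for a real
CAP_SYS_ADMIN check:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical values; this thread only proposes the names. */
#define PMADV_SET_DEFAULT	(1u << 0)
#define PMADV_INHERIT_FORK	(1u << 1)
#define PMADV_INHERIT_EXEC	(1u << 2)

#define PMADV_DEFAULT_FLAGS \
	(PMADV_SET_DEFAULT | PMADV_INHERIT_FORK | PMADV_INHERIT_EXEC)

/*
 * Returns 0 if the default-setting flags are acceptable, or a negative
 * errno otherwise. has_cap_sys_admin stands in for a capability check.
 */
static int toy_check_default_flags(unsigned int flags, bool has_cap_sys_admin)
{
	unsigned int req = flags & PMADV_DEFAULT_FLAGS;

	if (req == 0)
		return 0;		/* no default-setting requested */
	if (req != PMADV_DEFAULT_FLAGS)
		return -EINVAL;		/* partial combinations unsupported for now */
	if (!has_cap_sys_admin)
		return -EPERM;		/* the set includes PMADV_INHERIT_EXEC */
	return 0;
}
```

Because the only accepted combination always includes PMADV_INHERIT_EXEC, the
capability requirement ends up applying to every caller of the feature, while
leaving room to relax it for other combinations later.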

What do you think?

>
> and, while we're at it, rename PMADV_ENTIRE_ADDRESS_SPACE to PMADV_CURRENT, to align it with MCL_CURRENT.

I'm not sure making the mlock()/madvise() stuff analogous is a good idea, as
they have different semantics. I'd rather keep these flags descriptive. Though
I'm open to alternative naming of course...

Also keep in mind these flags are for mlockall(), whose name already tells you
you're locking everything :)

>
> Thoughts?
>
> --
> Pedro