lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <692f9624-e440-4cf2-8202-861c679ddb73@lucifer.local>
Date: Fri, 25 Jul 2025 08:01:59 +0100
From: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
To: David Hildenbrand <david@...hat.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
        "Liam R . Howlett" <Liam.Howlett@...cle.com>,
        Vlastimil Babka <vbabka@...e.cz>, Jann Horn <jannh@...gle.com>,
        Pedro Falcato <pfalcato@...e.de>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, Jeff Xu <jeffxu@...omium.org>
Subject: Re: [PATCH v3 2/5] mm/mseal: update madvise() logic

On Fri, Jul 25, 2025 at 12:12:54AM +0200, David Hildenbrand wrote:
> On 16.07.25 19:38, Lorenzo Stoakes wrote:
> > The madvise() logic is inexplicably performed in mm/mseal.c - this ought
> > to be located in mm/madvise.c.
> >
> > Additionally can_modify_vma_madv() is inconsistently named and, in
> > combination with is_ro_anon(), is very confusing logic.
> >
> > Put a static function in mm/madvise.c instead - can_madvise_modify() -
> > that spells out exactly what's happening.  Also explicitly check for an
> > anon VMA.
> >
> > Also add commentary to explain what's going on.
> >
> > Essentially - we disallow discarding of data in mseal()'d mappings in
> > instances where the user couldn't otherwise write to that data.
> >
> > Shared mappings are always backed, so no discard will actually truly
> > discard the data.  Read-only anonymous and MAP_PRIVATE file-backed
> > mappings are the ones we are interested in.
> >
> > We make a change to the logic here to correct a mistake - we must disallow
> > discard of read-only MAP_PRIVATE file-backed mappings, which previously we
> > were not.
> >
> > The justification for this change is to account for the case where:
> >
> > 1. A MAP_PRIVATE R/W file-backed mapping is established.
> > 2. The mapping is written to, which backs it with anonymous memory.
> > 3. The mapping is mprotect()'d read-only.
> > 4. The mapping is mseal()'d.
>
> Thinking about this a bit (should have realized this implication earlier)

Well none of us did...

> ... assuming we have:
>
> 1. A MAP_PRIVATE R/O file-backed mapping.
> 2. The mapping is mseal()'d.
>
> We only really have anon folios in there with things like (a) uprobe (b)
> debugger access (c) similarly weird FOLL_FORCE stuff.
>
> Now, most executables/libraries are mapped that way. If someone would rely
> on MADV_DONTNEED to zap pages in there (to free up memory), that would get
> rejected.

Right, yes.

This is odd behaviour to me. But I guess this is what Jeff meant by 'detecting
this' in android.

The documentation is really not specific enough, we need to fix that. It's
effectively stating any anon mappings are sealed, which is just not true with
existing semantics.

However I see:

	Memory sealing can automatically be applied by the runtime loader to
	seal .text and .rodata pages and applications can additionally seal
	security critical data at runtime.

So yes, we're going to break MADV_DONTNEED of this mappings.

BUT.

Would you really want to MADV_DONTNEED away uprobes etc.?? That seems... very
strange and broken behaviour no?

Note that, also, mappings of read-only files have VM_SHARED stripped. So they
become read-only (With ~VM_MAYWRITE).

To be clear this is where the mode of the file is read-only, not that the
mapping is read-only alone.

So with this change, we'd disallow discard of this.

It'd be pretty odd to mseal() a read-only file-backed mapping and then try to
discard, but maybe somebody would weirdly rely upon this?

It's inconsistent, as a person MAP_SHARED mapping a file that is read/write but
mapped read-only (or r/w of course), can discard fine eve if sealed, but if the
file happens to be read-only can't.

But we could add a VM_MAYWRITE check also.

OK maybe I"m softening on the anon_vma thing see below.

So we could combine these checks to avoid these issues.


>
> Does something like that rely on MADV_DONTNEED working? Good question.

Kees/Jeff? Can you check if android relies on this?

>
> Checking for anon_vma in addition, ad mentioned in the other thread, would
> be a "cheap" check to rule out that there are currently anon vmas in there.
>
> Well, not 100% reliable, because MADV_DONTNEED can race with page faults ...

But hang on, it's read-only so we shouldn't get racing faults... right?

Hmm maybe I'll soften on this anon_vma idea then. Maybe it is a 'cheap fix' to
rule out the _usual_ cases.

We're not changing zapping logic for this though, sorry. That's just a crazy
length to go to.

>
> --
> Cheers,
>
> David / dhildenb
>

In any case, I'm going to send a version of this with the controversial bit
stripped so we (hopefully) land the refactorings for 6.17 and can change
semantics if necessary for 6.18.

Cheers, Lorenzo

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ