lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CABi2SkXArA+yfodoOxDbTTOpWCbU5Ce1p1HadSo0=CL23K4-dQ@mail.gmail.com>
Date: Thu, 17 Oct 2024 13:57:21 -0700
From: Jeff Xu <jeffxu@...omium.org>
To: Pedro Falcato <pedro.falcato@...il.com>
Cc: akpm@...ux-foundation.org, keescook@...omium.org, 
	torvalds@...ux-foundation.org, usama.anjum@...labora.com, corbet@....net, 
	Liam.Howlett@...cle.com, lorenzo.stoakes@...cle.com, jeffxu@...gle.com, 
	jorgelo@...omium.org, groeck@...omium.org, linux-kernel@...r.kernel.org, 
	linux-kselftest@...r.kernel.org, linux-mm@...ck.org, jannh@...gle.com, 
	sroettger@...gle.com, linux-hardening@...r.kernel.org, willy@...radead.org, 
	gregkh@...uxfoundation.org, deraadt@...nbsd.org, surenb@...gle.com, 
	merimus@...gle.com, rdunlap@...radead.org, stable@...r.kernel.org
Subject: Re: [PATCH v1 1/2] mseal: Two fixes for madvise(MADV_DONTNEED) when sealed

On Thu, Oct 17, 2024 at 1:49 PM Pedro Falcato <pedro.falcato@...il.com> wrote:
>
> On Thu, Oct 17, 2024 at 01:34:53PM -0700, Jeff Xu wrote:
> > Hi Pedro
> >
> > On Thu, Oct 17, 2024 at 12:37 PM Pedro Falcato <pedro.falcato@...il.com> wrote:
> > >
> > > > For PROT_NONE mappings, the previous blocking of
> > > > madvise(MADV_DONTNEED) is unnecessary. As PROT_NONE already prohibits
> > > > memory access, madvise(MADV_DONTNEED) should be allowed to proceed in
> > > > order to free the page.
> > >
> > > I don't get it. Is there an actual use case for this?
> > >
> > Sealing should not over-blocking API that it can allow to pass without
> > security concern, this is a case in that principle.
>
> Well, making the interface simple is also important. OpenBSD's mimmutable()
> doesn't do any of this and it Just Works(tm)...
>
> >
> > There is a user case for this as well: to seal NX stack on android,
> > Android uses PROT_NONE/madvise to set up a guide page to prevent stack
> > run over boundary. So we need to let madvise to pass.
>
> And you need to MADV_DONTNEED this guard page?
>
Yes.

> >
> > > > For file-backed, private, read-only memory mappings, we previously did
> > > > not block the madvise(MADV_DONTNEED). This was based on
> > > > the assumption that the memory's content, being file-backed, could be
> > > > retrieved from the file if accessed again. However, this assumption
> > > > failed to consider scenarios where a mapping is initially created as
> > > > read-write, modified, and subsequently changed to read-only. The newly
> > > > introduced VM_WASWRITE flag addresses this oversight.
> > >
> > > We *do not* need this. It's sufficient to just block discard operations on read-only
> > > private mappings.
> > I think you meant blocking madvise(MADV_DONTNEED) on all read-only
> > private file-backed mappings.
> >
> > I considered that option, but there is a use case for madvise on those
> > mappings that never get modified.
> >
> > Apps can use that to free up RAM. e.g. Considering read-only .text
> > section, which never gets modified, madvise( MADV_DONTNEED) can free
> > up RAM when memory is in-stress, memory will be reclaimed from a
> > backed-file on next read access. Therefore we can't just block all
> > read-only private file-backed mapping, only those that really need to,
> > such as mapping changed from rw=>r (what you described)
>
> Does anyone actually do this? If so, why? WHYYYY?
>
This is a legit use case, I can't argue that it isn't.

> The kernel's page reclaim logic should be perfectly cromulent. Please don't do this.
> MADV_DONTNEED will also not free any pages if those are shared (rather they'll just be unmapped).
>
> If we really need to do this, I'd maybe suggest walking through page tables, looking for
> anon ptes or swap ptes (maybe inside the actual zap code?). But I would really prefer if we
> didn't need to do this.
>
I also considered this route, but it is too complicated. The
copy-on-write pages can be put into a swap file, also there is a huge
page to consider, etc, The complication makes it really difficult to
code it right, also scanning those pages on per VMA level will require
lock and also impact performance.


> --
> Pedro

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ