linux-kernel - Re: [PATCH v7 0/4] Introduce mseal()

Message-ID: <85359.1706036321@cvs.openbsd.org>
Date: Tue, 23 Jan 2024 11:58:41 -0700
From: "Theo de Raadt" <deraadt@...nbsd.org>
To: "Liam R. Howlett" <Liam.Howlett@...cle.com>,
    Jeff Xu <jeffxu@...omium.org>, akpm@...ux-foundation.org,
    keescook@...omium.org, jannh@...gle.com, sroettger@...gle.com,
    willy@...radead.org, gregkh@...uxfoundation.org,
    torvalds@...ux-foundation.org, usama.anjum@...labora.com,
    rdunlap@...radead.org, jeffxu@...gle.com, jorgelo@...omium.org,
    groeck@...omium.org, linux-kernel@...r.kernel.org,
    linux-kselftest@...r.kernel.org, linux-mm@...ck.org,
    pedro.falcato@...il.com, dave.hansen@...el.com,
    linux-hardening@...r.kernel.org
Subject: Re: [PATCH v7 0/4] Introduce mseal()

Liam R. Howlett <Liam.Howlett@...cle.com> wrote:

> * Theo de Raadt <deraadt@...nbsd.org> [240122 17:35]:
> > Jeff Xu <jeffxu@...omium.org> wrote:
> > 
> > > On Mon, Jan 22, 2024 at 7:49 AM Theo de Raadt <deraadt@...nbsd.org> wrote:
> > > >
> > > > Regarding these pieces
> > > >
> > > > > The PROT_SEAL bit in prot field of mmap(). When present, it marks
> > > > > the map sealed since creation.
> > > >
> > > > OpenBSD won't be doing this.  I had PROT_IMMUTABLE as a draft.  In my
> > > > research I found basically zero circumstances where userland does
> > > > that.  The most common circumstance is you create a RW mapping, fill it,
> > > > and then change to a more restrictive mapping, and lock it.
> > > >
> > > > There are a few regions in the address space that can be locked while RW.
> > > > For instance, the stack.  But the kernel does that, not userland.  I
> > > > found regions where the kernel wants to do this to the address space,
> > > > but there is no need to export useless functionality to userland.
> > > >
> > > I have a feeling that most apps that need to use mmap() in their code
> > > are likely using RW mappings. Adding sealing to mmap() could stop
> > > those mappings from being executable. Of course, those apps would
> > > need to change their code. We can't do it for them.
> > 
> > I don't have a feeling about it.
> > 
> > I spent a year engineering a complete system which exercises the maximum
> > amount of memory you can lock.
> > 
> > I saw nothing like what you are describing.  I had PROT_IMMUTABLE in my
> > drafts, and saw it turning into a dangerous anti-pattern.
> > 
> > > Also, I believe adding this to mmap() has no downsides, only
> > > performance gain, as Pedro Falcato pointed out in [1].
> > > 
> > > [1] https://lore.kernel.org/lkml/CAKbZUD2A+=bp_sd+Q0Yif7NJqMu8p__eb4yguq0agEcmLH8SDQ@mail.gmail.com/
> > 
> > Are you joking?  You don't have any code doing that today.  More feelings?
> 
> The "no downside" applies to combining two calls together, mmap() & mseal(),
> at least that is how I read the linked discussion.
> 
> The common case (since there are no users today) of just calling
> mmap()/munmap() will have the downside.
> 
> There will be a performance impact once you have can_modify_mm() doing
> more than just returning true.  Certainly, the impact will be larger
> in munmap where multiple VMAs may need to be checked (assuming that's
> the plan?).
> 
> This will require a new and earlier walk of the vma tree while holding
> the mmap_lock.  Since you are checking (potentially multiple) VMAs for
> something, I don't think there is a way around holding the lock.
> 
> I'm not saying the cost will be large, but it will be a positive
> non-zero number.

For future glibc changes, I predict you will have zero cases where you
can call mmap+immutable or mprotect+immutable.  I say so because I ended
up having none.  You always have to fill the memory.  (At first glance
you might think it works for a new DSO's BSS, but RELRO overlaps it, and
since RELRO mprotect happens quite late, the permission locking is quite
delayed relative to the allocation).
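
A minimal sketch of the "create RW, fill, restrict, then lock" sequence
described above, for illustration only.  The helper name
fill_then_restrict() is hypothetical.  The lock step assumes a Linux
kernel and headers that define __NR_mseal and take (addr, len, flags)
with flags set to 0; where that is not available the step is skipped
and the mapping is left read-only but unsealed.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for syscall() and MAP_ANONYMOUS */
#endif
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy len bytes from src into a fresh anonymous
 * mapping, then drop the mapping to read-only and try to lock it.
 * Returns the mapping, or NULL on failure.  *sealed is set to 1 only
 * if the kernel accepted the lock request.
 */
static void *fill_then_restrict(const void *src, size_t len, int *sealed)
{
    long page = sysconf(_SC_PAGESIZE);
    size_t maplen;
    void *p;

    assert(src != NULL && sealed != NULL);
    *sealed = 0;
    if (len == 0 || page <= 0)
        return NULL;
    maplen = ((len + (size_t)page - 1) / (size_t)page) * (size_t)page;

    /* 1. Allocate writable, because the memory still has to be filled. */
    p = mmap(NULL, maplen, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    /* 2. Fill it. */
    memcpy(p, src, len);

    /* 3. Change to the more restrictive permission. */
    if (mprotect(p, maplen, PROT_READ) == -1) {
        munmap(p, maplen);
        return NULL;
    }

    /* 4. Lock it.  Only attempted when the headers know the syscall;
     *    a failure (for example ENOSYS on an older kernel) is not fatal. */
#ifdef __NR_mseal
    if (syscall(__NR_mseal, p, maplen, 0UL) == 0)
        *sealed = 1;
#endif
    return p;
}
```

Sealing at mmap() time would have to happen at step 1, before step 2
has had a chance to write anything.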

I think Chrome also won't lock memory at allocation.  I suspect the
generic allocator is quite separate from the code using the allocation,
which knows which objects can have their permissions locked and which
objects can't.

In OpenBSD, the only cases where we could set immutable at the same time
as creating the mapping was in execve, for a new process's stack regions,
and that is kernel code, not the userland exposed system call APIs.
 
This change could skip adding PROT_SEAL today, and add it later when
there are facts showing the need.


It's the same with MAP_MSEALABLE.  I don't get it. So now there are 3
memory types:
       - cannot be sealed, ever
       - not yet sealed
       - sealed

What purpose does the first type serve?  Please explain the use case.
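
To make the three types concrete, here is a toy model of them as a
state machine.  It is not kernel code: the enum, struct and function
names are all hypothetical, and treating a repeated seal as a
successful no-op is an assumption.

```c
/* Toy model only; every name here is hypothetical. */
enum seal_state {
    SEAL_NEVER,     /* cannot be sealed, ever */
    SEAL_POSSIBLE,  /* not yet sealed */
    SEAL_DONE       /* sealed */
};

struct region {
    enum seal_state state;
    int prot;       /* current permission bits */
};

/* Try to seal a region.  Returns 0 on success, -1 if refused. */
static int region_seal(struct region *r)
{
    switch (r->state) {
    case SEAL_NEVER:
        return -1;              /* the first type: the request is refused */
    case SEAL_POSSIBLE:
        r->state = SEAL_DONE;
        return 0;
    case SEAL_DONE:
        return 0;               /* assumed no-op when already sealed */
    }
    return -1;
}

/* Try to change permissions.  Returns 0 on success, -1 if refused. */
static int region_protect(struct region *r, int prot)
{
    if (r->state == SEAL_DONE)
        return -1;              /* sealed regions reject the change */
    r->prot = prot;
    return 0;
}
```

In this model a SEAL_NEVER region can have its permissions changed
forever and can never move to SEAL_DONE, which is the behaviour the
question is about.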

Today, processes have control over their entire address space.

What is the purpose of "permissions cannot be locked"?  Please supply
an example.  If I am wrong, I'd like to know where I went wrong.