linux-kernel - Re: [NEEDS-REVIEW] Re: [PATCH v11 25/25] x86/cet/shstk: Add arch

Message-Id: <b5b5787b-17ce-4e66-8bc6-ab42ae3e398d@www.fastmail.com>
Date:   Mon, 20 Sep 2021 09:48:10 -0700
From:   "Andy Lutomirski" <luto@...nel.org>
To:     "Rick P Edgecombe" <rick.p.edgecombe@...el.com>,
        "Dave Hansen" <dave.hansen@...el.com>
Cc:     "Balbir Singh" <bsingharora@...il.com>,
        "H. Peter Anvin" <hpa@...or.com>,
        "Eugene Syromiatnikov" <esyr@...hat.com>,
        "Peter Zijlstra (Intel)" <peterz@...radead.org>,
        "Randy Dunlap" <rdunlap@...radead.org>,
        "Kees Cook" <keescook@...omium.org>,
        "Yu-cheng Yu" <yu-cheng.yu@...el.com>,
        "Dave Hansen" <dave.hansen@...ux.intel.com>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>,
        "Florian Weimer" <fweimer@...hat.com>,
        "Nadav Amit" <nadav.amit@...il.com>,
        "Jann Horn" <jannh@...gle.com>,
        "linux-arch@...r.kernel.org" <linux-arch@...r.kernel.org>,
        "kcc@...gle.com" <kcc@...gle.com>,
        "Borislav Petkov" <bp@...en8.de>,
        "Oleg Nesterov" <oleg@...hat.com>, "H.J. Lu" <hjl.tools@...il.com>,
        "Pavel Machek" <pavel@....cz>,
        "linux-doc@...r.kernel.org" <linux-doc@...r.kernel.org>,
        "Weijiang Yang" <weijiang.yang@...el.com>,
        "Arnd Bergmann" <arnd@...db.de>,
        "Moreira, Joao" <joao.moreira@...el.com>,
        "Thomas Gleixner" <tglx@...utronix.de>,
        "Mike Kravetz" <mike.kravetz@...cle.com>,
        "the arch/x86 maintainers" <x86@...nel.org>,
        "tarasmadan@...gle.com" <tarasmadan@...gle.com>,
        "Dave Martin" <Dave.Martin@....com>,
        "vedvyas.shanbhogue@...el.com" <vedvyas.shanbhogue@...el.com>,
        "Ingo Molnar" <mingo@...hat.com>,
        "Shankar, Ravi V" <ravi.v.shankar@...el.com>,
        "Jonathan Corbet" <corbet@....net>,
        "Linux Kernel Mailing List" <linux-kernel@...r.kernel.org>,
        "Linux API" <linux-api@...r.kernel.org>,
        "Cyrill Gorcunov" <gorcunov@...il.com>
Subject: Re: [NEEDS-REVIEW] Re: [PATCH v11 25/25] x86/cet/shstk: Add arch_prctl functions for shadow stack



On Mon, Sep 13, 2021, at 6:33 PM, Edgecombe, Rick P wrote:
> On Mon, 2020-09-14 at 11:31 -0700, Andy Lutomirski wrote:
> > > On Sep 14, 2020, at 7:50 AM, Dave Hansen <dave.hansen@...el.com>
> > > wrote:
> > > 
> > > On 9/11/20 3:59 PM, Yu-cheng Yu wrote:
> > > ...
> > > > Here are the changes if we take the mprotect(PROT_SHSTK)
> > > > approach.
> > > > Any comments/suggestions?
> > > 
> > > I still don't like it. :)
> > > 
> > > I'll also be much happier when there's a proper changelog to
> > > accompany
> > > this which also spells out the alternatives and why they suck so
> > > much.
> > > 
> > 
> > Let’s take a step back here. Ignoring the precise API, what exactly
> > is
> > a shadow stack from the perspective of a Linux user program?
> > 
> > The simplest answer is that it’s just memory that happens to have
> > certain protections.  This enables all kinds of shenanigans.  A
> > program could map a memfd twice, once as shadow stack and once as
> > non-shadow-stack, and change its control flow.  Similarly, a program
> > could mprotect its shadow stack, modify it, and mprotect it back.  In
> > some threat models, this could be seen as a WRSS bypass.  (Although
> > if an attacker can coerce a process to call mprotect(), the game is
> > likely mostly over anyway.)
> > 
> > But we could be more restrictive, or perhaps we could allow user code
> > to opt into more restrictions.  For example, we could have shadow
> > stacks be special memory that cannot be written from usermode by any
> > means other than ptrace() and friends, WRSS, and actual shadow stack
> > usage.
> > 
> > What is the goal?
> > 
> > No matter what we do, the effects of calling vfork() are going to be
> > a
> > bit odd with SHSTK enabled.  I suppose we could disallow this, but
> > that seems likely to cause its own issues.
> 
> Hi,
> 
> Resurrecting this old thread to highlight a consequence of the design
> change that came out of it. I am going to be taking over this series
> from Yu-cheng, and wanted to check if people would be interested in re-
> visiting this interface.
> 
> The consequence I wanted to highlight, is that making userspace be
> responsible for mapping memory as shadow stack, also requires moving
> the writing of the restore token to userspace for glibc ucontext
> operations. Since these operations involve creating/pivoting to new
> stacks in userspace, ucontext cet support involves also creating a new
> shadow stack. For normal thread stacks, the kernel has always done the
> shadow stack allocation and so it is never writable (in the normal
> sense) from userspace. But after this change makecontext() now first
> has to mmap() writable memory, then write the restore token, then
> mprotect() it as shadow stack. See the glibc changes to support
> PROT_SHADOW_STACK here[0].
> 
> The writable window leaves an opening for an attacker to create an
> arbitrary shadow stack that could be pivoted to later by tweaking the
> ucontext_t structure. To try to see how much this matters, we have done
> a small test that uses this window to ROP from writes in another
> thread during the makecontext()/setcontext() window. (offensive work
> credit to Joao on CC). This would require a real app to already be
> using ucontext in the course of normal runtime.

My general opinion here (take this with a grain of salt -- I haven't paged back in every single detail) is that the kernel should make it straightforward for a libc to do the right thing without nasty races, cross-thread coordination, or unnecessary permission to write to the stack.  I *also* think that it should be possible for userspace to manage its own shadow stack allocation if it wants to, since I'm sure there will be JIT or green thread or other use cases that want to do crazy things that we fail to anticipate with in-kernel magic.

So perhaps we should keep the explicit allocation and free operations, have a way to opt in to WRSS being flipped on, but also do our best to have APIs that handle the known cases well.

Does that make sense?  Can we have both approaches work in the same kernel?
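
For readers following the thread, the userspace sequence described in the quoted message (mmap() writable memory, write the restore token, then mprotect() it as shadow stack) can be sketched roughly as below. This is only an illustration under stated assumptions, not the glibc or kernel code: sketch_alloc_shstk() and SKETCH_PROT_SHSTK are hypothetical names, PROT_READ stands in for the proposed PROT_SHADOW_STACK flag so the sketch builds on an ordinary system, and the token layout (the address just above the token, with bit 0 set for 64-bit mode) is an assumption about the x86-64 restore token format.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Hypothetical stand-in: the proposed PROT_SHADOW_STACK flag is not assumed
 * to exist in the installed headers, so PROT_READ is used instead. */
#define SKETCH_PROT_SHSTK PROT_READ

/* Hypothetical helper: maps `size` bytes, writes a restore token in the top
 * 8-byte slot, then drops write permission. Returns the mapping base, or
 * NULL on failure; *token_out receives the token's address. */
void *sketch_alloc_shstk(size_t size, uint64_t **token_out)
{
    /* Step 1: ordinary writable memory. The window discussed in the thread
     * opens here: any thread can write to this region. */
    void *base = mmap(NULL, size, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    /* Step 2: write the restore token. Assumed layout: the address
     * immediately above the token, with bit 0 set (64-bit mode). */
    uint64_t *token = (uint64_t *)((char *)base + size) - 1;
    *token = (uint64_t)(uintptr_t)(token + 1) | 1u;

    /* Step 3: change the protection. Only now does the window close. */
    if (mprotect(base, size, SKETCH_PROT_SHSTK) != 0) {
        munmap(base, size);
        return NULL;
    }

    *token_out = token;
    return base;
}
```

Everything between step 1 and step 3 is the writable window: another thread that can write to the mapping during that interval can plant arbitrary contents before the protection changes. A kernel-side allocation that writes the token itself would not need steps 1 and 2 in userspace at all.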