linux-kernel - Re: [PATCH 00/35] Shadow stacks for userspace

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAMe9rOoyrWXe9Zuxoat74kPW=kdjWXvbcQY=5RFu2nJACDvnOQ@mail.gmail.com>
Date:   Sun, 6 Feb 2022 05:55:24 -0800
From:   "H.J. Lu" <hjl.tools@...il.com>
To:     David Laight <David.Laight@...lab.com>
Cc:     "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>,
        "bsingharora@...il.com" <bsingharora@...il.com>,
        "hpa@...or.com" <hpa@...or.com>,
        "Syromiatnikov, Eugene" <esyr@...hat.com>,
        "peterz@...radead.org" <peterz@...radead.org>,
        "rdunlap@...radead.org" <rdunlap@...radead.org>,
        "keescook@...omium.org" <keescook@...omium.org>,
        "Eranian, Stephane" <eranian@...gle.com>,
        "kirill.shutemov@...ux.intel.com" <kirill.shutemov@...ux.intel.com>,
        "dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>,
        "fweimer@...hat.com" <fweimer@...hat.com>,
        "nadav.amit@...il.com" <nadav.amit@...il.com>,
        "jannh@...gle.com" <jannh@...gle.com>,
        "kcc@...gle.com" <kcc@...gle.com>,
        "linux-arch@...r.kernel.org" <linux-arch@...r.kernel.org>,
        "pavel@....cz" <pavel@....cz>, "oleg@...hat.com" <oleg@...hat.com>,
        "Yang, Weijiang" <weijiang.yang@...el.com>,
        "Lutomirski, Andy" <luto@...nel.org>,
        "bp@...en8.de" <bp@...en8.de>, "arnd@...db.de" <arnd@...db.de>,
        "Moreira, Joao" <joao.moreira@...el.com>,
        "tglx@...utronix.de" <tglx@...utronix.de>,
        "mike.kravetz@...cle.com" <mike.kravetz@...cle.com>,
        "x86@...nel.org" <x86@...nel.org>,
        "linux-doc@...r.kernel.org" <linux-doc@...r.kernel.org>,
        "Dave.Martin@....com" <Dave.Martin@....com>,
        "john.allen@....com" <john.allen@....com>,
        "mingo@...hat.com" <mingo@...hat.com>,
        "Shankar, Ravi V" <ravi.v.shankar@...el.com>,
        "corbet@....net" <corbet@....net>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-api@...r.kernel.org" <linux-api@...r.kernel.org>,
        "gorcunov@...il.com" <gorcunov@...il.com>
Subject: Re: [PATCH 00/35] Shadow stacks for userspace

On Sun, Feb 6, 2022 at 5:42 AM David Laight <David.Laight@...lab.com> wrote:
>
> From: Edgecombe, Rick P
> > Sent: 05 February 2022 20:15
> >
> > On Sat, 2022-02-05 at 05:29 -0800, H.J. Lu wrote:
> > > On Sat, Feb 5, 2022 at 5:27 AM David Laight <David.Laight@...lab.com>
> > > wrote:
> > > >
> > > > From: Edgecombe, Rick P
> > > > > Sent: 04 February 2022 01:08
> > > > > Hi Thomas,
> > > > >
> > > > > Thanks for feedback on the plan.
> > > > >
> > > > > On Thu, 2022-02-03 at 22:07 +0100, Thomas Gleixner wrote:
> > > > > > > Until now, the enabling effort was trying to support both
> > > > > > > Shadow
> > > > > > > Stack and IBT.
> > > > > > > This history will focus on a few areas of the shadow stack
> > > > > > > development history
> > > > > > > that I thought stood out.
> > > > > > >
> > > > > > >        Signals
> > > > > > >        -------
> > > > > > >        Originally signals placed the location of the shadow
> > > > > > > stack
> > > > > > > restore
> > > > > > >        token inside the saved state on the stack. This was
> > > > > > > problematic from a
> > > > > > >        past ABI promises perspective. So the restore location
> > > > > > > was
> > > > > > > instead just
> > > > > > >        assumed from the shadow stack pointer. This works
> > > > > > > because in
> > > > > > > normal
> > > > > > >        allowed cases of calling sigreturn, the shadow stack
> > > > > > > pointer
> > > > > > > should be
> > > > > > >        right at the restore token at that time. There is no
> > > > > > > alternate shadow
> > > > > > >        stack support. If an alt shadow stack is added later
> > > > > > > we
> > > > > > > would
> > > > > > >        need to
> > > > > >
> > > > > > So how is that going to work? altstack is not an esoteric
> > > > > > corner
> > > > > > case.
> > > > >
> > > > > My understanding is that the main usages for the signal stack
> > > > > were
> > > > > handling stack overflows and corruption. Since the shadow stack
> > > > > only
> > > > > contains return addresses rather than large stack allocations,
> > > > > and is
> > > > > not generally writable or pivotable, I thought there was a good
> > > > > possibility an alt shadow stack would not end up being especially
> > > > > useful. Does it seem like reasonable guesswork?
> > > >
> > > > The other 'problem' is that it is valid to longjump out of a signal
> > > > handler.
> > > > These days you have to use siglongjmp() not longjmp() but it is
> > > > still used.
> > > >
> > > > It is probably also valid to use siglongjmp() to jump from a nested
> > > > signal handler into the outer handler.
> > > > Given both signal handlers can have their own stack, there can be
> > > > three
> > > > stacks involved.
> >
> > So the scenario is?
> >
> > 1. Handle signal 1
> > 2. sigsetjmp()
> > 3. signalstack()
> > 4. Handle signal 2 on alt stack
> > 5. siglongjmp()
> >
> > I'll check that it is covered by the tests, but I think it should work
> > in this series that has no alt shadow stack. I have only done a high
> > level overview of how the shadow stack stuff, that doesn't involve the
> > kernel, works in glibc. Sounds like I'll need to do a deeper dive.
>
> The posix/xopen definition for setjmp/longjmp doesn't require such
> longjmp requests to work.
>
> Although they still have to do something that doesn't break badly.
> Aborting the process is probably fine!
>
> > > > I think the shadow stack pointer has to be in ucontext - which also
> > > > means the application can change it before returning from a signal.
> >
> > Yes we might need to change it to support alt shadow stacks. Can you
> > elaborate why you think it has to be in ucontext? I was thinking of
> > looking at three options for storing the ssp:
> >  - Stored in the shadow stack like a token using WRUSS from the kernel.
> >  - Stored on the kernel side using a hashmap that maps ucontext or
> >    sigframe userspace address to ssp (this is of course similar to
> >    storing in ucontext, except that the user can’t change the ssp).
> >  - Stored writable in userspace in ucontext.
> >
> > But in this version, without alt shadow stacks, the shadow stack
> > pointer is not stored in ucontext. This causes the limitation that
> > userspace can only call sigreturn when it has returned back to a point
> > where there is a restore token on the shadow stack (which was placed
> > there by the kernel). This doesn’t mean it can’t switch to a different
> > shadow stack or handle a nested signal, but it limits the possibility
> > for calling sigreturn with a totally different sigframe (like CRIU and
> > SROP attacks do). It should hopefully be a helpful, protective
> > limitation for most apps and I'm hoping CRIU can be fixed without
> > removing it.
> >
> > I am not aware of other limitations to signals (besides normal shadow
> > stack enforcement), but I could be missing it. And people's skepticism
> > is making me want to go back over it with more scrutiny.
> >
> > > > In much the same way as all the segment registers can be changed
> > > > leading to all the nasty bugs when the final 'return to user' code
> > > > traps in kernel when loading invalid segment registers or executing
> > > > iret.
> >
> > I don't think this is as difficult to avoid because userspace ssp has
> > its own register that should not be accessed at that point, but I have
> > not given this aspect enough analysis. Thanks for bringing it up.
>
> So the user ssp isn't saved (or restored) by the trap entry/exit.
> So it needs to be saved by the context switch code?
> Much like the user segment registers?
> So you are likely to get the same problems if restoring it can fault
> in kernel (eg for a non-canonical address).
>
> > > > Hmmm... do shadow stacks mean that longjmp() has to be a system
> > > > call?
> > >
> > > No.  setjmp/longjmp save and restore shadow stack pointer.
>
> Ok, I was thinking that direct access to the user ssp would be
> a privileged operation.

User space can only pop shadow stack.  longjmp does

#ifdef SHADOW_STACK_POINTER_OFFSET
# if IS_IN (libc) && defined SHARED && defined FEATURE_1_OFFSET
/* Check if Shadow Stack is enabled.  */
testl $X86_FEATURE_1_SHSTK, %fs:FEATURE_1_OFFSET
jz L(skip_ssp)
# else
xorl %eax, %eax
# endif
/* Check and adjust the Shadow-Stack-Pointer.  */
/* Get the current ssp.  */
rdsspq %rax
/* And compare it with the saved ssp value.  */
subq SHADOW_STACK_POINTER_OFFSET(%rdi), %rax
je L(skip_ssp)
/* Count the number of frames to adjust and adjust it
   with incssp instruction.  The instruction can adjust
   the ssp by [0..255] value only thus use a loop if
   the number of frames is bigger than 255.  */
negq %rax
shrq $3, %rax
/* NB: We saved Shadow-Stack-Pointer of setjmp.  Since we are
       restoring Shadow-Stack-Pointer of setjmp's caller, we
       need to unwind shadow stack by one more frame.  */
addq $1, %rax

movl $255, %ebx
L(loop):
cmpq %rbx, %rax
cmovb %rax, %rbx
incsspq %rbx
subq %rbx, %rax
ja L(loop)

L(skip_ssp):
#endif

> If it can be written you don't really have to worry about what code
> is trying to do - it can actually do what it likes.
> It just catches unintentional operations (like buffer overflows).
>
> Was there any 'spare' space in struct jmpbuf ?

By pure luck, we have ONE spare space in sigjmp_buf.

> Otherwise you can only enable shadow stacks if everything has been
> recompiled - including any shared libraries the might be dlopen()ed.
> (or does the compiler invent an alloca() call somehow for a
> size that comes back from glibc?)
>
> I've never really considered how setjmp/longjmp handle callee saved
> register variables (apart from it being hard).
> The original pdp11 implementation probably only needed to save r6 and r7.
>
> What does happen to all the 'extended state' that XSAVE handles?
> IIRC all the AVX registers are caller saved (so should probably
> be zerod), but some of the SSE ones are callee saved, and one or
> two of the fpu flags are sticky and annoying enough the save/restore
> at the best of times.
>
> > It sounds like it would help to write up in a lot more detail exactly
> > how all the signal and specialer stack manipulation scenarios work in
> > glibc.
>
> Some cross references might have made people notice that the ucontext
> extensions for AVX512 (if not earlier ones) broke the minimal/default
> signal stack size.
>
>         David
>

-- 
H.J.