Message-ID: <CAMe9rOqwe8Mr2pkf0yopWj_F7yZLj9_nmz97+AmFkkmd2U=-fg@mail.gmail.com>
Date: Wed, 21 Jul 2021 13:14:37 -0700
From: "H.J. Lu" <hjl.tools@...il.com>
To: John Allen <john.allen@....com>
Cc: Yu-cheng Yu <yu-cheng.yu@...el.com>,
"the arch/x86 maintainers" <x86@...nel.org>,
"H. Peter Anvin" <hpa@...or.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>,
LKML <linux-kernel@...r.kernel.org>,
"open list:DOCUMENTATION" <linux-doc@...r.kernel.org>,
Linux-MM <linux-mm@...ck.org>,
linux-arch <linux-arch@...r.kernel.org>,
Linux API <linux-api@...r.kernel.org>,
Arnd Bergmann <arnd@...db.de>,
Andy Lutomirski <luto@...nel.org>,
Balbir Singh <bsingharora@...il.com>,
Borislav Petkov <bp@...en8.de>,
Cyrill Gorcunov <gorcunov@...il.com>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Eugene Syromiatnikov <esyr@...hat.com>,
Florian Weimer <fweimer@...hat.com>,
Jann Horn <jannh@...gle.com>, Jonathan Corbet <corbet@....net>,
Kees Cook <keescook@...omium.org>,
Mike Kravetz <mike.kravetz@...cle.com>,
Nadav Amit <nadav.amit@...il.com>,
Oleg Nesterov <oleg@...hat.com>, Pavel Machek <pavel@....cz>,
Peter Zijlstra <peterz@...radead.org>,
Randy Dunlap <rdunlap@...radead.org>,
"Ravi V. Shankar" <ravi.v.shankar@...el.com>,
Vedvyas Shanbhogue <vedvyas.shanbhogue@...el.com>,
Dave Martin <Dave.Martin@....com>,
Weijiang Yang <weijiang.yang@...el.com>,
Pengfei Xu <pengfei.xu@...el.com>,
Haitao Huang <haitao.huang@...el.com>
Subject: Re: [PATCH v27 24/31] x86/cet/shstk: Handle thread shadow stack

On Wed, Jul 21, 2021 at 11:15 AM John Allen <john.allen@....com> wrote:
>
> On Fri, May 21, 2021 at 03:12:04PM -0700, Yu-cheng Yu wrote:
> > diff --git a/arch/x86/kernel/shstk.c b/arch/x86/kernel/shstk.c
> > index 5ea2b494e9f9..8e5f772181b9 100644
> > --- a/arch/x86/kernel/shstk.c
> > +++ b/arch/x86/kernel/shstk.c
> > @@ -71,6 +71,53 @@ int shstk_setup(void)
> > return 0;
> > }
> >
> > +int shstk_alloc_thread_stack(struct task_struct *tsk, unsigned long clone_flags,
> > + unsigned long stack_size)
> > +{
> > + struct thread_shstk *shstk = &tsk->thread.shstk;
> > + struct cet_user_state *state;
> > + unsigned long addr;
> > +
> > + if (!stack_size)
> > + return -EINVAL;
>
> I've been doing some light testing on AMD hardware and I've found that
> this version of the patchset doesn't boot for me. It appears that when
> systemd processes start spawning, they hit the above case, return
> -EINVAL, and the fork fails. In these cases, copy_thread has been passed
> 0 for both sp and stack_size.
>
> For previous versions of the patchset, I can still boot. When the
> stack_size check was last, the function would always return before
> completing the check, hitting one of the two cases below.
>
> At the very least, it would seem that on some systems, it isn't valid to
> rely on the stack_size passed from clone3, though I'm unsure what the
> correct behavior should be here. If the passed stack_size == 0 and sp ==
> 0, is this a case where we want to alloc a shadow stack for this thread
> with some capped size? Alternatively, is this a case that isn't valid to
> alloc a shadow stack and we should simply return 0 instead of -EINVAL?
>
> I'm running Fedora 34 which satisfies the required versions of gcc,
> binutils, and glibc.
>
> Please let me know if there is any additional information I can provide.

FWIW, I have been maintaining stable CET kernels at:

https://github.com/hjl-tools/linux/

The current CET kernel is on the hjl/cet/linux-5.13.y branch.

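
To make the ordering issue John describes concrete, here is a minimal
userspace sketch (not kernel code) of the early-return checks. The names
check_v27, check_earlier and the MODEL_* macros are made up for
illustration, the get_xsave_addr() check is left out, and a return value
of 1 is only a stand-in for "would go on to allocate a shadow stack".
The "earlier" ordering is reconstructed from John's description that the
stack_size check used to come last.

```c
#include <errno.h>

/* Flag values as defined in include/uapi/linux/sched.h. */
#define MODEL_CLONE_VM    0x00000100UL
#define MODEL_CLONE_VFORK 0x00004000UL

/* Ordering in the quoted v27 hunk: stack_size is checked first. */
int check_v27(unsigned long clone_flags, unsigned long shstk_size,
              unsigned long stack_size)
{
	if (!stack_size)
		return -EINVAL;
	if (!shstk_size)
		return 0;	/* shadow stack not enabled for the task */
	if ((clone_flags & (MODEL_CLONE_VFORK | MODEL_CLONE_VM)) != MODEL_CLONE_VM)
		return 0;	/* fork or vfork: no separate shadow stack */
	return 1;		/* model only: allocation would follow */
}

/* Ordering as described for earlier versions: stack_size checked last. */
int check_earlier(unsigned long clone_flags, unsigned long shstk_size,
                  unsigned long stack_size)
{
	if (!shstk_size)
		return 0;
	if ((clone_flags & (MODEL_CLONE_VFORK | MODEL_CLONE_VM)) != MODEL_CLONE_VM)
		return 0;
	if (!stack_size)
		return -EINVAL;
	return 1;		/* model only: allocation would follow */
}
```

For a fork-style call with clone_flags == 0 and stack_size == 0 on a task
that has a shadow stack, check_v27() returns -EINVAL while
check_earlier() returns 0, which matches the boot failure reported above.
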
> Thanks,
> John
>
> > +
> > + if (!shstk->size)
> > + return 0;
> > +
> > + /*
> > + * For CLONE_VM, except vfork, the child needs a separate shadow
> > + * stack.
> > + */
> > + if ((clone_flags & (CLONE_VFORK | CLONE_VM)) != CLONE_VM)
> > + return 0;
> > +
> > + state = get_xsave_addr(&tsk->thread.fpu.state.xsave, XFEATURE_CET_USER);
> > + if (!state)
> > + return -EINVAL;
> > +
> > + /*
> > + * Compat-mode pthreads share a limited address space.
> > + * If each function call takes an average of four slots
> > + * stack space, allocate 1/4 of stack size for shadow stack.
> > + */
> > + if (in_compat_syscall())
> > + stack_size /= 4;
> > +
> > + stack_size = round_up(stack_size, PAGE_SIZE);
> > + addr = alloc_shstk(stack_size);
> > + if (IS_ERR_VALUE(addr)) {
> > + shstk->base = 0;
> > + shstk->size = 0;
> > + return PTR_ERR((void *)addr);
> > + }
> > +
> > + fpu__prepare_write(&tsk->thread.fpu);
> > + state->user_ssp = (u64)(addr + stack_size);
> > + shstk->base = addr;
> > + shstk->size = stack_size;
> > + return 0;
> > +}
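
For reference, the sizing arithmetic in the hunk above can be modelled in
userspace as below. This is only a sketch: model_shstk_size,
model_initial_ssp and MODEL_PAGE_SIZE are illustrative names that do not
exist in the patch, and the 4 KiB page size is an assumption.

```c
#include <stdbool.h>

#define MODEL_PAGE_SIZE 4096UL	/* assumption: 4 KiB pages */

/*
 * Size of the thread shadow stack for a given stack_size: a quarter of
 * it for compat tasks, then rounded up to a whole number of pages.
 */
unsigned long model_shstk_size(unsigned long stack_size, bool compat)
{
	if (compat)
		stack_size /= 4;

	/* Round up; valid because the page size is a power of two. */
	return (stack_size + MODEL_PAGE_SIZE - 1) & ~(MODEL_PAGE_SIZE - 1);
}

/* The initial shadow stack pointer is the top of the mapping. */
unsigned long model_initial_ssp(unsigned long base, unsigned long size)
{
	return base + size;
}
```

With an 8 MiB stack_size this gives an 8 MiB shadow stack for a native
task and 2 MiB for a compat task. A stack_size of 0 stays 0 after
rounding.
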
> > +
> > void shstk_free(struct task_struct *tsk)
> > {
> > struct thread_shstk *shstk = &tsk->thread.shstk;
> > @@ -80,7 +127,13 @@ void shstk_free(struct task_struct *tsk)
> > !shstk->base)
> > return;
> >
> > - if (!tsk->mm)
> > + /*
> > + * When fork() with CLONE_VM fails, the child (tsk) already has a
> > + * shadow stack allocated, and exit_thread() calls this function to
> > + * free it. In this case the parent (current) and the child share
> > + * the same mm struct.
> > + */
> > + if (!tsk->mm || tsk->mm != current->mm)
> > return;
> >
> > while (1) {
--
H.J.