[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <202003281520.A9BFF461@keescook>
Date: Sat, 28 Mar 2020 15:26:42 -0700
From: Kees Cook <keescook@...omium.org>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: Elena Reshetova <elena.reshetova@...el.com>, x86@...nel.org,
Andy Lutomirski <luto@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Catalin Marinas <catalin.marinas@....com>,
Will Deacon <will@...nel.org>,
Mark Rutland <mark.rutland@....com>,
Alexander Potapenko <glider@...gle.com>,
Ard Biesheuvel <ard.biesheuvel@...aro.org>,
Jann Horn <jannh@...gle.com>,
"Perla, Enrico" <enrico.perla@...el.com>,
kernel-hardening@...ts.openwall.com,
linux-arm-kernel@...ts.infradead.org, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 4/5] x86/entry: Enable random_kstack_offset support
On Tue, Mar 24, 2020 at 01:32:30PM -0700, Kees Cook wrote:
> Allow for a randomized stack offset on a per-syscall basis, with roughly
> 5 bits of entropy.
>
> Signed-off-by: Kees Cook <keescook@...omium.org>
> ---
> arch/x86/Kconfig | 1 +
> arch/x86/entry/common.c | 12 +++++++++++-
> 2 files changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index beea77046f9b..b9d449581eb6 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -150,6 +150,7 @@ config X86
> select HAVE_ARCH_TRANSPARENT_HUGEPAGE
> select HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD if X86_64
> select HAVE_ARCH_VMAP_STACK if X86_64
> + select HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET
> select HAVE_ARCH_WITHIN_STACK_FRAMES
> select HAVE_ASM_MODVERSIONS
> select HAVE_CMPXCHG_DOUBLE
> diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
> index 9747876980b5..086d7af570af 100644
> --- a/arch/x86/entry/common.c
> +++ b/arch/x86/entry/common.c
> @@ -26,6 +26,7 @@
> #include <linux/livepatch.h>
> #include <linux/syscalls.h>
> #include <linux/uaccess.h>
> +#include <linux/randomize_kstack.h>
>
> #include <asm/desc.h>
> #include <asm/traps.h>
> @@ -189,6 +190,13 @@ __visible inline void prepare_exit_to_usermode(struct pt_regs *regs)
> lockdep_assert_irqs_disabled();
> lockdep_sys_exit();
>
> + /*
> + * x86_64 stack alignment means 3 bits are ignored, so keep
> + * the top 5 bits. x86_32 needs only 2 bits of alignment, so
> + * the top 6 bits will be used.
> + */
> + choose_random_kstack_offset(rdtsc() & 0xFF);
> +
> cached_flags = READ_ONCE(ti->flags);
>
> if (unlikely(cached_flags & EXIT_TO_USERMODE_LOOP_FLAGS))
> @@ -283,6 +291,7 @@ __visible void do_syscall_64(unsigned long nr, struct pt_regs *regs)
> {
> struct thread_info *ti;
>
> + add_random_kstack_offset();
> enter_from_user_mode();
> local_irq_enable();
> ti = current_thread_info();
So, I got an email from 0day that this caused a performance regression[1]
with things _turned off_. On closer examination:
Before (objdump -dS vmlinux):
__visible void do_syscall_64(unsigned long nr, struct pt_regs *regs)
{
ffffffff81003920: 41 54 push %r12
ffffffff81003922: 55 push %rbp
ffffffff81003923: 48 89 f5 mov %rsi,%rbp
ffffffff81003926: 53 push %rbx
ffffffff81003927: 48 89 fb mov %rdi,%rbx
struct thread_info *ti;
enter_from_user_mode();
local_irq_enable();
...
After:
__visible void do_syscall_64(unsigned long nr, struct pt_regs *regs)
{
ffffffff81003960: 55 push %rbp
ffffffff81003961: 48 89 e5 mov %rsp,%rbp
ffffffff81003964: 41 55 push %r13
ffffffff81003966: 41 54 push %r12
ffffffff81003968: 49 89 f4 mov %rsi,%r12
ffffffff8100396b: 53 push %rbx
ffffffff8100396c: 48 89 fb mov %rdi,%rbx
ffffffff8100396f: 48 83 ec 08 sub $0x8,%rsp
ffffffff81003973: 65 48 8b 04 25 28 00 mov %gs:0x28,%rax
ffffffff8100397a: 00 00
ffffffff8100397c: 48 89 45 e0 mov %rax,-0x20(%rbp)
ffffffff81003980: 31 c0 xor %eax,%eax
asm_volatile_goto("1:"
ffffffff81003982: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
struct thread_info *ti;
add_random_kstack_offset();
enter_from_user_mode();
local_irq_enable();
The "nopl" there is the static_branch code that I'd expect. However, the
preample is quite different. *drum roll* Anyone else recognize %gs:0x28?
That's the stack canary. :P It seems that GCC views this as an array:
char *ptr = __builtin_alloca(offset & 0x3FF);
asm volatile("" : "=m"(*ptr));
because it's locally allocated on the stack. *face palm*
I'll go figure out a way to fix this...
-Kees
[1] https://lore.kernel.org/lkml/202003281505.0F481D3@keescook/
--
Kees Cook
Powered by blists - more mailing lists