lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 4 Feb 2021 17:46:49 +0100
From:   Borislav Petkov <bp@...en8.de>
To:     Lai Jiangshan <jiangshanlai@...il.com>
Cc:     linux-kernel@...r.kernel.org,
        Lai Jiangshan <laijs@...ux.alibaba.com>,
        Andy Lutomirski <luto@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, x86@...nel.org,
        "H. Peter Anvin" <hpa@...or.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Al Viro <viro@...iv.linux.org.uk>,
        Vincenzo Frascino <vincenzo.frascino@....com>,
        Joerg Roedel <jroedel@...e.de>,
        Ricardo Neri <ricardo.neri-calderon@...ux.intel.com>,
        Reinette Chatre <reinette.chatre@...el.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Gabriel Krisman Bertazi <krisman@...labora.com>,
        Kees Cook <keescook@...omium.org>,
        Frederic Weisbecker <frederic@...nel.org>,
        Jens Axboe <axboe@...nel.dk>,
        Arvind Sankar <nivedita@...m.mit.edu>,
        Brian Gerst <brgerst@...il.com>,
        Ard Biesheuvel <ardb@...nel.org>,
        Andi Kleen <ak@...ux.intel.com>,
        Mike Rapoport <rppt@...nel.org>, Mike Hommey <mh@...ndium.org>,
        Mark Gross <mgross@...ux.intel.com>,
        Fenghua Yu <fenghua.yu@...el.com>,
        Tony Luck <tony.luck@...el.com>,
        Anthony Steinhauser <asteinhauser@...gle.com>,
        Jay Lang <jaytlang@....edu>,
        "Chang S. Bae" <chang.seok.bae@...el.com>
Subject: Re: [PATCH V3 1/6] x86_64: move cpu_current_top_of_stack out of TSS

> Subject: Re: [PATCH V3 1/6] x86_64: move cpu_current_top_of_stack out of TSS

The tip tree preferred format for patch subject prefixes is
'subsys/component:', e.g. 'x86/apic:', 'x86/mm/fault:', 'sched/fair:',
'genirq/core:'. Please do not use file names or complete file paths as
prefix. 'git log path/to/file' should give you a reasonable hint in most
cases.

The condensed patch description in the subject line should start with a
uppercase letter and should be written in imperative tone.

Fix all your subjects pls.

On Thu, Jan 28, 2021 at 12:32:17AM +0800, Lai Jiangshan wrote:
> From: Lai Jiangshan <laijs@...ux.alibaba.com>
> 
> When X86_BUG_CPU_MELTDOWN & KPTI,

Please write that out properly.

> cpu_current_top_of_stack lives in the
> TSS which is also in the user CR3 and it becomes a coveted fruit.  An
> attacker can fetch the kernel stack top from it and continue next steps
> of actions based on the kernel stack.
> 
> The address might not be very usefull for attacker,

You just said it is a "coveted fruit" and now it is not very useful?!
Which is it?

Also, use spellchecker pls.

> but it is not so
> necessary to be in TSS either.  It is only accessed when CR3 is kernel CR3
> and gs_base is kernel gs_base

You mean when at CPL0?

> which means it can be in any percpu variable.
> 
> The major reason it is in TSS might be performance because it is hot in

"might be"?

> cache and tlb since we just access sp2 as the scratch space in syscall.

"TLB"

You can use the comment text directly as it is more precise:

"entry_SYSCALL_64 uses it as scratch space to stash the user RSP value."

> 
> So we can move it to a percpu variable near other hot percpu variables,

Who's "we" ?

Also, from Documentation/process/submitting-patches.rst:

 "Describe your changes in imperative mood, e.g. "make xyzzy do frotz"
  instead of "[This patch] makes xyzzy do frotz" or "[I] changed xyzzy
  to do frotz", as if you are giving orders to the codebase to change
  its behaviour."

> such as current_task, __preempt_count, and they are in the same
> cache line.
> 
> tools/testing/selftests/seccomp/seccomp_benchmark desn't show any
> performance lost in "getpid native" result.  And actually, the result
> changes from 93ns before patch to 92ns after patch when !KPTI, and the
> test is very stable although the test desn't show a higher degree of
> precision but enough to know it doesn't cause degression for the test.

This paragraph belongs ...

> Signed-off-by: Lai Jiangshan <laijs@...ux.alibaba.com>
> ---

....

<--- here.

>  arch/x86/include/asm/processor.h   | 10 ----------
>  arch/x86/include/asm/switch_to.h   |  7 +------
>  arch/x86/include/asm/thread_info.h |  6 ------
>  arch/x86/kernel/cpu/common.c       |  3 +++
>  arch/x86/kernel/process.c          |  7 +------
>  arch/x86/mm/pti.c                  |  7 +++----
>  6 files changed, 8 insertions(+), 32 deletions(-)
> 
> diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
> index c20a52b5534b..886d32da1318 100644
> --- a/arch/x86/include/asm/processor.h
> +++ b/arch/x86/include/asm/processor.h
> @@ -314,11 +314,6 @@ struct x86_hw_tss {
>  struct x86_hw_tss {
>  	u32			reserved1;
>  	u64			sp0;
> -
> -	/*
> -	 * We store cpu_current_top_of_stack in sp1 so it's always accessible.
> -	 * Linux does not use ring 1, so sp1 is not otherwise needed.
> -	 */
>  	u64			sp1;
>  
>  	/*
> @@ -428,12 +423,7 @@ struct irq_stack {
>  
>  DECLARE_PER_CPU(struct irq_stack *, hardirq_stack_ptr);
>  
> -#ifdef CONFIG_X86_32
>  DECLARE_PER_CPU(unsigned long, cpu_current_top_of_stack);
> -#else
> -/* The RO copy can't be accessed with this_cpu_xyz(), so use the RW copy. */
> -#define cpu_current_top_of_stack cpu_tss_rw.x86_tss.sp1
> -#endif
>  
>  #ifdef CONFIG_X86_64
>  struct fixed_percpu_data {
> diff --git a/arch/x86/include/asm/switch_to.h b/arch/x86/include/asm/switch_to.h
> index 9f69cc497f4b..b5f0d2ff47e4 100644
> --- a/arch/x86/include/asm/switch_to.h
> +++ b/arch/x86/include/asm/switch_to.h
> @@ -71,12 +71,7 @@ static inline void update_task_stack(struct task_struct *task)
>  	else
>  		this_cpu_write(cpu_tss_rw.x86_tss.sp1, task->thread.sp0);
>  #else
> -	/*
> -	 * x86-64 updates x86_tss.sp1 via cpu_current_top_of_stack. That
> -	 * doesn't work on x86-32 because sp1 and
> -	 * cpu_current_top_of_stack have different values (because of
> -	 * the non-zero stack-padding on 32bit).
> -	 */
> +	/* Xen PV enters the kernel on the thread stack. */

What's that comment here for?

>  	if (static_cpu_has(X86_FEATURE_XENPV))
>  		load_sp0(task_top_of_stack(task));
>  #endif
> diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
> index 0d751d5da702..3dc93d8df425 100644
> --- a/arch/x86/include/asm/thread_info.h
> +++ b/arch/x86/include/asm/thread_info.h
> @@ -197,12 +197,6 @@ static inline int arch_within_stack_frames(const void * const stack,
>  #endif
>  }
>  
> -#else /* !__ASSEMBLY__ */
> -
> -#ifdef CONFIG_X86_64
> -# define cpu_current_top_of_stack (cpu_tss_rw + TSS_sp1)
> -#endif
> -
>  #endif
>  
>  #ifdef CONFIG_COMPAT
> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> index 35ad8480c464..f3d7fd7e9684 100644
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -1745,6 +1745,9 @@ DEFINE_PER_CPU(unsigned int, irq_count) __visible = -1;
>  DEFINE_PER_CPU(int, __preempt_count) = INIT_PREEMPT_COUNT;
>  EXPORT_PER_CPU_SYMBOL(__preempt_count);
>  
> +DEFINE_PER_CPU(unsigned long, cpu_current_top_of_stack) = TOP_OF_INIT_STACK;
> +EXPORT_PER_CPU_SYMBOL(cpu_current_top_of_stack);

... _GPL, of course.

Or lemme rephrase: who's going to cry if this export is _GPL?

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ