lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <YnpOJX4SNmFvCogw@FVFF77S0Q05N>
Date:   Tue, 10 May 2022 12:36:05 +0100
From:   Mark Rutland <mark.rutland@....com>
To:     Alexander Popov <alex.popov@...ux.com>
Cc:     linux-arm-kernel@...ts.infradead.org, akpm@...ux-foundation.org,
        catalin.marinas@....com, keescook@...omium.org,
        linux-kernel@...r.kernel.org, luto@...nel.org, will@...nel.org
Subject: Re: [PATCH v2 01/13] arm64: stackleak: fix current_top_of_stack()

On Sun, May 08, 2022 at 08:24:38PM +0300, Alexander Popov wrote:
> Hi Mark!
> 
> On 27.04.2022 20:31, Mark Rutland wrote:
> > Due to some historical confusion, arm64's current_top_of_stack() isn't
> > what the stackleak code expects. This could in theory result in a number
> > of problems, and practically results in an unnecessary performance hit.
> > We can avoid this by aligning the arm64 implementation with the x86
> > implementation.
> > 
> > The arm64 implementation of current_top_of_stack() was added
> > specifically for stackleak in commit:
> > 
> >    0b3e336601b82c6a ("arm64: Add support for STACKLEAK gcc plugin")
> > 
> > This was intended to be equivalent to the x86 implementation, but the
> > implementation, semantics, and performance characteristics differ
> > wildly:
> > 
> > * On x86, current_top_of_stack() returns the top of the current task's
> >    task stack, regardless of which stack is in active use.
> > 
> >    The implementation accesses a percpu variable which the x86 entry code
> >    maintains, and returns the location immediately above the pt_regs on
> >    the task stack (above which x86 has some padding).
> > 
> > * On arm64 current_top_of_stack() returns the top of the stack in active
> >    use (i.e. the one which is currently being used).
> > 
> >    The implementation checks the SP against a number of
> >    potentially-accessible stacks, and will BUG() if no stack is found.
> 
> As I could understand, for arm64, calling stackleak_erase() not from the
> thread stack would bring troubles because current_top_of_stack() would
> return an unexpected address from a foreign stack. Is this correct?

Yes.

> But this bug doesn't happen because arm64 always calls stackleak_erase()
> from the current thread stack. Right?

Yes.

> > The core stackleak_erase() code determines the upper bound of stack to
> > erase with:
> > 
> > | if (on_thread_stack())
> > |         boundary = current_stack_pointer;
> > | else
> > |         boundary = current_top_of_stack();
> > 
> > On arm64 stackleak_erase() is always called on a task stack, and
> > on_thread_stack() should always be true. On x86, stackleak_erase() is
> > mostly called on a trampoline stack, and is sometimes called on a task
> > stack.
> > 
> > Currently, this results in a lot of unnecessary code being generated for
> > arm64 for the impossible !on_thread_stack() case. Some of this is
> > inlined, bloating stackleak_erase(), while portions of this are left
> > out-of-line and permitted to be instrumented (which would be a
> > functional problem if that code were reachable).
> 
> Sorry, I didn't understand this part about instrumentation. Could you
> elaborate please?

Portions of the code are regular .text, and are subject to instrumentation by
KASAN/UBSAN/KCOV, ftrace, etc, where the compiler will (implicitly) insert
calls to out-of-line instrumentation callbacks. Some (but not all) of those are
disabled in the Makefile. For example, ftrace instrumentation is possible.

Generally, the instrumentation callbacks expect to run with a full kernel
environment (e.g. with RCU watching, IRQ tracing state correct), but at the
time stackleak_erase() is called, this is not the case. so those could go wrong
and corrupt state.

Additionally, since those calls are added implicitly by the compiler, they can
manipulate state at arbitrary points throughout the function where we might not
expect it (e.g. changing current->lowest_stack).

The general stance is that we should use noinstr to disable instrumentation,
and anything that needs to be inlined into noinstr needs to be marked with
__always_inline (which is guaranteed to either inline or cause a compile-time
error if it is not possible to inline).

This patch reworks things to avoid the potential problems; as per the commit
message I don't beleive anything goes wrong in practice today.

Thanks,
Mark.

> > As a first step towards improving this, this patch aligns arm64's
> > implementation of current_top_of_stack() with x86's, always returning
> > the top of the current task's stack. With GCC 11.1.0 this results in the
> > bulk of the unnecessary code being removed, including all of the
> > out-of-line instrumentable code.
> > 
> > While I don't believe there's a functional problem in practice I've
> > marked this as a fix since the semantic was clearly wrong, the fix
> > itself is simple, and other code might rely upon this in future.
> > 
> > Fixes: 0b3e336601b82c6a ("arm64: Add support for STACKLEAK gcc plugin")
> > Signed-off-by: Mark Rutland <mark.rutland@....com>
> > Cc: Alexander Popov <alex.popov@...ux.com>
> > Cc: Andrew Morton <akpm@...ux-foundation.org>
> > Cc: Andy Lutomirski <luto@...nel.org>
> > Cc: Catalin Marinas <catalin.marinas@....com>
> > Cc: Kees Cook <keescook@...omium.org>
> > Cc: Will Deacon <will@...nel.org>
> > ---
> >   arch/arm64/include/asm/processor.h | 10 ++++------
> >   1 file changed, 4 insertions(+), 6 deletions(-)
> > 
> > diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
> > index 73e38d9a540ce..6b1a12c23fe77 100644
> > --- a/arch/arm64/include/asm/processor.h
> > +++ b/arch/arm64/include/asm/processor.h
> > @@ -381,12 +381,10 @@ long get_tagged_addr_ctrl(struct task_struct *task);
> >    * of header definitions for the use of task_stack_page.
> >    */
> > -#define current_top_of_stack()								\
> > -({											\
> > -	struct stack_info _info;							\
> > -	BUG_ON(!on_accessible_stack(current, current_stack_pointer, 1, &_info));	\
> > -	_info.high;									\
> > -})
> > +/*
> > + * The top of the current task's task stack
> > + */
> > +#define current_top_of_stack()	((unsigned long)current->stack + THREAD_SIZE)
> >   #define on_thread_stack()	(on_task_stack(current, current_stack_pointer, 1, NULL))
> >   #endif /* __ASSEMBLY__ */
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ