lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Wed, 17 Apr 2019 09:52:43 +0800
From:   He Zhe <zhe.he@...driver.com>
To:     Steven Rostedt <rostedt@...dmis.org>,
        LKML <linux-kernel@...r.kernel.org>
CC:     Peter Zijlstra <peterz@...radead.org>,
        "H. Peter Anvin" <hpa@...or.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...nel.org>, Borislav Petkov <bp@...en8.de>,
        Andy Lutomirski <luto@...capital.net>,
        Joel Fernandes <joel@...lfernandes.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: [RFC][PATCH] tracing/x86: Save CR2 before tracing irqsoff on
 error_entry

>From the last replies in the thread, it seems some work is going on. May I ask
when we can possibly roughly have a fix or workaround?

Thanks,
Zhe


On 3/21/19 10:15 AM, Steven Rostedt wrote:
> From: Steven Rostedt (VMware) <rostedt@...dmis.org>
>
> He Zhe reported a crash by enabling trace events and selecting
> "userstacktrace" which will read the stack of userspace for every trace
> event recorded. Zhe narrowed it down to:
>
>  c3bc8fd637a9 ("tracing: Centralize preemptirq tracepoints and unify their usage")
>
> With the problem config, I was able to also reproduce the error. I
> narrowed it down to just having to do the following:
>
>  # cd /sys/kernel/tracing
>  # echo 1 > options/userstacktrace
>  # echo 1 > events/preemptirq/irq_disable/enable
>
> And sure enough, I triggered a crash. Well, it was systemd crashing
> with a bad memory access??
>
>  systemd-journal[537]: segfault at ed8cb8 ip 00007f7fffc9fef5 sp 00007ffc4062cb10 error 7
>
> And it would crash similarly each time I tried it, but always at a
> different place. After spending the day on this, I finally figured it
> out. The bug is happening in entry_64.S right after error_entry.
> There's two TRACE_IRQS_OFF in that code path, which if I comment out,
> the bug goes away. Then it dawned on me that the crash always happens
> when systemd does a normal page fault. We had this bug before, and it
> was with the exception trace points.
>
> The issue is that a tracepoint can fault (reading vmalloc or whatever).
> And doing a userspace stack trace most definitely will fault. But if we
> are coming from a legitimate page fault, the address of that fault (in
> the CR2 register) will be lost if we fault before we get to the page
> fault handler. That's exactly what is happening.
>
> To solve this, a TRACE_IRQS_OFF_CR2 (and ON for consistency) was added
> that saves the CR2 register. A new trace_hardirqs_off_thunk_cr2 is
> created that stores the cr2 register, calls the
> trace_hardirqs_off_caller, then on return restores the cr2 register if
> it changed, before returning.
>
> On my tests this fixes the issue. I just want to know if this is a
> legitimate fix or if someone can come up with a better fix?
>
> Note: this also saves the exception context just like the
> do_page_fault() function does.
>
> Note2: This only gets enabled when lockdep or irq tracing is enabled,
> which is not recommended for production environments.
>
> Link: http://lkml.kernel.org/r/897cf5cf-fc24-8a64-cb28-847f2d2e63d2@windriver.com
>
> Fixes: c3bc8fd637a9 ("tracing: Centralize preemptirq tracepoints and unify their usage")
> Signed-off-by: Steven Rostedt (VMware) <rostedt@...dmis.org>
> ---
> diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
> index 7bc105f47d21..7edffec328e2 100644
> --- a/arch/x86/entry/common.c
> +++ b/arch/x86/entry/common.c
> @@ -292,6 +292,32 @@ __visible void do_syscall_64(unsigned long nr, struct pt_regs *regs)
>  
>  	syscall_return_slowpath(regs);
>  }
> +
> +extern void trace_hardirqs_on_caller(unsigned long caller_addr);
> +__visible void trace_hardirqs_on_caller_cr2(unsigned long caller_addr)
> +{
> +	unsigned long address = read_cr2(); /* Get the faulting address */
> +	enum ctx_state prev_state;
> +
> +	prev_state = exception_enter();
> +	trace_hardirqs_on_caller(caller_addr);
> +	if (address != read_cr2())
> +		write_cr2(address);
> +	exception_exit(prev_state);
> +}
> +
> +extern void trace_hardirqs_off_caller(unsigned long caller_addr);
> +__visible void trace_hardirqs_off_caller_cr2(unsigned long caller_addr)
> +{
> +	unsigned long address = read_cr2(); /* Get the faulting address */
> +	enum ctx_state prev_state;
> +
> +	prev_state = exception_enter();
> +	trace_hardirqs_off_caller(caller_addr);
> +	if (address != read_cr2())
> +		write_cr2(address);
> +	exception_exit(prev_state);
> +}
>  #endif
>  
>  #if defined(CONFIG_X86_32) || defined(CONFIG_IA32_EMULATION)
> diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
> index 1f0efdb7b629..73ddf24fed3e 100644
> --- a/arch/x86/entry/entry_64.S
> +++ b/arch/x86/entry/entry_64.S
> @@ -1248,12 +1248,12 @@ ENTRY(error_entry)
>  	 * we fix gsbase, and we should do it before enter_from_user_mode
>  	 * (which can take locks).
>  	 */
> -	TRACE_IRQS_OFF
> +	TRACE_IRQS_OFF_CR2
>  	CALL_enter_from_user_mode
>  	ret
>  
>  .Lerror_entry_done:
> -	TRACE_IRQS_OFF
> +	TRACE_IRQS_OFF_CR2
>  	ret
>  
>  	/*
> diff --git a/arch/x86/entry/thunk_64.S b/arch/x86/entry/thunk_64.S
> index be36bf4e0957..1300b53b98cb 100644
> --- a/arch/x86/entry/thunk_64.S
> +++ b/arch/x86/entry/thunk_64.S
> @@ -41,6 +41,8 @@
>  #ifdef CONFIG_TRACE_IRQFLAGS
>  	THUNK trace_hardirqs_on_thunk,trace_hardirqs_on_caller,1
>  	THUNK trace_hardirqs_off_thunk,trace_hardirqs_off_caller,1
> +	THUNK trace_hardirqs_on_thunk_cr2,trace_hardirqs_on_caller_cr2,1
> +	THUNK trace_hardirqs_off_thunk_cr2,trace_hardirqs_off_caller_cr2,1
>  #endif
>  
>  #ifdef CONFIG_DEBUG_LOCK_ALLOC
> diff --git a/arch/x86/include/asm/irqflags.h b/arch/x86/include/asm/irqflags.h
> index 058e40fed167..dd511742ca2e 100644
> --- a/arch/x86/include/asm/irqflags.h
> +++ b/arch/x86/include/asm/irqflags.h
> @@ -172,9 +172,13 @@ static inline int arch_irqs_disabled(void)
>  #ifdef CONFIG_TRACE_IRQFLAGS
>  #  define TRACE_IRQS_ON		call trace_hardirqs_on_thunk;
>  #  define TRACE_IRQS_OFF	call trace_hardirqs_off_thunk;
> +#  define TRACE_IRQS_ON_CR2	call trace_hardirqs_on_thunk_cr2;
> +#  define TRACE_IRQS_OFF_CR2	call trace_hardirqs_off_thunk_cr2;
>  #else
>  #  define TRACE_IRQS_ON
>  #  define TRACE_IRQS_OFF
> +#  define TRACE_IRQS_ON_CR2
> +#  define TRACE_IRQS_OFF_CR2
>  #endif
>  #ifdef CONFIG_DEBUG_LOCK_ALLOC
>  #  ifdef CONFIG_X86_64
>

Powered by blists - more mailing lists