Message-ID: <20180525144927.GE22082@lerouge>
Date:   Fri, 25 May 2018 16:49:28 +0200
From:   Frederic Weisbecker <frederic@...nel.org>
To:     Jiri Olsa <jolsa@...hat.com>
Cc:     Probir Roy <proy@...il.wm.edu>, linux-perf-users@...r.kernel.org,
        namhyung@...nel.org, alexander.shishkin@...ux.intel.com,
        acme@...nel.org, mingo@...hat.com, peterz@...radead.org,
        Andrew Lutomirski <amluto@...il.com>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: Perf record of mem event on kernel data address causing freeze

On Thu, May 17, 2018 at 04:38:52PM +0200, Jiri Olsa wrote:
> On Fri, May 11, 2018 at 02:23:14PM -0400, Probir Roy wrote:
> > I am using perf-tool to record memory access to some kernel addresses.
> > For some kernel addresses it freezes/lockup the system.
> > 
> > I am using kernel version 4.15.0 on x86_64 arch. I am running on an
> > Intel Broadwell machine.
> > 
> > I am using Intel's PEBS to sample kernel memory access while running a
> > micro-benchmark (performs repeated file operation) using following
> > command.
> > 
> > $ sudo perf mem -t store record
> > 
> > This records memory references. After that I run a script to set HW
> > breakpoint at the reference addresses.
> > 
> > $ sudo timeout 1s perf record -e mem:<0xaddress>:rw
> > 
> > It causes system hang at some address (for many address perf reports
> > correctly). Nothing is written in kern.log
> > 
> > 
> > I have reported it on bugzilla with detail system information:
> > https://bugzilla.kernel.org/show_bug.cgi?id=199697
> 
> I managed to reproduce.. in my case it's caused by having rw
> breakpoint on data which is touched within do_debug routine,
> and after few nested do_debug I get double fault
> 
> for example I can reproduce it immediately when setting breakpoint
> on rdtp->dynticks_nmi_nesting, which is checked in rcu_nmi_enter
> 
> I have some ugly patch so far that disables breakpoints during
> do_debug processing.. it seems to fix it on my server, could you
> try that?
> 
> thanks,
> jirka
> 
> 
> ---
> diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
> index 03f3d7695dac..14d41d59abeb 100644
> --- a/arch/x86/kernel/traps.c
> +++ b/arch/x86/kernel/traps.c
> @@ -721,9 +721,12 @@ dotraplinkage void do_debug(struct pt_regs *regs, long error_code)
>  {
>  	struct task_struct *tsk = current;
>  	int user_icebp = 0;
> -	unsigned long dr6;
> +	unsigned long dr6, dr7;
>  	int si_code;
>  
> +	get_debugreg(dr7, 7);
> +	set_debugreg(0, 7);
> +
>  	ist_enter(regs);
>  
>  	get_debugreg(dr6, 6);
> @@ -818,6 +821,7 @@ dotraplinkage void do_debug(struct pt_regs *regs, long error_code)
>  
>  exit:
>  	ist_exit(regs);
> +	set_debugreg(dr7, 7);
>  }
>  NOKPROBE_SYMBOL(do_debug);

I'm not sure how much we touch dr7 while in the do_debug() trap, so we may be leaking
some modifications on exit.
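
One reading of that concern, as a toy user-space model: if anything on the handler
path changes dr7 between the save and the restore, the restore silently overwrites
the change. The variable model_dr7 and the function below are hypothetical stand-ins,
not the kernel's get_debugreg()/set_debugreg() interface.

```c
/* Toy model of DR7 as a plain variable; not the real register interface. */
static unsigned long model_dr7;

/*
 * Mimics the save/disable/restore pattern from the quoted patch.
 * new_bits stands for a breakpoint that some code on the handler
 * path enables while the handler is running (an assumption made
 * purely for illustration).
 */
static unsigned long handler_with_save_restore(unsigned long new_bits)
{
	unsigned long saved = model_dr7;	/* like get_debugreg(dr7, 7) */

	model_dr7 = 0;				/* like set_debugreg(0, 7) */

	/* Something in the handler path modifies dr7 in the meantime. */
	model_dr7 |= new_bits;

	model_dr7 = saved;			/* like set_debugreg(dr7, 7) */

	/* new_bits is gone: the restore wrote back the stale snapshot. */
	return model_dr7;
}
```

Whether anything under do_debug() really modifies dr7 is exactly the open question
here; the model only shows what would go wrong if it does.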

I'm thinking of a simple do_debug() recursion protection. The problem is where to store
the recursion flag/counter. Ideally the protection would sit before ist_enter(), which
already touches a lot of key data (preempt_mask, rcu_data). But if we set it before
ist_enter(), the recursion flag has to be per task, because preemption is only disabled
inside ist_enter(), although the comments suggest it's unsafe to schedule before that
anyway. So it could be a TIF flag. Better yet, if we want to be able to set breakpoints
on the thread flags themselves, we could add a new field to thread_info.

Anyway, here is a very dumb version below. Probir, can you test it to see if it's at
least the right direction?

diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 03f3d76..873383b 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -693,6 +693,8 @@ static bool is_sysenter_singlestep(struct pt_regs *regs)
 #endif
 }
 
+static DEFINE_PER_CPU(int, do_debug_recursion);
+
 /*
  * Our handling of the processor debug registers is non-trivial.
  * We do not clear them on entry and exit from the kernel. Therefore
@@ -725,6 +727,10 @@ dotraplinkage void do_debug(struct pt_regs *regs, long error_code)
 	int si_code;
 
 	ist_enter(regs);
+	if (__this_cpu_read(do_debug_recursion))
+		goto exit;
+
+	__this_cpu_write(do_debug_recursion, 1);
 
 	get_debugreg(dr6, 6);
 	/*
@@ -817,6 +823,7 @@ dotraplinkage void do_debug(struct pt_regs *regs, long error_code)
 	debug_stack_usage_dec();
 
 exit:
+	__this_cpu_write(do_debug_recursion, 0);
 	ist_exit(regs);
 }
 NOKPROBE_SYMBOL(do_debug);
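
For reference, the guard logic can be modelled in plain user-space C. This is only a
sketch: do_debug_recursion is an ordinary global standing in for the per-CPU variable,
and model_do_debug(), handled, skipped and the touches parameter are hypothetical names
used for illustration. One difference from the patch as posted: there, the nested path
jumps to the exit label, which also writes 0 to the flag, so a second nested trap
within the same outer handler would no longer be caught. The model returns on nested
entry without touching the flag.

```c
/* Stand-in for the per-CPU do_debug_recursion variable in the patch. */
static int do_debug_recursion;

static int handled;	/* traps that ran the full handler body */
static int skipped;	/* nested traps that bailed out early */

/*
 * Models do_debug(). touches is the number of times the handler body
 * hits watched data, each hit re-entering the handler (an assumption
 * made to simulate a breakpoint on data used by the handler itself).
 */
static void model_do_debug(int touches)
{
	int i;

	if (do_debug_recursion) {
		/* Nested entry: leave the flag alone, the outer call owns it. */
		skipped++;
		return;
	}
	do_debug_recursion = 1;

	for (i = 0; i < touches; i++)
		model_do_debug(0);	/* watched data hit inside the handler */

	handled++;
	do_debug_recursion = 0;	/* only the outermost invocation clears it */
}
```

In the real handler the nested path would still need to call ist_exit() to balance
ist_enter(), so there it would need its own exit label rather than a bare return.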
