lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <56D5BCE6.3010300@mellanox.com>
Date:	Tue, 1 Mar 2016 11:01:42 -0500
From:	Chris Metcalf <cmetcalf@...lanox.com>
To:	Daniel Thompson <daniel.thompson@...aro.org>
CC:	Russell King <linux@....linux.org.uk>,
	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...hat.com>, Andrew Morton <akpm@...l.org>,
	<linux-kernel@...r.kernel.org>, Aaron Tomlin <atomlin@...hat.com>,
	Peter Zijlstra <peterz@...radead.org>,
	"Rafael J. Wysocki" <rjw@...ysocki.net>,
	Daniel Lezcano <daniel.lezcano@...aro.org>
Subject: Re: [PATCH 2/4] nmi_backtrace: generate one-line reports for idle
 cpus

(+PeterZ, Rafael, and Daniel Lezcano for cpuidle and scheduler)

On 03/01/2016 09:23 AM, Daniel Thompson wrote:
> On 29/02/16 21:40, Chris Metcalf wrote:
>> When doing an nmi backtrace of many cores, and most of them are idle,
>> the output is a little overwhelming and very uninformative. Suppress
>> messages for cpus that are idling when they are interrupted and
>> just emit one line, "NMI backtrace for N skipped: idle".
>
> I can see this makes the logs more attractive but this is code for 
> emergency situations.
>
> The idle task is responsible for certain power management activities. 
> How can you be sure the system is wedged because of bugs in that code?

It's a fair point, but as core count increases, you really run the risk 
of losing the valuable data in a sea of data that isn't.  For example, 
for the architecture I maintain, we have the TILE-Gx72, which is a 
72-core chip.  If each core's register dump and backtrace is 40 lines, 
we're up to around 3,000 lines of console output. Filtering that down by 
a factor of 10x or more (if only a handful of cores happen to be active, 
which is not uncommon) is a substantial usability improvement.

That said, it's true that the original solution I offered (examining 
just is_idle_task() plus interrupt nesting) is imprecise.  It is 
relatively straightforward to add a bit of per-cpu state that is set at 
the same moment we currently do stop/start_critical_timings(), which 
would indicate much more specifically that the cpu was running the 
idling code itself, and not anything more complex.  In that case if the 
flag was set, you would know you were either sitting on a 
processor-specific idle instruction in arch_cpu_idle(), or else polling 
one or two memory locations in a tight loop in cpu_idle_poll(), which 
presumably would offer sufficient precision to feel safe.

Here's an alternative version of the patch which incorporates this 
idea.  Do you think this is preferable?  Thanks!

commit 5b6dca9bad908ae66fa764025c4e6046a6cc0262
Author: Chris Metcalf <cmetcalf@...hip.com>
Date:   Mon Feb 29 11:56:32 2016 -0500

     nmi_backtrace: generate one-line reports for idle cpus
     
     When doing an nmi backtrace of many cores, and most of them are idle,
     the output is a little overwhelming and very uninformative.  Suppress
     messages for cpus that are idling when they are interrupted and
     just emit one line, "NMI backtrace for N skipped: idle".
     
     Signed-off-by: Chris Metcalf <cmetcalf@...hip.com>

diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h
index 786ad32631a6..b8c3c4cf88ad 100644
--- a/include/linux/cpuidle.h
+++ b/include/linux/cpuidle.h
@@ -206,6 +206,7 @@ static inline int cpuidle_enter_freeze(struct cpuidle_driver *drv,
  /* kernel/sched/idle.c */
  extern void sched_idle_set_state(struct cpuidle_state *idle_state);
  extern void default_idle_call(void);
+extern bool in_cpu_idle(void);
  
  #ifdef CONFIG_ARCH_NEEDS_CPU_IDLE_COUPLED
  void cpuidle_coupled_parallel_barrier(struct cpuidle_device *dev, atomic_t *a);
diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
index 544a7133cbd1..9aff315f278b 100644
--- a/kernel/sched/idle.c
+++ b/kernel/sched/idle.c
@@ -52,15 +52,25 @@ static int __init cpu_idle_nopoll_setup(char *__unused)
  __setup("hlt", cpu_idle_nopoll_setup);
  #endif
  
+static DEFINE_PER_CPU(bool, cpu_idling);
+
+/* Was the cpu was in the low-level idle code when interrupted? */
+bool in_cpu_idle(void)
+{
+	return this_cpu_read(cpu_idling);
+}
+
  static inline int cpu_idle_poll(void)
  {
  	rcu_idle_enter();
  	trace_cpu_idle_rcuidle(0, smp_processor_id());
  	local_irq_enable();
  	stop_critical_timings();
+	this_cpu_write(cpu_idling, true);
  	while (!tif_need_resched() &&
  		(cpu_idle_force_poll || tick_check_broadcast_expired()))
  		cpu_relax();
+	this_cpu_write(cpu_idling, false);
  	start_critical_timings();
  	trace_cpu_idle_rcuidle(PWR_EVENT_EXIT, smp_processor_id());
  	rcu_idle_exit();
@@ -89,7 +99,9 @@ void default_idle_call(void)
  		local_irq_enable();
  	} else {
  		stop_critical_timings();
+		this_cpu_write(cpu_idling, true);
  		arch_cpu_idle();
+		this_cpu_write(cpu_idling, false);
  		start_critical_timings();
  	}
  }
diff --git a/lib/nmi_backtrace.c b/lib/nmi_backtrace.c
index db63ac75eba0..75b5eacaa5d3 100644
--- a/lib/nmi_backtrace.c
+++ b/lib/nmi_backtrace.c
@@ -17,6 +17,7 @@
  #include <linux/kprobes.h>
  #include <linux/nmi.h>
  #include <linux/seq_buf.h>
+#include <linux/cpuidle.h>
  
  #ifdef arch_trigger_cpumask_backtrace
  /* For reliability, we're prepared to waste bits here. */
@@ -151,11 +152,16 @@ bool nmi_cpu_backtrace(struct pt_regs *regs)
  
  		/* Replace printk to write into the NMI seq */
  		this_cpu_write(printk_func, nmi_vprintk);
-		pr_warn("NMI backtrace for cpu %d\n", cpu);
-		if (regs)
-			show_regs(regs);
-		else
-			dump_stack();
+		if (in_cpu_idle()) {
+			pr_warn("NMI backtrace for cpu %d skipped: idle\n",
+				cpu);
+		} else {
+			pr_warn("NMI backtrace for cpu %d\n", cpu);
+			if (regs)
+				show_regs(regs);
+			else
+				dump_stack();
+		}
  		this_cpu_write(printk_func, printk_func_save);
  
  		cpumask_clear_cpu(cpu, to_cpumask(backtrace_mask));

-- 
Chris Metcalf, Mellanox Technologies
http://www.mellanox.com

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ