Message-ID: <56DD3B3D.1010404@linaro.org>
Date: Mon, 7 Mar 2016 15:26:37 +0700
From: Daniel Thompson <daniel.thompson@...aro.org>
To: Chris Metcalf <cmetcalf@...lanox.com>
Cc: Russell King <linux@....linux.org.uk>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Andrew Morton <akpm@...l.org>,
linux-kernel@...r.kernel.org, Aaron Tomlin <atomlin@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
"Rafael J. Wysocki" <rjw@...ysocki.net>,
Daniel Lezcano <daniel.lezcano@...aro.org>
Subject: Re: [PATCH 2/4] nmi_backtrace: generate one-line reports for idle cpus
On 01/03/16 23:01, Chris Metcalf wrote:
> (+PeterZ, Rafael, and Daniel Lezcano for cpuidle and scheduler)
>
> On 03/01/2016 09:23 AM, Daniel Thompson wrote:
>> On 29/02/16 21:40, Chris Metcalf wrote:
>>> When doing an nmi backtrace of many cores, and most of them are idle,
>>> the output is a little overwhelming and very uninformative. Suppress
>>> messages for cpus that are idling when they are interrupted and
>>> just emit one line, "NMI backtrace for N skipped: idle".
>>
>> I can see this makes the logs more attractive but this is code for
>> emergency situations.
>>
>> The idle task is responsible for certain power management activities.
>> How can you be sure the system is wedged because of bugs in that code?
>
> It's a fair point, but as core count increases, you really run the risk
> of losing the valuable data in a sea of data that isn't. For example,
> for the architecture I maintain, we have the TILE-Gx72, which is a
> 72-core chip. If each core's register dump and backtrace is 40 lines,
> we're up to around 3,000 lines of console output. Filtering that down by
> a factor of 10x or more (if only a handful of cores happen to be active,
> which is not uncommon) is a substantial usability improvement.
No objections to your use case. The output feels very verbose even with
"only" eight cores.
> That said, it's true that the original solution I offered (examining
> just is_idle_task() plus interrupt nesting) is imprecise. It is
> relatively straightforward to add a bit of per-cpu state that is set at
> the same moment we currently do stop/start_critical_timings(), which
> would indicate much more specifically that the cpu was running the
> idling code itself, and not anything more complex. In that case if the
> flag was set, you would know you were either sitting on a
> processor-specific idle instruction in arch_cpu_idle(), or else polling
> one or two memory locations in a tight loop in cpu_idle_poll(), which
> presumably would offer sufficient precision to feel safe.
>
> Here's an alternative version of the patch which incorporates this
> idea. Do you think this is preferable? Thanks!
I prefer the approach taken by the new patch although I think the
implementation might be buggy...
> commit 5b6dca9bad908ae66fa764025c4e6046a6cc0262
> Author: Chris Metcalf <cmetcalf@...hip.com>
> Date: Mon Feb 29 11:56:32 2016 -0500
>
> nmi_backtrace: generate one-line reports for idle cpus
> When doing an nmi backtrace of many cores, and most of them are idle,
> the output is a little overwhelming and very uninformative. Suppress
> messages for cpus that are idling when they are interrupted and
> just emit one line, "NMI backtrace for N skipped: idle".
> Signed-off-by: Chris Metcalf <cmetcalf@...hip.com>
>
> diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h
> index 786ad32631a6..b8c3c4cf88ad 100644
> --- a/include/linux/cpuidle.h
> +++ b/include/linux/cpuidle.h
> @@ -206,6 +206,7 @@ static inline int cpuidle_enter_freeze(struct cpuidle_driver *drv,
> /* kernel/sched/idle.c */
> extern void sched_idle_set_state(struct cpuidle_state *idle_state);
> extern void default_idle_call(void);
> +extern bool in_cpu_idle(void);
>
> #ifdef CONFIG_ARCH_NEEDS_CPU_IDLE_COUPLED
> void cpuidle_coupled_parallel_barrier(struct cpuidle_device *dev, atomic_t *a);
> diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
> index 544a7133cbd1..9aff315f278b 100644
> --- a/kernel/sched/idle.c
> +++ b/kernel/sched/idle.c
> @@ -52,15 +52,25 @@ static int __init cpu_idle_nopoll_setup(char *__unused)
> __setup("hlt", cpu_idle_nopoll_setup);
> #endif
>
> +static DEFINE_PER_CPU(bool, cpu_idling);
> +
> +/* Was the cpu in the low-level idle code when interrupted? */
> +bool in_cpu_idle(void)
> +{
> + return this_cpu_read(cpu_idling);
I think we still need the code to identify a core that is running an
interrupt handler. Interrupts are not masked while cpu_idling is set,
meaning we can easily take an interrupt before we clear the flag and
end up skipping the backtrace for a cpu that is actually running an
interrupt handler.
> +}
> +
> static inline int cpu_idle_poll(void)
> {
> rcu_idle_enter();
> trace_cpu_idle_rcuidle(0, smp_processor_id());
> local_irq_enable();
> stop_critical_timings();
> + this_cpu_write(cpu_idling, true);
> while (!tif_need_resched() &&
> (cpu_idle_force_poll || tick_check_broadcast_expired()))
> cpu_relax();
> + this_cpu_write(cpu_idling, false);
> start_critical_timings();
> trace_cpu_idle_rcuidle(PWR_EVENT_EXIT, smp_processor_id());
> rcu_idle_exit();
> @@ -89,7 +99,9 @@ void default_idle_call(void)
> local_irq_enable();
> } else {
> stop_critical_timings();
> + this_cpu_write(cpu_idling, true);
> arch_cpu_idle();
> + this_cpu_write(cpu_idling, false);
> start_critical_timings();
> }
> }
> diff --git a/lib/nmi_backtrace.c b/lib/nmi_backtrace.c
> index db63ac75eba0..75b5eacaa5d3 100644
> --- a/lib/nmi_backtrace.c
> +++ b/lib/nmi_backtrace.c
> @@ -17,6 +17,7 @@
> #include <linux/kprobes.h>
> #include <linux/nmi.h>
> #include <linux/seq_buf.h>
> +#include <linux/cpuidle.h>
>
> #ifdef arch_trigger_cpumask_backtrace
> /* For reliability, we're prepared to waste bits here. */
> @@ -151,11 +152,16 @@ bool nmi_cpu_backtrace(struct pt_regs *regs)
>
> /* Replace printk to write into the NMI seq */
> this_cpu_write(printk_func, nmi_vprintk);
> - pr_warn("NMI backtrace for cpu %d\n", cpu);
> - if (regs)
> - show_regs(regs);
> - else
> - dump_stack();
> + if (in_cpu_idle()) {
> + pr_warn("NMI backtrace for cpu %d skipped: idle\n",
> + cpu);
> + } else {
> + pr_warn("NMI backtrace for cpu %d\n", cpu);
> + if (regs)
> + show_regs(regs);
> + else
> + dump_stack();
> + }
> this_cpu_write(printk_func, printk_func_save);
>
> cpumask_clear_cpu(cpu, to_cpumask(backtrace_mask));
>