Message-Id: <20121119161917.229c6d6f.akpm@linux-foundation.org>
Date: Mon, 19 Nov 2012 16:19:17 -0800
From: Andrew Morton <akpm@...ux-foundation.org>
To: Sasha Levin <sasha.levin@...cle.com>
Cc: mingo@...nel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] watchdog: Inject NMI when locked up and going to panic
On Sat, 17 Nov 2012 19:28:53 -0500
Sasha Levin <sasha.levin@...cle.com> wrote:
> Send an NMI to all CPUs when a lockup is detected and the lockup
> watchdog code is configured to panic. This gives us a fairly up-to-date
> snapshot of all CPUs in the system.
>
> This lets us get a stack trace of all CPUs, which makes life easier
> when trying to debug a deadlock, and the NMI doesn't change anything
> since the next step is a kernel panic.
>
nit: I'll rename this to "watchdog: trigger all-cpu backtrace when
locked up and going to panic". We don't know how the arch implements
trigger_all_cpu_backtrace() at this level!
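
To illustrate why the caller can't assume an NMI, here is a minimal userspace sketch, assuming a generic wrapper that forwards to an optional per-architecture hook. The names `ARCH_HAS_BACKTRACE_HOOK`, `arch_backtrace_hook()` and `demo_trigger_all_cpu_backtrace()` are hypothetical and are not the kernel's actual definitions.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical per-arch hook. An architecture might implement this with
 * an NMI, an ordinary IPI, or something else; the caller cannot tell.
 */
#ifdef ARCH_HAS_BACKTRACE_HOOK
void arch_backtrace_hook(void)
{
	puts("arch hook: asking CPUs to dump their stacks");
}
#endif

/*
 * Generic wrapper (illustrative only). Returns true if a backtrace was
 * requested, false if this build has no arch hook to call.
 */
bool demo_trigger_all_cpu_backtrace(void)
{
#ifdef ARCH_HAS_BACKTRACE_HOOK
	arch_backtrace_hook();
	return true;
#else
	return false;
#endif
}
```

Built without `-DARCH_HAS_BACKTRACE_HOOK`, the wrapper does nothing and reports false, so a changelog written at the watchdog level can only promise "a backtrace was requested", not how it is delivered.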
> --- a/kernel/watchdog.c
> +++ b/kernel/watchdog.c
> @@ -239,10 +239,12 @@ static void watchdog_overflow_callback(struct perf_event *event,
>  		if (__this_cpu_read(hard_watchdog_warn) == true)
>  			return;
>
> -		if (hardlockup_panic)
> +		if (hardlockup_panic) {
> +			trigger_all_cpu_backtrace();
>  			panic("Watchdog detected hard LOCKUP on cpu %d", this_cpu);
> -		else
> +		} else {
>  			WARN(1, "Watchdog detected hard LOCKUP on cpu %d", this_cpu);
> +		}
>
>  		__this_cpu_write(hard_watchdog_warn, true);
>  		return;
> @@ -323,8 +325,10 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
>  		else
>  			dump_stack();
>
> -		if (softlockup_panic)
> +		if (softlockup_panic) {
> +			trigger_all_cpu_backtrace();
>  			panic("softlockup: hung tasks");
> +		}
>  		__this_cpu_write(soft_watchdog_warn, true);
>  	} else
>  		__this_cpu_write(soft_watchdog_warn, false);
The change seems sensible, but I wonder about CONFIG_SMP=n machines.
Will they end up getting the same backtrace displayed twice?

(I don't remember whether trigger_all_cpu_backtrace() is really
trigger_all_other_cpu_backtrace() and we didn't document it).
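
The double-backtrace question can be put in numbers with a small userspace model. This is a sketch under two stated assumptions, neither confirmed here for any architecture: the backtrace request covers every online CPU, optionally skipping the caller, and the panic path then dumps the caller's stack once more. `simulate_lockup_report()` and `struct dump_count` are hypothetical names.

```c
/* Stack dumps produced by one simulated lockup report. */
struct dump_count {
	int self;	/* dumps of the detecting CPU's own stack */
	int others;	/* dumps of every other CPU's stack */
};

/*
 * Model only: "include_self" selects between an "all CPUs" request and
 * an "all other CPUs" request. The extra dump from the panic path is an
 * assumption of this sketch, not a statement about panic().
 */
struct dump_count simulate_lockup_report(int nr_cpus, int include_self)
{
	struct dump_count d = { 0, 0 };
	int this_cpu = 0;	/* CPU that detected the lockup */
	int cpu;

	for (cpu = 0; cpu < nr_cpus; cpu++) {
		if (cpu != this_cpu)
			d.others++;
		else if (include_self)
			d.self++;
	}

	d.self++;		/* assumed dump from the panic path */
	return d;
}
```

With `nr_cpus = 1` and `include_self = 1` the model gives two dumps of the same stack and none of any other CPU, which is the duplicate being asked about; with `include_self = 0` it gives one.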