Message-ID: <20080622122845.GA10133@damson.getinternet.no>
Date:	Sun, 22 Jun 2008 14:28:45 +0200
From:	Vegard Nossum <vegard.nossum@...il.com>
To:	a.p.zijlstra@...llo.nl, arjan@...ux.intel.com
Cc:	linux-kernel@...r.kernel.org
Subject: [PATCH] softirq softlockup debugging

Hi,

I'm debugging a problem with a softirq that gets stuck for a long time,
so I wrote this patch to help find out what's going wrong.

I actually think it can be useful in general as well; see for example
http://www.kerneloops.org/search.php?search=__do_softirq&btnG=Function+Search

...and these cases are virtually impossible to debug since we don't know
anything about *what* it was that got stuck. (The NMI watchdog could
help, though.)

The patch is #ifdef-ugly, I know... Suggestions are welcome.
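
One common way to cut down on call-site #ifdefs, given here only as a
rough user-space sketch and not as what the patch below does, is to hide
the option behind static inline helpers that turn into no-ops when it is
disabled. SOFTIRQ_DEBUG_SKETCH, set_last_action(), run_one() and
handler() are made-up names for illustration; only
get_last_softirq_action() exists in the patch.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*action_fn)(void);

/* SOFTIRQ_DEBUG_SKETCH is a made-up macro standing in for
 * CONFIG_SOFTLOCKUP_SOFTIRQ_DEBUG. */
#ifdef SOFTIRQ_DEBUG_SKETCH
static action_fn last_action;

static inline void set_last_action(action_fn fn) { last_action = fn; }
static inline action_fn get_last_action(void) { return last_action; }
#else
/* Option off: the helpers do nothing and report "no handler". */
static inline void set_last_action(action_fn fn) { (void)fn; }
static inline action_fn get_last_action(void) { return NULL; }
#endif

static int ran;
static void handler(void) { ran++; }

/* The call site needs no #ifdef of its own. */
static void run_one(action_fn fn)
{
	assert(fn != NULL);
	set_last_action(fn);
	fn();
}
```

With the macro undefined, run_one(handler) still runs the handler and
get_last_action() simply returns NULL, so the only #ifdef left is the
one around the helper definitions.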


Vegard


From: Vegard Nossum <vegard.nossum@...il.com>
Date: Sun, 22 Jun 2008 14:12:31 +0200
Subject: [PATCH] softirq softlockup debugging

From the Kconfig: If a softlockup happens in a softirq, the softlockup
stack trace is utterly unhelpful; it will only show the stack up to
__do_softirq(), since this is where interrupts are reenabled.

This patch records, per CPU, the softirq handler that __do_softirq()
most recently invoked, and adds a line to the softlockup report which
contains the symbol name of that handler.

Signed-off-by: Vegard Nossum <vegard.nossum@...il.com>
---
 include/linux/interrupt.h |    3 +++
 kernel/softirq.c          |   13 +++++++++++++
 kernel/softlockup.c       |    6 ++++++
 lib/Kconfig.debug         |   10 ++++++++++
 4 files changed, 32 insertions(+), 0 deletions(-)
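
For readers who want the idea without the kernel context, here is a
minimal user-space sketch of the mechanism: store the handler in a
per-CPU slot just before calling it, so that a watchdog can later say
which handler ran last. It is an illustration under assumptions, not
kernel code. NR_CPUS, the handler table, run_pending() and
report_last_action() are hypothetical, a plain array stands in for the
per-CPU variable, and a name table stands in for print_symbol().

```c
#include <assert.h>
#include <stdio.h>

#define NR_CPUS     4   /* hypothetical CPU count for this sketch */
#define NR_HANDLERS 3

typedef void (*action_fn)(void);

struct handler {
	const char *name;   /* stands in for the kallsyms lookup */
	action_fn action;
};

static int calls;
static void net_rx(void)  { calls++; }
static void timer(void)   { calls++; }
static void tasklet(void) { calls++; }

static const struct handler handlers[NR_HANDLERS] = {
	{ "net_rx", net_rx }, { "timer", timer }, { "tasklet", tasklet },
};

/* One slot per CPU, standing in for the per-CPU last_softirq_action. */
static action_fn last_action[NR_CPUS];

/* Same shape as the loop in __do_softirq(): record the handler, then call it. */
static void run_pending(int cpu, unsigned int pending)
{
	const struct handler *h = handlers;

	assert(cpu >= 0 && cpu < NR_CPUS);
	pending &= (1u << NR_HANDLERS) - 1;   /* ignore bits with no handler */

	while (pending) {
		if (pending & 1) {
			last_action[cpu] = h->action;
			h->action();
		}
		h++;
		pending >>= 1;
	}
}

static action_fn get_last_action(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return last_action[cpu];
}

/* What the watchdog side would print; returns the name it reported. */
static const char *report_last_action(int cpu)
{
	action_fn fn = get_last_action(cpu);
	int i;

	for (i = 0; i < NR_HANDLERS; i++) {
		if (handlers[i].action == fn) {
			printf("Last softirq was %s\n", handlers[i].name);
			return handlers[i].name;
		}
	}
	printf("Last softirq was (none)\n");
	return NULL;
}
```

Calling run_pending(1, 0x5) runs net_rx and then tasklet on "CPU" 1, so
a later report_last_action(1) names tasklet. As in the patch, the slot
is never cleared once a handler returns, so the report names the most
recent handler even when that handler is not the one that is stuck.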

diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index f1fc747..97d47cf 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -296,6 +296,9 @@ extern void softirq_init(void);
 extern void raise_softirq_irqoff(unsigned int nr);
 extern void raise_softirq(unsigned int nr);
 
+#ifdef CONFIG_SOFTLOCKUP_SOFTIRQ_DEBUG
+extern void *get_last_softirq_action(int cpu);
+#endif
 
 /* Tasklets --- multithreaded analogue of BHs.
 
diff --git a/kernel/softirq.c b/kernel/softirq.c
index 36e0617..b49899a 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -196,6 +196,15 @@ void local_bh_enable_ip(unsigned long ip)
 }
 EXPORT_SYMBOL(local_bh_enable_ip);
 
+#ifdef CONFIG_SOFTLOCKUP_SOFTIRQ_DEBUG
+static DEFINE_PER_CPU(void *, last_softirq_action);
+
+void *get_last_softirq_action(int cpu)
+{
+	return per_cpu(last_softirq_action, cpu);
+}
+#endif
+
 /*
  * We restart softirq processing MAX_SOFTIRQ_RESTART times,
  * and we fall back to softirqd after that.
@@ -231,6 +240,10 @@ restart:
 
 	do {
 		if (pending & 1) {
+#ifdef CONFIG_SOFTLOCKUP_SOFTIRQ_DEBUG
+			per_cpu(last_softirq_action, cpu) = h->action;
+#endif
+
 			h->action(h);
 			rcu_bh_qsctr_inc(cpu);
 		}
diff --git a/kernel/softlockup.c b/kernel/softlockup.c
index c828c23..2bf4fa1 100644
--- a/kernel/softlockup.c
+++ b/kernel/softlockup.c
@@ -10,8 +10,10 @@
 #include <linux/cpu.h>
 #include <linux/nmi.h>
 #include <linux/init.h>
+#include <linux/interrupt.h>
 #include <linux/delay.h>
 #include <linux/freezer.h>
+#include <linux/kallsyms.h>
 #include <linux/kthread.h>
 #include <linux/notifier.h>
 #include <linux/module.h>
@@ -120,6 +122,10 @@ void softlockup_tick(void)
 	printk(KERN_ERR "BUG: soft lockup - CPU#%d stuck for %lus! [%s:%d]\n",
 			this_cpu, now - touch_timestamp,
 			current->comm, task_pid_nr(current));
+#ifdef CONFIG_SOFTLOCKUP_SOFTIRQ_DEBUG
+	print_symbol(KERN_ERR "Last softirq was %s\n",
+		(unsigned long) get_last_softirq_action(this_cpu));
+#endif
 	if (regs)
 		show_regs(regs);
 	else
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index d2099f4..19a7dfc 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -159,6 +159,16 @@ config DETECT_SOFTLOCKUP
 	   can be detected via the NMI-watchdog, on platforms that
 	   support it.)
 
+config SOFTLOCKUP_SOFTIRQ_DEBUG
+	bool "Debug softirq lockups"
+	depends on DETECT_SOFTLOCKUP
+	default n
+	help
+	  If a softlockup happens in a softirq, the softlockup
+	  stack trace is utterly unhelpful; it will only show the
+	  stack up to __do_softirq(), since this is where interrupts
+	  are reenabled.
+
 config SCHED_DEBUG
 	bool "Collect scheduler debugging info"
 	depends on DEBUG_KERNEL && PROC_FS
-- 
1.5.4.1
