[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <tip-3003eba313dd0e0502dd71548c36fe7c19801ce5@git.kernel.org>
Date: Fri, 22 Apr 2011 12:19:37 GMT
From: tip-bot for Steven Rostedt <srostedt@...hat.com>
To: linux-tip-commits@...r.kernel.org
Cc: linux-kernel@...r.kernel.org, hpa@...or.com, mingo@...hat.com,
torvalds@...ux-foundation.org, a.p.zijlstra@...llo.nl,
fweisbec@...il.com, akpm@...ux-foundation.org, rostedt@...dmis.org,
srostedt@...hat.com, tglx@...utronix.de, mingo@...e.hu
Subject: [tip:core/locking] lockdep: Print a nicer description for irq lock inversions
Commit-ID: 3003eba313dd0e0502dd71548c36fe7c19801ce5
Gitweb: http://git.kernel.org/tip/3003eba313dd0e0502dd71548c36fe7c19801ce5
Author: Steven Rostedt <srostedt@...hat.com>
AuthorDate: Wed, 20 Apr 2011 21:41:54 -0400
Committer: Ingo Molnar <mingo@...e.hu>
CommitDate: Fri, 22 Apr 2011 11:06:57 +0200
lockdep: Print a nicer description for irq lock inversions
Locking order inversion due to interrupts is a subtle problem.
When an irq lockiinversion discovered by lockdep it currently
reports something like:
[ INFO: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected ]
... and then prints out the locks that are involved, as back traces.
Judging by lkml feedback developers were routinely confused by what
a HARDIRQ->safe to unsafe issue is all about, and sometimes even
blew it off as a bug in lockdep.
It is not obvious when lockdep prints this message about a lock that
is never taken in interrupt context.
After explaining the problems that lockdep is reporting, I
decided to add a description of the problem in visual form. Now
the following is shown:
---
other info that might help us debug this:
Possible interrupt unsafe locking scenario:
CPU0 CPU1
---- ----
lock(lockA);
local_irq_disable();
lock(&rq->lock);
lock(lockA);
<Interrupt>
lock(&rq->lock);
*** DEADLOCK ***
---
The above is the case when the unsafe lock is taken while
holding a lock taken in irq context. But when a lock is taken
that also grabs a unsafe lock, the call chain is shown:
---
other info that might help us debug this:
Chain exists of:
&rq->lock --> lockA --> lockC
Possible interrupt unsafe locking scenario:
CPU0 CPU1
---- ----
lock(lockC);
local_irq_disable();
lock(&rq->lock);
lock(lockA);
<Interrupt>
lock(&rq->lock);
*** DEADLOCK ***
Signed-off-by: Steven Rostedt <rostedt@...dmis.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@...llo.nl>
Cc: Frederic Weisbecker <fweisbec@...il.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Andrew Morton <akpm@...ux-foundation.org>
Link: http://lkml.kernel.org/r/20110421014259.132728798@goodmis.org
Signed-off-by: Ingo Molnar <mingo@...e.hu>
---
kernel/lockdep.c | 70 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 70 insertions(+), 0 deletions(-)
diff --git a/kernel/lockdep.c b/kernel/lockdep.c
index 53a6895..7b2ffee 100644
--- a/kernel/lockdep.c
+++ b/kernel/lockdep.c
@@ -490,6 +490,18 @@ void get_usage_chars(struct lock_class *class, char usage[LOCK_USAGE_CHARS])
usage[i] = '\0';
}
+static int __print_lock_name(struct lock_class *class)
+{
+ char str[KSYM_NAME_LEN];
+ const char *name;
+
+ name = class->name;
+ if (!name)
+ name = __get_key_name(class->key, str);
+
+ return printk("%s", name);
+}
+
static void print_lock_name(struct lock_class *class)
{
char str[KSYM_NAME_LEN], usage[LOCK_USAGE_CHARS];
@@ -1325,6 +1337,62 @@ print_shortest_lock_dependencies(struct lock_list *leaf,
return;
}
+static void
+print_irq_lock_scenario(struct lock_list *safe_entry,
+ struct lock_list *unsafe_entry,
+ struct held_lock *prev,
+ struct held_lock *next)
+{
+ struct lock_class *safe_class = safe_entry->class;
+ struct lock_class *unsafe_class = unsafe_entry->class;
+ struct lock_class *middle_class = hlock_class(prev);
+
+ if (middle_class == safe_class)
+ middle_class = hlock_class(next);
+
+ /*
+ * A direct locking problem where unsafe_class lock is taken
+ * directly by safe_class lock, then all we need to show
+ * is the deadlock scenario, as it is obvious that the
+ * unsafe lock is taken under the safe lock.
+ *
+ * But if there is a chain instead, where the safe lock takes
+ * an intermediate lock (middle_class) where this lock is
+ * not the same as the safe lock, then the lock chain is
+ * used to describe the problem. Otherwise we would need
+ * to show a different CPU case for each link in the chain
+ * from the safe_class lock to the unsafe_class lock.
+ */
+ if (middle_class != unsafe_class) {
+ printk("Chain exists of:\n ");
+ __print_lock_name(safe_class);
+ printk(" --> ");
+ __print_lock_name(middle_class);
+ printk(" --> ");
+ __print_lock_name(unsafe_class);
+ printk("\n\n");
+ }
+
+ printk(" Possible interrupt unsafe locking scenario:\n\n");
+ printk(" CPU0 CPU1\n");
+ printk(" ---- ----\n");
+ printk(" lock(");
+ __print_lock_name(unsafe_class);
+ printk(");\n");
+ printk(" local_irq_disable();\n");
+ printk(" lock(");
+ __print_lock_name(safe_class);
+ printk(");\n");
+ printk(" lock(");
+ __print_lock_name(middle_class);
+ printk(");\n");
+ printk(" <Interrupt>\n");
+ printk(" lock(");
+ __print_lock_name(safe_class);
+ printk(");\n");
+ printk("\n *** DEADLOCK ***\n\n");
+}
+
static int
print_bad_irq_dependency(struct task_struct *curr,
struct lock_list *prev_root,
@@ -1376,6 +1444,8 @@ print_bad_irq_dependency(struct task_struct *curr,
print_stack_trace(forwards_entry->class->usage_traces + bit2, 1);
printk("\nother info that might help us debug this:\n\n");
+ print_irq_lock_scenario(backwards_entry, forwards_entry, prev, next);
+
lockdep_print_held_locks(curr);
printk("\nthe dependencies between %s-irq-safe lock", irqclass);
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists