Message-Id: <20090819091820.d55e3353.akpm@linux-foundation.org>
Date: Wed, 19 Aug 2009 09:18:20 -0700
From: Andrew Morton <akpm@...ux-foundation.org>
To: Steven Rostedt <rostedt@...dmis.org>
Cc: LKML <linux-kernel@...r.kernel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...e.hu>
Subject: Re: [BUG] lockup with the latest kernel
On Wed, 19 Aug 2009 11:49:25 -0400 (EDT) Steven Rostedt <rostedt@...dmis.org> wrote:
> Always happens where one CPU is sending an IPI and the other has the rq
> spinlock. Seems to be that the IPI expects the other CPU to not have
> interrupts disabled or something?
>
> Note, I've seen this on 2.6.30-rc6 as well (yes that's 2.6.30). But this
> does not happen on 2.6.29. Unfortunately, 2.6.29 makes my NIC go kaputt
> for some reason.
>
> I've enabled LOCKDEP and it just makes the bug trigger easier.
>
> Anyway, anyone have any ideas?
We'd need to see the backtrace on the target CPU.
It shouldn't be too hard - set that CPU's bit in
arch/x86/kernel/apic/nmi.c:backtrace_mask and then clear it again when
that CPU has responded.
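
As a rough userspace model of that set-then-clear handshake (not kernel code: every `model_*` name here is made up for illustration, and letting the target clear its own bit from the NMI tick is an assumption about where the clearing happens), something like:

```c
/* Userspace model only: none of these names are the kernel's own API. */
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Stand-in for backtrace_mask: one bit per CPU that still owes a backtrace. */
static atomic_uint model_backtrace_mask;

/* Requester side: ask `cpu` to dump its stack on its next NMI tick. */
static void model_request_backtrace(unsigned int cpu)
{
	atomic_fetch_or(&model_backtrace_mask, 1u << cpu);
}

/* Requester side: true while `cpu` has not yet responded. */
static bool model_backtrace_pending(unsigned int cpu)
{
	return (atomic_load(&model_backtrace_mask) & (1u << cpu)) != 0;
}

/*
 * Target side: called from the (modelled) NMI tick on `cpu`.
 * Returns true if a backtrace was requested and has now been answered.
 */
static bool model_nmi_tick(unsigned int cpu)
{
	unsigned int bit = 1u << cpu;

	if (!(atomic_load(&model_backtrace_mask) & bit))
		return false;

	/* The real handler would dump registers and the stack here. */
	printf("NMI backtrace for cpu %u\n", cpu);

	/* Clear the bit so the requester can see this CPU has responded. */
	atomic_fetch_and(&model_backtrace_mask, ~bit);
	return true;
}
```

The requester would call model_request_backtrace() for the stuck CPU just before sending the IPI and then poll model_backtrace_pending() until the bit drops. NMIs still get through when the target has interrupts disabled, which is why this route can show where it is spinning.
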
Or even:
diff -puN arch/x86/kernel/apic/nmi.c~a arch/x86/kernel/apic/nmi.c
--- a/arch/x86/kernel/apic/nmi.c~a
+++ a/arch/x86/kernel/apic/nmi.c
@@ -387,6 +387,8 @@ void touch_nmi_watchdog(void)
 }
 EXPORT_SYMBOL(touch_nmi_watchdog);
 
+extern int wizzle;
+
 notrace __kprobes int
 nmi_watchdog_tick(struct pt_regs *regs, unsigned reason)
 {
@@ -415,7 +417,8 @@ nmi_watchdog_tick(struct pt_regs *regs,
 	}
 
 	/* We can be called before check_nmi_watchdog, hence NULL check. */
-	if (backtrace_mask != NULL && cpumask_test_cpu(cpu, backtrace_mask)) {
+	if (cpu == wizzle ||
+	    (backtrace_mask != NULL && cpumask_test_cpu(cpu, backtrace_mask))) {
 		static DEFINE_SPINLOCK(lock);	/* Serialise the printks */
 
 		spin_lock(&lock);
diff -puN arch/x86/kernel/smp.c~a arch/x86/kernel/smp.c
--- a/arch/x86/kernel/smp.c~a
+++ a/arch/x86/kernel/smp.c
@@ -111,13 +111,17 @@
  *	it goes straight through and wastes no time serializing
  *	anything. Worst case is that we lose a reschedule ...
  */
+int wizzle = -1;
+
 static void native_smp_send_reschedule(int cpu)
 {
 	if (unlikely(cpu_is_offline(cpu))) {
 		WARN_ON(1);
 		return;
 	}
+	wizzle = cpu;
 	apic->send_IPI_mask(cpumask_of(cpu), RESCHEDULE_VECTOR);
+	wizzle = -1;
 }
 
 void native_send_call_func_single_ipi(int cpu)
_