netdev - Re: bisect results of MSI-X related panic (help!)

Message-ID: <4AD3E875.5040800@kernel.org>
Date:	Tue, 13 Oct 2009 11:39:49 +0900
From:	Tejun Heo <tj@...nel.org>
To:	"Brandeburg, Jesse" <jesse.brandeburg@...el.com>
CC:	Jesse Brandeburg <jesse.brandeburg@...il.com>,
	Frans Pop <elendil@...net.nl>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
	Ingo Molnar <mingo@...e.hu>, "hpa@...or.com" <hpa@...or.com>
Subject: Re: bisect results of MSI-X related panic (help!)

Brandeburg, Jesse wrote:
> On Mon, 12 Oct 2009, Tejun Heo wrote:
>>> any other debugging tricks/ideas?
>> Hmm... stackprotector adds a considerable amount of stack usage, and
>> it could be that you're seeing a stack overflow, which would also
>> explain the random crashes you've been seeing.  Do you have
>> DEBUG_STACKOVERFLOW turned on?  This is on x86_64, right?
> 
> Hi, thanks for your response, 
> 
> [root@...andeb-hc linux-2.6.32-rc1]# grep STACKO .config
> CONFIG_DEBUG_STACKOVERFLOW=y
> 
> [root@...andeb-hc linux-2.6.32-rc1]# grep X86_64 .config
> CONFIG_X86_64=y
> CONFIG_X86_64_SMP=y
> CONFIG_X86_64_ACPI_NUMA=y
> 
> stack size is 8K
> 
> I tried Jarek's suggestion of CPUMASK_OFFSTACK and still panic.
> [66027.266057] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffff810b4eb0
> [66027.266059]
> [66027.266070] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffff81472856
> [66027.266071]
> [66027.266081] Pid: 0, comm: swapper Tainted: G        W  2.6.32-rc2-git-debug #6
> [66027.266086] Call Trace:
> 
> that was all I got.  Interesting double fault, that hadn't happened 
> before.
> 
> the symbols might be off slightly since I rebuilt the kernel, but this was 
> my initial poke at the offsets above in gdb:
> (gdb) l *0xffffffff810b4eb0
> 0xffffffff810b4eb0 is in dynamic_irq_cleanup (kernel/irq/chip.c:86).
> 81              desc->handle_irq = handle_bad_irq;
> 82              desc->chip = &no_irq_chip;
> 83              desc->name = NULL;
> 84              clear_kstat_irqs(desc);
> 85              spin_unlock_irqrestore(&desc->lock, flags);
> 86      }

Can you please apply the following patch and try to retrigger the
panic?

diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index c166019..f5a1482 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -63,6 +63,9 @@ void dynamic_irq_cleanup(unsigned int irq)
 	struct irq_desc *desc = irq_to_desc(irq);
 	unsigned long flags;

+	printk("XXX dynamic_irq_cleanup() called on %u\n", irq);
+	dump_stack();
+
 	if (!desc) {
 		WARN(1, KERN_ERR "Trying to cleanup invalid IRQ%d\n", irq);
 		return;
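
For context on the panic message itself: as far as I understand it, with
stackprotector enabled gcc places a canary value between a function's
locals and its return address and checks it just before the function
returns; on a mismatch the check fails and the kernel panics with the
"Kernel stack is corrupted in:" line.  The address printed should then
lie inside the function whose check failed, which would fit gdb
resolving it to the closing brace of dynamic_irq_cleanup().  The
userspace sketch below only models that check by hand.  struct
demo_frame, DEMO_CANARY and the demo_*() helpers are hypothetical names
made up for illustration; none of this is kernel or gcc code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed canary; a real stack protector uses a random value. */
#define DEMO_CANARY 0x5ca1ab1edeadbeefULL

/* Hypothetical model of a stack frame: locals first, canary right after. */
struct demo_frame {
	char buf[16];
	uint64_t canary;
};

/* Models the function prologue: clear the locals and plant the canary. */
void demo_frame_enter(struct demo_frame *f)
{
	memset(f->buf, 0, sizeof(f->buf));
	f->canary = DEMO_CANARY;
}

/*
 * Models a buggy write into the local buffer: it does not stop at the
 * end of buf, so a long input spills into the canary.  The length is
 * clamped to the size of the modelled frame so that the demo itself
 * never writes outside the struct.
 */
void demo_frame_write(struct demo_frame *f, const void *src, size_t len)
{
	if (len > sizeof(*f))
		len = sizeof(*f);
	memcpy((unsigned char *)f + offsetof(struct demo_frame, buf), src, len);
}

/* Models the epilogue check: 1 if the canary is intact, 0 if corrupted. */
int demo_frame_check(const struct demo_frame *f)
{
	return f->canary == DEMO_CANARY;
}
```

A write that fits in buf leaves demo_frame_check() returning 1; a write
longer than buf overwrites the canary and the check returns 0.  That is
the condition the real check reports, and it fires for any corruption
of the canary, whether it comes from a buffer overrun inside the
function or from something else scribbling over the stack.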

-- 
tejun