linux-kernel - Re: bisect results of MSI-X related panic (help!)

Message-ID: <alpine.WNT.2.00.0910120907170.11804@jbrandeb-mobl2.amr.corp.intel.com>
Date:	Mon, 12 Oct 2009 11:00:33 -0700
From:	"Brandeburg, Jesse" <jesse.brandeburg@...el.com>
To:	Tejun Heo <tj@...nel.org>
CC:	Jesse Brandeburg <jesse.brandeburg@...il.com>,
	Frans Pop <elendil@...net.nl>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
	Ingo Molnar <mingo@...e.hu>, "hpa@...or.com" <hpa@...or.com>
Subject: Re: bisect results of MSI-X related panic (help!)

On Mon, 12 Oct 2009, Tejun Heo wrote:
> > any other debugging tricks/ideas?
> 
> Hmm... stackprotector adds considerable amount of stack usage and it
> could be you're seeing stack overflow which would also explain the
> random crashes you've been seeing.  Do you have DEBUG_STACKOVERFLOW
> turned on?  This is on x86_64, right?

Hi, thanks for your response.

[root@...andeb-hc linux-2.6.32-rc1]# grep STACKO .config
CONFIG_DEBUG_STACKOVERFLOW=y

[root@...andeb-hc linux-2.6.32-rc1]# grep X86_64 .config
CONFIG_X86_64=y
CONFIG_X86_64_SMP=y
CONFIG_X86_64_ACPI_NUMA=y

stack size is 8K

I tried Jarek's suggestion of CPUMASK_OFFSTACK and it still panics:
[66027.266057] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffff810b4eb0
[66027.266059]
[66027.266070] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffff81472856
[66027.266071]
[66027.266081] Pid: 0, comm: swapper Tainted: G        W  2.6.32-rc2-git-debug #6
[66027.266086] Call Trace:

That was all I got.  Interesting double panic; that hadn't happened
before.

The symbols might be slightly off since I rebuilt the kernel, but this was
my initial poke at the offsets above in gdb:
(gdb) l *0xffffffff810b4eb0
0xffffffff810b4eb0 is in dynamic_irq_cleanup (kernel/irq/chip.c:86).
81              desc->handle_irq = handle_bad_irq;
82              desc->chip = &no_irq_chip;
83              desc->name = NULL;
84              clear_kstat_irqs(desc);
85              spin_unlock_irqrestore(&desc->lock, flags);
86      }
87
88
89      /**
90       *      set_irq_chip - set the irq chip for an irq
(gdb) l *0xffffffff8147285
No source file for address 0xffffffff8147285.
(gdb) l *0xffffffff81472856
0xffffffff81472856 is in show_kprobe_addr (kernel/kprobes.c:1306).
1301            struct hlist_head *head;
1302            struct hlist_node *node;
1303            struct kprobe *p, *kp;
1304            const char *sym = NULL;
1305            unsigned int i = *(loff_t *) v;
1306            unsigned long offset = 0;
1307            char *modname, namebuf[128];
1308
1309            head = &kprobe_table[i];
1310            preempt_disable();
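
A minimal sketch, not part of the original message, of how the addresses
reported by the stack-protector panic lines above could be collected and
turned into gdb `l *ADDR` commands like the ones typed by hand.  The
function names and the regular expression are assumptions made for
illustration; the sample log is copied from the panic output quoted in
this message, including its line wrapping.

```python
import re

# Assumption: the address is the 16 hex digits following "is corrupted in:".
# \s+ tolerates the line wrap between "Kernel stack" and "is corrupted in:".
PANIC_RE = re.compile(r"Kernel stack\s+is corrupted in:\s*([0-9a-fA-F]{16})")


def panic_addresses(log_text):
    """Return the reported addresses, de-duplicated, in order of appearance."""
    seen = []
    for match in PANIC_RE.finditer(log_text):
        addr = match.group(1).lower()
        if addr not in seen:
            seen.append(addr)
    return seen


def gdb_list_commands(addresses):
    """Build one gdb 'list' command per address (hypothetical helper)."""
    return ["l *0x" + addr for addr in addresses]


# Sample input copied from the panic output in this message.
LOG = """\
[66027.266057] Kernel panic - not syncing: stack-protector: Kernel stack 
is corrupted in: ffffffff810b4eb0
[66027.266059]
[66027.266070] Kernel panic - not syncing: stack-protector: Kernel stack 
is corrupted in: ffffffff81472856
"""

if __name__ == "__main__":
    for command in gdb_list_commands(panic_addresses(LOG)):
        print(command)
```

Pasting the full 16-digit address this way avoids the truncated lookup
(0xffffffff8147285) seen in the gdb session above.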

