Message-ID: <4AD2E05A.6060700@kernel.org>
Date:	Mon, 12 Oct 2009 16:52:58 +0900
From:	Tejun Heo <tj@...nel.org>
To:	Jesse Brandeburg <jesse.brandeburg@...il.com>
CC:	Frans Pop <elendil@...net.nl>,
	Jesse Brandeburg <jesse.brandeburg@...el.com>,
	linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
	Ingo Molnar <mingo@...e.hu>, hpa@...or.com
Subject: Re: bisect results of MSI-X related panic (help!)

Jesse Brandeburg wrote:
> Kernel stack is corrupted in: ffffffff810b5b31
> 
> I've built with a full debug kernel before this crash, so I did:
> 
> (gdb) l *0xffffffff810b5b31
> 0xffffffff810b5b31 is in move_native_irq (kernel/irq/migration.c:67).
> 62			return;
> 63	
> 64		desc->chip->mask(irq);
> 65		move_masked_irq(irq);
> 66		desc->chip->unmask(irq);
>>>> 67	}
> 68	
> (gdb) l move_native_irq
> 54	void move_native_irq(int irq)
> 55	{
> 56		struct irq_desc *desc = irq_to_desc(irq);
> 57	
> 58		if (likely(!(desc->status & IRQ_MOVE_PENDING)))
> 59			return;
> 60	
> 61		if (unlikely(desc->status & IRQ_DISABLED))
> 62			return;
> 63	
> 64		desc->chip->mask(irq);
> 65		move_masked_irq(irq);
> 66		desc->chip->unmask(irq);
> 67	}
> 
> So this seems closely related to my panic: irqbalance or something
> else may be trying to move my interrupt from one core to another.
> Both the original issue and this one reproduce with LOTS of MSI-X
> vectors active.
> 
> - I tried connecting after the panic with kgdboc, no connection
> - I tried kdump, but the same kernel I am using panics/hangs during
> boot right after udev during the kexec() kernel boot (should I try
> harder to get this working given it got so far?)
> - I have ftrace function tracer running but no way to get at the log
> post panic (wouldn't it be great if the kernel just dumped the ftrace
> log on __stack_chk_fail?)
> 
> any other debugging tricks/ideas?

Hmm... stackprotector adds a considerable amount of stack usage, and it
could be that you're seeing a stack overflow, which would also explain
the random crashes you've been seeing.  Do you have DEBUG_STACKOVERFLOW
turned on?  This is on x86_64, right?

-- 
tejun
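
For reference, the check suggested above could be set up with a .config
fragment along these lines.  This is only a sketch: the option names are
the ones x86 kernels of this period used as far as I know, so treat them
as assumptions and verify them against the Kconfig files in your tree.
DEBUG_STACK_USAGE and turning the stack protector off are not mentioned
in the thread; they are included only as a possible way to test the
stack-usage theory.

```
# Prerequisite for the kernel debugging options below
CONFIG_DEBUG_KERNEL=y
# Warn when the remaining kernel stack gets low (the option asked about above)
CONFIG_DEBUG_STACKOVERFLOW=y
# Report the minimum free stack each task has ever had (assumption: useful here)
CONFIG_DEBUG_STACK_USAGE=y
# Temporarily build without the stack protector to see whether its extra
# stack usage is what triggers the crash
# CONFIG_CC_STACKPROTECTOR is not set
```

If the panic disappears with the stack protector off, or the overflow
warning fires before the crash, that would point towards stack
exhaustion rather than a bug in the IRQ migration path itself.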