linux-kernel - Re: Top 10 kernel oopses for the week ending January 5th, 2008

Message-Id: <20080108081401.d9576ac5.randy.dunlap@oracle.com>
Date:	Tue, 8 Jan 2008 08:14:01 -0800
From:	Randy Dunlap <randy.dunlap@...cle.com>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Kevin Winchester <kjwinchester@...il.com>,
	"J. Bruce Fields" <bfields@...ldses.org>,
	Al Viro <viro@...IV.linux.org.uk>,
	Arjan van de Ven <arjan@...ux.intel.com>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	NetDev <netdev@...r.kernel.org>
Subject: Re: Top 10 kernel oopses for the week ending January 5th, 2008

On Mon, 7 Jan 2008 19:26:12 -0800 (PST) Linus Torvalds wrote:

> On Mon, 7 Jan 2008, Kevin Winchester wrote:
> 
> > J. Bruce Fields wrote:
> > > 
> > > Is there any good basic documentation on this to point people at?
> > 
> > I would second this question.  I see people "decode" oops on lkml often 
> > enough, but I've never been entirely sure how it's done.  Is it somewhere 
> > in Documentation?
> 
> It's actually not necessarily at all that trivial, unless you have a deep 
> understanding of the code generated for the architecture in question (and 
> even then, some oopses take more time to figure out than others, thanks 
> to inlining and tailcalls etc).
> 
> If the oops happened with a kernel you generated yourself, it's usually 
> rather easy. Especially if you said "y" to the "generate debugging info" 
> question at configuration time. Because, in that case, you really just do 
> a simple
> 
> 	gdb vmlinux
> 
> and then you can do (for example) something like setting a breakpoint at 
> the EIP that was reported for the oops, and it will tell you what line it 
> came from.
> 
> However, if you don't have the exact binary - which is the common case for 
> random oopses reported on lkml - you will generally have to disassemble 
> the hex sequence given in the oops (the "Code:" line), and try to match it 
> up against the source code to try to figure out what is going on.
> 
> Even just the disassembly is not entirely trivial, since the oops will 
> give you the EIP that it happened at, but you often want to also 
> disassemble *backwards* in order to get more of a context (the "Code:" 
> line will mark the particular EIP that starts the oopsing instruction by 
> enclosing it in <xx>), but with non-constant instruction lengths, you need 
> to use a bit of trial-and-error to figure it out.
> 
> I usually just compile a small program like
> 
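> 	#include <stdio.h>
> 
> 	/* paste the bytes from the oops "Code:" line here */
> 	const char array[]="\xnn\xnn\xnn...";
> 
> 	int main(int argc, char **argv)
> 	{
> 		printf("%p\n", array);		/* address to disassemble around */
> 		*(volatile int *)0=0;		/* deliberate SIGSEGV so gdb stops */
> 	}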
> 
> and run it under gdb, and then when it gets the SIGSEGV (due to the 
> obvious NULL pointer dereference), I can just ask gdb to disassemble 
> around the array that contains the code[] stuff. Try a few offsets, to see 
> when the disassembly makes sense (and gives the reported EIP as the 
> beginning of one of the disassembled instructions).
> 
> (You can do it other and smarter ways too, I'm not claiming that's a 
> particularly good way to do it, and the old "ksymoops" program used to do 
> a pretty good job of this, but I'm used to that particular idiotic way 
> myself, since it's how I've basically always done it)

One other way to do it (at least for x86-32/64) is to use
$kerneltree/scripts/decodecode.  It may work on other $arches also,
but I haven't tested it on others.
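
For anyone who wants to try the quoted array[] approach by hand, the
tedious part is turning the "Code:" line into "\xnn" escapes and
remembering which byte was marked with <xx>.  Below is a minimal sketch
of that step only.  The names parse_code_line() and format_escapes() are
made up for illustration and are not part of the kernel tree or of
scripts/decodecode, and the sketch assumes the x86-style layout of
space-separated two-digit hex bytes with a single <xx> marker.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: parse the hex bytes of an oops "Code:" line into
 * buf (at most max bytes) and return how many were stored.  *mark gets
 * the index of the byte enclosed in <xx>, i.e. the first byte of the
 * oopsing instruction, or -1 if no byte is marked.
 */
int parse_code_line(const char *line, unsigned char *buf, int max, int *mark)
{
	int n = 0;

	assert(line && buf && mark);
	*mark = -1;
	if (strncmp(line, "Code:", 5) == 0)
		line += 5;			/* skip the label if present */

	while (*line && n < max) {
		int marked = 0;
		char *end;
		unsigned long v;

		while (isspace((unsigned char)*line))
			line++;
		if (*line == '<') {		/* <xx> marks the reported EIP */
			marked = 1;
			line++;
		}
		if (!isxdigit((unsigned char)*line))
			break;			/* end of the hex dump */
		v = strtoul(line, &end, 16);
		if (v > 0xff)
			break;			/* not a single byte, give up */
		if (marked)
			*mark = n;
		buf[n++] = (unsigned char)v;
		line = end;
		if (*line == '>')
			line++;
	}
	return n;
}

/*
 * Hypothetical helper: write the bytes as "\xnn\xnn..." escapes, ready
 * to paste into the array[] initializer.  Returns 0 on success, -1 if
 * out is too small.
 */
int format_escapes(const unsigned char *buf, int n, char *out, size_t outsz)
{
	size_t used = 0;
	int i;

	if (outsz == 0)
		return -1;
	out[0] = '\0';
	for (i = 0; i < n; i++) {
		if (used + 5 > outsz)		/* 4 chars per byte plus NUL */
			return -1;
		used += (size_t)snprintf(out + used, outsz - used,
					 "\\x%02x", buf[i]);
	}
	return 0;
}
```

The index returned in *mark is the offset to add to the printed array
address to get the location that should come out as the start of an
instruction when you try different disassembly offsets in gdb.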

---
~Randy