linux-kernel - Re: [PATCH] dump

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <Pine.LNX.4.58.0710310729130.13523@gandalf.stny.rr.com>
Date:	Wed, 31 Oct 2007 07:39:05 -0400 (EDT)
From:	Steven Rostedt <rostedt@...dmis.org>
To:	Andi Kleen <andi@...stfloor.org>
cc:	Christoph Hellwig <hch@...radead.org>,
	LKML <linux-kernel@...r.kernel.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Ingo Molnar <mingo@...e.hu>, Daniel Walker <dwalker@...sta.com>
Subject: Re: [PATCH] dump_stack on panic

--
On Wed, 31 Oct 2007, Andi Kleen wrote:

> > Thinking about this more. Of the Linux users I asked, they think a kernel
> > panic is a bug in the kernel anyway (or at least something in the kernel
> > went wrong).  So if panics will point users to the kernel anyway, then why
> > leave out possible vital information from those panics that were caused by
> > an actual kernel bug.
>
> On a no root found panic (which are likely the majority of all panics)
> the stack dump is about 100% useless.
>
> I also think it's a very bad idea to add them to machine checks again
> for psychological reasons.
>

I believe that kernel programmers are horrible psychologists.

Doing a quick search yields:

find .  -name "*.c" ! -type d  | xargs grep  "panic(" |wc -l
1143

So we have 1143 uses of panic. I'll grant you that the most common
occurrence is the unable to mount root. But I don't think users are less
likely to blame the kernel when they see a dump or just see a panic. A
panic usually implies (to me anyway) something went wrong and the kernel
can't cope.

But by not having a dump, we lose out on real bugs that are stopped by the
panics. The most common one that I come across is a fault in the interrupt
handler of some device. This produces a panic screaming killing an
interrupt. But without a trace, we have no idea where that bug occurred.

So do we really want to play shrink and say it's better to leave out the
back trace because we might scare users, or should we keep it for the few
times it really does help?

IMHO I believe we should do the dump_stack now, and come up with another
solution to the mount root panic.

-- Steve

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/