Message-ID: <20091012151431.GC14004@elte.hu>
Date: Mon, 12 Oct 2009 17:14:31 +0200
From: Ingo Molnar <mingo@...e.hu>
To: David Woodhouse <dwmw2@...radead.org>
Cc: Alan Cox <alan@...rguk.ukuu.org.uk>,
Simon Kagstrom <simon.kagstrom@...insight.net>,
Artem Bityutskiy <dedekind1@...il.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Andrew Morton <akpm@...ux-foundation.org>,
"Koskinen Aaro (Nokia-D/Helsinki)" <aaro.koskinen@...ia.com>,
linux-mtd <linux-mtd@...ts.infradead.org>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] panic.c: export panic_on_oops
* David Woodhouse <dwmw2@...radead.org> wrote:
> On Mon, 2009-10-12 at 16:26 +0200, Ingo Molnar wrote:
> > Not if the failure is say a s2ram hang that requires a power cycle.
> > Also there are certain classes of bugs that only occur on cold boot.
> > Plus there's the "need to unplug the battery to revive the system"
> > class of bugs (but they are rare).
>
> So you need to build in enough ECC to cope with the decay which
> happens when RAM isn't being refreshed for a few seconds... :)
[ hey, i think you should line up with BIOS writers at that wall ;-) ]
> > So i think the MTD / flash stuff is powerful.
>
> Yeah, definitely. I was just pointing out that we can actually do a
> lot better on today's commodity hardware too.
I wish it worked on any of the 10+ x86 systems i have. Is there anyone
who'd be interested in exploring whether warm BIOS reboots work
_anywhere_?
A simple patch would do: a new (default-off) CONFIG_DEBUG_ feature that
just puts a signature into a predictable spot in RAM, switches the reboot
method over to warm reboot (reboot=w), and prints some friendly "yay,
this BIOS rocks!" message if the signature is still there after a reboot
and not zeroed out.
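To make the signature idea concrete, here is a minimal userspace sketch, not a kernel patch. It assumes the "predictable spot in RAM" can be modelled as a plain array of 32-bit words; the names warm_sig_write(), warm_sig_survived(), WARM_SIG_WORDS and WARM_SIG_SEED are hypothetical and exist only for illustration. A position-dependent pattern is used instead of a single constant so that a zeroed, all-ones or partially decayed region is unlikely to pass by accident.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical values: size of the signature area and its seed ("WARM"). */
#define WARM_SIG_WORDS 8
#define WARM_SIG_SEED  0x5741524dU

/* Expected content of word i: the seed mixed with the word's position. */
static uint32_t warm_sig_word(size_t i)
{
	return WARM_SIG_SEED ^ ((uint32_t)i * 0x9e3779b9U);
}

/* Called before the warm reboot: stamp the signature into the region. */
void warm_sig_write(uint32_t *region)
{
	assert(region != NULL);
	for (size_t i = 0; i < WARM_SIG_WORDS; i++)
		region[i] = warm_sig_word(i);
}

/*
 * Called early in the next boot: returns 1 if every word is intact,
 * 0 if the BIOS zeroed the region or any bit decayed.
 */
int warm_sig_survived(const uint32_t *region)
{
	assert(region != NULL);
	for (size_t i = 0; i < WARM_SIG_WORDS; i++)
		if (region[i] != warm_sig_word(i))
			return 0;
	return 1;
}
```

In a real kernel the region would be a fixed physical address that the early boot code reads before anything overwrites it; this sketch only shows the write/verify logic.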
If that works _anywhere_ we could complete it: we could cache the dmesg
buffer address (__log_buf[]) across reboots (and maybe the printk tail
offset (log_end)), and that would be an _excellent_ debuggability
feature for a large class of otherwise undebuggable crashes ...
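As a rough illustration of what "caching" the buffer address and tail offset could look like, the sketch below defines a small header that records them, guards it with a checksum, and copies out the most recent bytes of a ring buffer. It is a userspace model under stated assumptions: struct log_hdr, LOG_HDR_MAGIC, log_hdr_seal(), log_hdr_valid() and log_tail_copy() are hypothetical names, the buffer length is assumed to be a power of two, and the tail offset is assumed to be a monotonically increasing count that is masked to index the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LOG_HDR_MAGIC 0x4c4f4742U	/* hypothetical marker, "LOGB" */

struct log_hdr {
	uint32_t magic;
	uint32_t log_buf_len;	/* ring buffer size, power of two */
	uint64_t log_buf_addr;	/* where the log buffer lived last boot */
	uint64_t log_end;	/* tail offset: total bytes ever written */
	uint32_t check;		/* inverted XOR of the fields above */
};

static uint32_t log_hdr_sum(const struct log_hdr *h)
{
	return h->magic ^ h->log_buf_len ^
	       (uint32_t)h->log_buf_addr ^ (uint32_t)(h->log_buf_addr >> 32) ^
	       (uint32_t)h->log_end ^ (uint32_t)(h->log_end >> 32);
}

/* Record the current buffer location and tail, then seal with a checksum. */
void log_hdr_seal(struct log_hdr *h, uint64_t addr, uint32_t len, uint64_t end)
{
	assert(h != NULL);
	assert(len != 0 && (len & (len - 1)) == 0);
	h->magic = LOG_HDR_MAGIC;
	h->log_buf_len = len;
	h->log_buf_addr = addr;
	h->log_end = end;
	h->check = ~log_hdr_sum(h);
}

/* After reboot: trust the header only if magic, length and checksum agree. */
int log_hdr_valid(const struct log_hdr *h)
{
	return h->magic == LOG_HDR_MAGIC &&
	       h->log_buf_len != 0 &&
	       (h->log_buf_len & (h->log_buf_len - 1)) == 0 &&
	       h->check == ~log_hdr_sum(h);
}

/* Copy up to n of the newest bytes, oldest first; returns bytes copied. */
size_t log_tail_copy(const char *buf, uint32_t len, uint64_t end,
		     char *out, size_t n)
{
	size_t avail = end < len ? (size_t)end : len;

	if (n > avail)
		n = avail;
	for (size_t i = 0; i < n; i++)
		out[i] = buf[(end - n + i) & (len - 1)];
	return n;
}
```

The checksum only detects a stale or partly decayed header; it says nothing about whether the log bytes themselves survived, which would need a separate check.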
We could use that to preserve a kernel function trace (or a branch
execution hardware trace using BTS on Intel CPUs) across crashes, etc.
etc.
Ingo