Message-ID: <20150714175527.GI10792@redhat.com>
Date:	Tue, 14 Jul 2015 13:55:27 -0400
From:	Vivek Goyal <vgoyal@...hat.com>
To:	dwalker@...o99.com
Cc:	"Eric W. Biederman" <ebiederm@...ssion.com>,
	Hidehiro Kawai <hidehiro.kawai.ez@...achi.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	linux-mips@...ux-mips.org, Baoquan He <bhe@...hat.com>,
	linux-sh@...r.kernel.org, linux-s390@...r.kernel.org,
	kexec@...ts.infradead.org, linux-kernel@...r.kernel.org,
	Ingo Molnar <mingo@...nel.org>,
	HATAYAMA Daisuke <d.hatayama@...fujitsu.com>,
	Masami Hiramatsu <masami.hiramatsu.pt@...achi.com>,
	linuxppc-dev@...ts.ozlabs.org, linux-metag@...r.kernel.org,
	linux-arm-kernel@...ts.infradead.org
Subject: Re: [PATCH 1/3] panic: Disable crash_kexec_post_notifiers if kdump
 is not available
On Tue, Jul 14, 2015 at 05:29:53PM +0000, dwalker@...o99.com wrote:
[..]
> > >> > If a machine is failing, there is a high chance it can't deliver you the
> > >> > notification. Detecting that failure using some kind of polling mechanism
> > >> > might be more reliable. And it will make even the kdump mechanism more
> > >> > reliable so that it does not have to run panic notifiers after the crash.
> > >> 
> > >> I think what you're suggesting is that my company should change how its hardware works
> > >> and that's not really an option for me. This isn't a simple thing like checking over the
> > >> network if the machine is down or not, this is a way more complex hardware design.
> > >
> > > That means you are ready to live with an unreliable design. There might be
> > > cases where the notifier does not get run properly and you will not do the switch
> > > despite the fact that the OS has failed. I was just trying to nudge you in
> > > a direction which could be a more reliable mechanism.
> > 
> > Sigh, I see some deep confusion going on here.
> > 
> > The panic notifiers are just that: panic notifiers.  They have not been,
> > nor should they be, tied to kexec.  If those notifiers force a switchover
> > between machines I fail to see why you would care if it was
> > kexec or another panic situation that is forcing that switchover.
> 
> Hidehiro isn't fixing the failover situation on my side, he's fixing register
> information collection when crash_kexec_post_notifiers is used.

Sure. Given that we have created this new parameter, let us fix it so that
we can capture the register state of the other CPUs in the crash dump.

I am a little disappointed that it was not tested well when this parameter was
introduced. We should have at least tested it enough to see whether
proper CPU state is present for all CPUs in the crash dump.

At that point in time it looked like a simple modification
to allow panic notifiers before crash_kexec().

Thanks
Vivek