Message-ID: <20150714172953.GA19135@fifo99.com>
Date: Tue, 14 Jul 2015 17:29:53 +0000
From: dwalker@...o99.com
To: "Eric W. Biederman" <ebiederm@...ssion.com>
Cc: Vivek Goyal <vgoyal@...hat.com>,
Hidehiro Kawai <hidehiro.kawai.ez@...achi.com>,
Andrew Morton <akpm@...ux-foundation.org>,
linux-mips@...ux-mips.org, Baoquan He <bhe@...hat.com>,
linux-sh@...r.kernel.org, linux-s390@...r.kernel.org,
kexec@...ts.infradead.org, linux-kernel@...r.kernel.org,
Ingo Molnar <mingo@...nel.org>,
HATAYAMA Daisuke <d.hatayama@...fujitsu.com>,
Masami Hiramatsu <masami.hiramatsu.pt@...achi.com>,
linuxppc-dev@...ts.ozlabs.org, linux-metag@...r.kernel.org,
linux-arm-kernel@...ts.infradead.org
Subject: Re: [PATCH 1/3] panic: Disable crash_kexec_post_notifiers if kdump is not available
On Tue, Jul 14, 2015 at 12:06:15PM -0500, Eric W. Biederman wrote:
> Vivek Goyal <vgoyal@...hat.com> writes:
>
> > On Tue, Jul 14, 2015 at 03:48:33PM +0000, dwalker@...o99.com wrote:
> >> On Tue, Jul 14, 2015 at 11:40:40AM -0400, Vivek Goyal wrote:
> >> > On Tue, Jul 14, 2015 at 03:34:30PM +0000, dwalker@...o99.com wrote:
> >> > > On Tue, Jul 14, 2015 at 11:02:08AM -0400, Vivek Goyal wrote:
> >> > > > On Tue, Jul 14, 2015 at 01:59:19PM +0000, dwalker@...o99.com wrote:
> >> > > > > On Mon, Jul 13, 2015 at 08:19:45PM -0500, Eric W. Biederman wrote:
> >> > > > > > dwalker@...o99.com writes:
> >> > > > > >
> >> > > > > > > On Fri, Jul 10, 2015 at 08:41:28AM -0500, Eric W. Biederman wrote:
> >> > > > > > >> Hidehiro Kawai <hidehiro.kawai.ez@...achi.com> writes:
> >> > > > > > >>
> >> > > > > > >> > You can call panic notifiers and kmsg dumpers before kdump by
> >> > > > > > >> > specifying "crash_kexec_post_notifiers" as a boot parameter.
> >> > > > > > >> > However, it doesn't make sense if kdump is not available. In that
> >> > > > > > >> > case, disable "crash_kexec_post_notifiers" boot parameter so that
> >> > > > > > >> > you can't change the value of the parameter.
> >> > > > > > >>
> >> > > > > > >> Nacked-by: "Eric W. Biederman" <ebiederm@...ssion.com>
> >> > > > > > >
> >> > > > > > > I think it would make sense if he just replaced "kdump" with "kexec".
> >> > > > > >
> >> > > > > > It would be less insane, however it still makes no sense as without
> >> > > > > > kexec on panic support crash_kexec is a noop. So the value of the
> >> > > > > > setting makes no difference.
> >> > > > >
> >> > > > > Can you explain more, I don't really understand what you mean. Are you suggesting
> >> > > > > the whole "crash_kexec_post_notifiers" feature has no value ?
> >> > > >
> >> > > > Daniel,
> >> > > >
> >> > > > BTW, why are you using crash_kexec_post_notifiers commandline? Why not
> >> > > > without it?
> >> > >
> >> > > It was explained in the prior thread but to rehash, the notifiers are used to do a switch
> >> > > over from the crashed machine to another redundant machine.
> >> >
> >> > So why not detect failure using polling, or issue notifications from the
> >> > second kernel?
> >> >
> >> > IOW, expecting that a crashed machine will be able to deliver notification
> >> > reliably is flawed to begin with, IMHO.
> >>
> >> It's flawed to think you can kexec, but you still do it right ? I've not gotten into
> >> the deep details of this switching process, but that's how this interface is used.
> >
> > Sure. But the deal here is that users of the interface know that sometimes
> > it can be unreliable. And in the absence of a more reliable mechanism, a
> > somewhat less reliable mechanism is fine.
> >
> >>
> >> > If a machine is failing, there is a high chance it can't deliver you the
> >> > notification. Detecting that failure using some kind of polling mechanism
> >> > might be more reliable. And it will make even kdump mechanism more
> >> > reliable so that it does not have to run panic notifiers after the crash.
> >>
> >> I think what you're suggesting is that my company should change how its hardware works,
> >> and that's not really an option for me. This isn't a simple thing like checking over the
> >> network if the machine is down or not; this is a far more complex hardware design.
> >
> > That means you are ready to live with an unreliable design. There might be
> > cases where the notifier does not get run properly and you will not do the
> > switch despite the fact that the OS has failed. I was just trying to nudge
> > you in a direction which could be a more reliable mechanism.
>
> Sigh. I see some deep confusion going on here.
>
> The panic notifiers are just that: panic notifiers. They have not been,
> nor should they be, tied to kexec. If those notifiers force a switchover
> between machines, I fail to see why you would care if it was
> kexec or another panic situation that is forcing that switchover.
Hidehiro isn't fixing the failover situation on my side; he's fixing register
information collection when crash_kexec_post_notifiers is used.
Daniel