Date:	Thu, 1 Oct 2015 07:01:50 +0000
From:	河合英宏 / KAWAI,HIDEHIRO 
	<hidehiro.kawai.ez@...achi.com>
To:	"'Peter Zijlstra'" <peterz@...radead.org>
CC:	Jonathan Corbet <corbet@....net>, Ingo Molnar <mingo@...nel.org>,
	"Eric W. Biederman" <ebiederm@...ssion.com>,
	"H. Peter Anvin" <hpa@...or.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Vivek Goyal <vgoyal@...hat.com>,
	"linux-doc@...r.kernel.org" <linux-doc@...r.kernel.org>,
	"x86@...nel.org" <x86@...nel.org>,
	"kexec@...ts.infradead.org" <kexec@...ts.infradead.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Michal Hocko <mhocko@...nel.org>,
	Ingo Molnar <mingo@...hat.com>,
	平松雅巳 / HIRAMATU,MASAMI 
	<masami.hiramatsu.pt@...achi.com>
Subject: RE: [V4 PATCH 4/4] x86/apic: Introduce noextnmi boot option

> On Thu, Oct 01, 2015 at 02:33:18AM +0000, 河合英宏 / KAWAI,HIDEHIRO wrote:
> > > On Fri, Sep 25, 2015 at 08:28:11PM +0900, Hidehiro Kawai wrote:
> > > > This patch introduces new boot option "noextnmi" which disables
> > > > external NMI.  This option is useful for the dump capture kernel
> > > > so that an HA application or administrator wouldn't mistakenly
> > > > shoot down the kernel by NMI.
> > >
> > > So that they can get really stuck when the crash kernel crashes, right?
> > > ;-)
> >
> > No, it is different from my intention.
> >
> > `mistakenly' in the above means; they issue NMI due to a misconception
> > that the monitored host is stuck in the 1st kernel while it is actually
> > taking a crash dump in the 2nd kernel.  To avoid this kind of accident,
> > there is a tool such as fence_kdump which notifies "I'm taking a crash
> > dump, so don't send NMI" to the HA clustering software.  However, there
> > is a time window between kernel panic and the notification.
> >
> > "noextnmi" allows users to avoid this kind of accident all the time of
> > 2nd kernel.
> 
> Yes yes, I understand. But if the crash kernel also gets stuck they have
> no means of recovery, right? (other than power cycling the hardware)

Yes, but I don't think that is a big problem.

I assume that a server which uses this feature will be equipped with a
BMC, and that the BMC is required to support a hard reset command for
the server.  If the HA clustering software gets no response from the
server after a relatively long timeout, it may want to issue a hard
reset to the server via IPMI over LAN.
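
As a rough illustration of that fallback, here is a minimal sketch of
the decision on the HA clustering side.  The function names, the
timeout handling and the password-file argument are hypothetical and
not taken from any particular HA product; the only external piece
assumed is the ipmitool CLI with its "chassis power reset" command.

```python
import subprocess
from enum import Enum


class Action(Enum):
    WAIT = "wait"              # the 2nd kernel may still be taking the dump
    HARD_RESET = "hard_reset"  # give up and reset the server through the BMC


def decide(seconds_since_last_response, dump_timeout_seconds):
    """Hypothetical policy: never send an NMI, only wait or hard reset."""
    if seconds_since_last_response < dump_timeout_seconds:
        return Action.WAIT
    return Action.HARD_RESET


def ipmi_reset_command(bmc_host, user, password_file):
    """Build the ipmitool command line for a hard reset over IPMI over LAN."""
    return ["ipmitool", "-I", "lanplus", "-H", bmc_host,
            "-U", user, "-f", password_file,
            "chassis", "power", "reset"]


def hard_reset(bmc_host, user, password_file):
    """Run the reset; raises CalledProcessError if ipmitool fails."""
    subprocess.run(ipmi_reset_command(bmc_host, user, password_file),
                   check=True)
```

The timeout would have to be longer than the worst-case time needed to
capture a crash dump, otherwise the reset would cut the dump short in
the same way an NMI would.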

> Just playing devils advocate here, I don't actually object to the patch.

Regards,

Hidehiro Kawai
Hitachi, Ltd. Research & Development Group


