[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <04EAB7311EE43145B2D3536183D1A8445493C80C@GSjpTKYDCembx31.service.hitachi.net>
Date: Sat, 22 Aug 2015 01:43:00 +0000
From: 河合英宏 / KAWAI,HIDEHIRO
<hidehiro.kawai.ez@...achi.com>
To: "'Peter Zijlstra'" <peterz@...radead.org>
CC: Jonathan Corbet <corbet@....net>, Ingo Molnar <mingo@...nel.org>,
"Eric W. Biederman" <ebiederm@...ssion.com>,
"H. Peter Anvin" <hpa@...or.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Thomas Gleixner <tglx@...utronix.de>,
Vivek Goyal <vgoyal@...hat.com>,
"linux-doc@...r.kernel.org" <linux-doc@...r.kernel.org>,
"x86@...nel.org" <x86@...nel.org>,
"kexec@...ts.infradead.org" <kexec@...ts.infradead.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Michal Hocko <mhocko@...nel.org>,
Ingo Molnar <mingo@...hat.com>,
平松雅巳 / HIRAMATU,MASAMI
<masami.hiramatsu.pt@...achi.com>
Subject: RE: [V3 PATCH 2/4] panic/x86: Allow cpus to save registers even if
they are looping in NMI context
> From: Peter Zijlstra [mailto:peterz@...radead.org]
>
> On Thu, Aug 06, 2015 at 02:45:43PM +0900, Hidehiro Kawai wrote:
> > When cpu-A panics on NMI just after cpu-B has panicked, cpu-A loops
> > infinitely in NMI context. Especially for x86, cpu-B issues NMI IPI
> > to other cpus to save their register states and do some cleanups if
> > kdump is enabled, but cpu-A can't handle the NMI and fails to save
> > register states.
> >
> > To solve thie issue, we wait for the timing of the NMI IPI, then
> > call the NMI handler which saves register states.
>
> Sorry, I don't follow, what?
First, a subroutine of crash_kexec(), nmi_shootdown_cpus()
send NMI IPI to non-panic cpus to stop them while saving their
registers ans doing some cleanups for crash dumping. So if a non-panic
cpu is looping in NMI context infinitely at that time, we fail to save
its register information and lose the information from the crash dump.
`Infinite loop in NMI context' can happen when panic on NMI is about
to happen while another cpu has already been processing panic().
To save regs and do some cleanups in that case too, this patch does
two things:
1. Moves the timing of `infinite loop in NMI context' (actually
panic_smp_self_stop()) outside of panic() to keep the pt_regs object
2. call a callback of nmi_shootdown_cpus() directly to save regs and
do some cleanups after setting waiting_for_crash_ipi which is used
for counting down the number of cpus which handled the callback
Does that answer your question?
Regards,
Hidehiro Kawai
Hitachi, Ltd. Research & Development Group
Powered by blists - more mailing lists