[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <04EAB7311EE43145B2D3536183D1A84454A3DF12@GSjpTKYDCembx31.service.hitachi.net>
Date: Thu, 3 Dec 2015 11:29:21 +0000
From: 河合英宏 / KAWAI,HIDEHIRO
<hidehiro.kawai.ez@...achi.com>
To: "'Borislav Petkov'" <bp@...en8.de>
CC: Jonathan Corbet <corbet@....net>,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...nel.org>,
"Eric W. Biederman" <ebiederm@...ssion.com>,
"H. Peter Anvin" <hpa@...or.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Thomas Gleixner <tglx@...utronix.de>,
Vivek Goyal <vgoyal@...hat.com>, Baoquan He <bhe@...hat.com>,
"linux-doc@...r.kernel.org" <linux-doc@...r.kernel.org>,
"x86@...nel.org" <x86@...nel.org>,
"kexec@...ts.infradead.org" <kexec@...ts.infradead.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Michal Hocko <mhocko@...nel.org>,
平松雅巳 / HIRAMATU,MASAMI
<masami.hiramatsu.pt@...achi.com>
Subject: RE: [V5 PATCH 3/4] kexec: Fix race between panic() and
crash_kexec() called directly
> On Thu, Dec 03, 2015 at 02:01:38AM +0000, 河合英宏 / KAWAI,HIDEHIRO wrote:
> > > On Wed, Dec 02, 2015 at 11:57:38AM +0000, 河合英宏 / KAWAI,HIDEHIRO wrote:
> > > > We can do so, but I think resetting panic_cpu always would be
> > > > simpler and safer.
> >
> > I'll state in detail.
> >
> > When we call crash_kexec() without entering panic() and return from
> > it, panic() should be called eventually.
>
> Huh, the call chain is
>
> panic->crash_kexec
>
> Or do you mean, when crash_kexec() is not called by panic() but by some
> of its other callers?
I was arguing about the case of oops_end --> crash_kexec
--> return from crash_kexec because of !kexec_crash_image -->
panic.
In the case of panic --> __crash_kexec, __crash_kexec is called
only once, so we don't need to check the return value of __crash_kexec
as you suggested. So I thought you stated about crash_kexec --> panic
case.
> > But the code paths are a bit complicated and there are many
> > implementations for each architecture. So one day, this assumption may
> > be broken; the CPU doesn't call panic(). Or the CPU may fail to call
> > panic() because we are already in insane state. It would be nervous,
> > but allowing another CPU to process panic routines by resetting
> > panic_cpu is safer approach.
>
> My suggestion was to do this only on the panic path - not necessarily on
> the others.
>
> > Since this code is executed only once due to panic_cpu,
> > I think introducing this logic is not much valuable.
> > Also, current implementation is already quite simple:
> >
> > panic()
> > {
> > ...
> > __crash_kexec(NULL) {
> > if (mutex_trylock(&kexec_mutex)) {
> > if (kexec_crash_image) {
> > /* don't return */
> > }
>
> I don't mean the kexec_crash_image case - I mean the opposite one:
> !kexec_crash_image.
I also mentioned !kexec_crash_image case...
> And I think I know now what you're trying to tell
> me: the first CPU which hits panic, will finish panic eventually and so
> it will take down the machine.
No. The first CPU calls panic, and then it calls __crash_kexec.
Because of !kexec_crash_image, it returns from __crash_kexec and
continues to the panic procedure. At the same time, another CPU
tries to call panic(), but it doesn't run the panic procedure;
panic_cpu prevents the second CPU from running it.
This means __crash_kexec is called only once even if we don't
check the return value of __crash_kexec.
(Please note that crash_kexec can be called multiple times in the
case of oops_end() --> crash_kexec().)
I'm sorry I couldn't tell my thought well.
Regards,
--
Hidehiro Kawai
Hitachi, Ltd. Research & Development Group
Powered by blists - more mailing lists