Message-ID: <5215CDEF.30004@hitachi.com>
Date: Thu, 22 Aug 2013 17:38:07 +0900
From: Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@...achi.com>
To: Don Zickus <dzickus@...hat.com>,
"Eric W. Biederman" <ebiederm@...ssion.com>
Cc: Ingo Molnar <mingo@...nel.org>, linux-kernel@...r.kernel.org,
Andi Kleen <ak@...ux.intel.com>,
"H. Peter Anvin" <hpa@...or.com>, Gleb Natapov <gleb@...hat.com>,
Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>,
Joerg Roedel <joro@...tes.org>, x86@...nel.org,
stable@...r.kernel.org, Marcelo Tosatti <mtosatti@...hat.com>,
Hidehiro Kawai <hidehiro.kawai.ez@...achi.com>,
Sebastian Andrzej Siewior <sebastian@...akpoint.cc>,
Ingo Molnar <mingo@...hat.com>,
Zhang Yanfei <zhangyanfei@...fujitsu.com>,
yrl.pp-manager.tt@...achi.com,
Masami Hiramatsu <masami.hiramatsu.pt@...achi.com>,
Thomas Gleixner <tglx@...utronix.de>,
Seiji Aguchi <seiji.aguchi@....com>,
Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: Re: [PATCH] [BUGFIX] crash/ioapic: Prevent crash_kexec() from
deadlocking of ioapic_lock
(2013/08/20 23:27), Don Zickus wrote:
> On Tue, Aug 20, 2013 at 03:12:32AM -0700, Eric W. Biederman wrote:
>> Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@...achi.com> writes:
>>
>>> Hi Ingo,
>>>
>>> Thank you for fixing typos!
>>> OK, I'll fix them and rename to ioapic_zap_locks().
>>>
>>> Thank you again!
>>
>>
>> The better fix for this would be to remove the disable_IO_APIC call from
>> crash_kexec.
>>
>> I know last time it was investigated the kernel was very close to
>> working without needing that, and the code will be much more robust in
>> the long term if we can avoid disabling them in the crashing kernel.
>>
>> Yoshihiro is there any chance you can look into removing the
>> disable_IO_APIC entirely?
>>
>> The apic disablement and the disable_IO_APIC exists entirely due to
>> limitations in the kernel boot path.
>
> Yup. We went down this path a year ago:
>
> https://lkml.org/lkml/2012/2/2/331
>
> Then we got sidetracked and talked about removing the lapic stuff at
> shutdown too:
>
> http://lists.infradead.org/pipermail/kexec/2012-February/006017.html
> (sorry couldn't find lkml link for some reason)
>
> And the second patch was committed.
>
> However, it was quickly reverted when Yinghai Lu noticed a problem:
>
> https://lkml.org/lkml/2012/2/11/143
>
> The problem stemmed from the fact that the nmi_watchdog caused an NMI in
> the middle of transitioning between the two kernels (we didn't shutdown
> the lapic) and caused a reset (there is no NMI handler in purgatory).
>
> I think I dropped the ball in investigating how to write an idt for the
> purgatory code to handle spurious NMIs.
>
> Regardless of all that, I think if we stick to just removing the ioapic
> shutdown code (ie the first patch linked above), we should be ok. I
> believe my testing went smoothly. It was the lapic stuff that needed more
> tweaking.
>
> So, I agree with Eric, let's remove the disable_IO_APIC() stuff and keep
> the code simpler.
Thank you for commenting on my patch.
I didn't know you had already submitted patches for this deadlock
problem.

I can't say definitively right now that removing disable_IO_APIC()
causes no problems. However, my patch should work well (and has
already been merged into the -tip tree). So how about taking my patch
first, and then discussing the removal of disable_IO_APIC()?

Thanks,
Yoshihiro YUNOMAE
--
Yoshihiro YUNOMAE
Software Platform Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: yoshihiro.yunomae.ez@...achi.com