Message-ID: <20130820142740.GO239280@redhat.com>
Date: Tue, 20 Aug 2013 10:27:40 -0400
From: Don Zickus <dzickus@...hat.com>
To: "Eric W. Biederman" <ebiederm@...ssion.com>
Cc: Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@...achi.com>,
Ingo Molnar <mingo@...nel.org>, linux-kernel@...r.kernel.org,
Andi Kleen <ak@...ux.intel.com>,
"H. Peter Anvin" <hpa@...or.com>, Gleb Natapov <gleb@...hat.com>,
Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>,
Joerg Roedel <joro@...tes.org>, x86@...nel.org,
stable@...r.kernel.org, Marcelo Tosatti <mtosatti@...hat.com>,
Hidehiro Kawai <hidehiro.kawai.ez@...achi.com>,
Sebastian Andrzej Siewior <sebastian@...akpoint.cc>,
Ingo Molnar <mingo@...hat.com>,
Zhang Yanfei <zhangyanfei@...fujitsu.com>,
yrl.pp-manager.tt@...achi.com,
Masami Hiramatsu <masami.hiramatsu.pt@...achi.com>,
Thomas Gleixner <tglx@...utronix.de>,
Seiji Aguchi <seiji.aguchi@....com>,
Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH] [BUGFIX] crash/ioapic: Prevent crash_kexec() from
deadlocking of ioapic_lock

On Tue, Aug 20, 2013 at 03:12:32AM -0700, Eric W. Biederman wrote:
> Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@...achi.com> writes:
>
> > Hi Ingo,
> >
> > Thank you for fixing typos!
> > OK, I'll fix them and rename to ioapic_zap_locks().
> >
> > Thank you again!
>
>
> The better fix for this would be to remove the disable_IO_APIC call from
> crash_kexec.
>
> I know last time it was investigated the kernel was very close to
> working without needing that, and the code will be much more robust in
> the long term if we can avoid disabling them in the crashing kernel.
>
> Yoshihiro is there any chance you can look into removing the
> disable_IO_APIC entirely?
>
> The apic disablement and the disable_IO_APIC exists entirely due to
> limitations in the kernel boot path.

Yup. We went down this path a year ago:

https://lkml.org/lkml/2012/2/2/331

Then we got sidetracked and talked about removing the lapic stuff at
shutdown too:

http://lists.infradead.org/pipermail/kexec/2012-February/006017.html
(sorry, couldn't find the lkml link for some reason)

And the second patch was committed.

However, it was quickly reverted when Yinghai Lu noticed a problem:

https://lkml.org/lkml/2012/2/11/143

The problem stemmed from the fact that the nmi_watchdog caused an NMI in
the middle of transitioning between the two kernels (we didn't shut down
the lapic) and caused a reset (there is no NMI handler in purgatory).

I think I dropped the ball in investigating how to write an idt for the
purgatory code to handle spurious NMIs.

Regardless of all that, I think if we stick to just removing the ioapic
shutdown code (i.e. the first patch linked above), we should be ok. I
believe my testing went smoothly. It was the lapic stuff that needed more
tweaking.

So, I agree with Eric, let's remove the disable_IO_APIC() stuff and keep
the code simpler.

Cheers,
Don