[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1470305688.2274.85.camel@localhost>
Date: Thu, 4 Aug 2016 10:17:59 +0000
From: "Wei, Jiangang" <weijg.fnst@...fujitsu.com>
To: "ebiederm@...ssion.com" <ebiederm@...ssion.com>
CC: "kexec@...ts.infradead.org" <kexec@...ts.infradead.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"Cao, Jin" <caoj.fnst@...fujitsu.com>,
"tglx@...utronix.de" <tglx@...utronix.de>,
"bhe@...hat.com" <bhe@...hat.com>,
"xpang@...hat.com" <xpang@...hat.com>,
"kernel@...p.com" <kernel@...p.com>,
"x86@...nel.org" <x86@...nel.org>, "hpa@...or.com" <hpa@...or.com>,
"mingo@...hat.com" <mingo@...hat.com>,
"joro@...tes.org" <joro@...tes.org>
Subject: Re: [PATCH v2 3/3] x86/apic: Improved the setting of interrupt mode
for bsp
Hi Eric,
I have several questions about kdump and APIC mode.
specific issues at the end of the mail.
On Tue, 2016-08-02 at 09:26 -0500, Eric W. Biederman wrote:
> "Wei, Jiangang" <weijg.fnst@...fujitsu.com> writes:
>
> > Hi Eric,
> >
> > Thanks for your response.
> > But I have some different ideas...
>
> Apologies for not replying to this earlier your reply got lost in my
> spam folder and I overlooked it.
>
> > On Mon, 2016-07-25 at 22:53 -0500, Eric W. Biederman wrote:
> >> Wei Jiangang <weijg.fnst@...fujitsu.com> writes:
> >>
> >> > If we specify the 'notsc' parameter for the dump-capture kernel,
> >> > and then trigger a crash(panic) by using "ALT-SysRq-c" or
> >> > "echo c > /proc/sysrq-trigger", the dump-capture kernel will
> >> > hang in calibrate_delay_converge() and wait for jiffies changes.
> >> > serial log as follows:
> >> >
> >> > tsc: Fast TSC calibration using PIT
> >> > tsc: Detected 2099.947 MHz processor
> >> > Calibrating delay loop...
> >> >
> >> > The reason for jiffies not changes is there's no timer interrupt
> >> > passed to dump-capture kernel.
> >> >
> >> > In fact, once kernel panic occurs, the local APIC is disabled
> >> > by lapic_shutdown() in reboot path.
> >> > generly speaking, local APIC state can be initialized by BIOS
> >> > after Power-Up or Reset, which doesn't apply to kdump case.
> >> > so the kernel has to be responsible for initialize the interrupt
> >> > mode properly according the latest status of APIC in bootup path.
> >> >
> >> > An MP operating system is booted under either PIC mode or
> >> > virtual wire mode. Later, the operating system switches to
> >> > symmetric I/O mode as it enters multiprocessor mode.
> >> > Two kinds of virtual wire mode are defined in Intel MP spec:
> >> > virtual wire mode via local APIC or via I/O APIC.
> >> >
> >> > Now we determine the mode of APIC only through a SMP BIOS(MP table).
> >> > That's not enough. It's better to do further check if APIC works
> >> > with effective interrupt mode, and then, do some proper setting.
> >>
> >> Reading through the code let me pause a moment and say:
> >> "Yowzers the interrupt initialization code has gotten hard to follow. It
> >> is now full of indirection with ill defined semantics." pre_vector_init
> >> indeed.
> >>
> >> I will argue this is the wrong fix.
> >>
> >> We really should not have to worry about getting the system functional
> >> in virtual wire mode on a modern system. And looking at the code
> >> someone has done half the work and made it conditional under
> >> acpi_gbl_reduced_hardware.
> >>
> >> Now reduced hardware implies a bit more than we ware talking about but
> >> if there is ACPI apic information we should not need to worry about
> >> external interrupts and can just enable the apics.
> >>
> >> In fact I think having MPtable information is enough for that.
> >
> > In my opinion, The MP table is not to be trusted if system starts
> > without BIOS phrase.
> >
> > The chapter 3.8 System Initial State (MP v1.4 spec) said below:
> > "1. All local APICs are disabled, except for the local APIC of the BSP
> > if the system starts in Virtual Wire Mode."
> >
> > And
> > .....
> > "The BIOS must disable interrupts to all processors and set the APICs to
> > the system initial state before giving control to the operating system.
> > The operating system is responsible for initializing all of the APICs"
> >
> > The dump-capture kernel won't through the BIOS phrase,
> > so there is no guarantee of the interrupt mode of APIC and the status of
> > local APIC.
> >
> > Although the MP table is read-only,
> > the dump-capture kernel uses the MP table which be generated before the
> > first kernel boots up is unreasonable.
> > At least, only checking MP table is not enough.
> > That's the starting point of my patch.
> >
> > You said “this is the wrong fix”,
> > Is there any wrong with the codes or my solution to check the status of
> > APIC in bootup path?
>
> Today in a dump capture kernel it uses the MP table or the ACPI MP
> information to set up the apics.
>
> How I read your patch is you do something clever to get in virtual wire
> mode. AKA a historical PC/AT compatible mode. I think we should just
> skip that mode entirely.
>
> >> So I think what needs to happens is for the apic initialization to get
> >> an overhaul that makes apic initialization the happy path and the other
> >> irq controllers the odd backwards compatibility path. And when we
> >> are done we never run in anything except full apic mode unless the
> >> hardware doesn't support it.
> > The spec requests "An MP operating system is booted under either one of
> > the two PC/AT-compatible modes. Later the operating system switches to
> > Symmetric I/O Mode as it enters multiprocessor mode."
> > And it seems that the latest kernel follows the rule.
> > Does the apic initialization really need to be changed?
>
> For simplicity when MP support was first added, a few parts of early
> bootup were left handled in PC/AT compatible mode. Mostly it is just
> silly.
>
> Unless there is a good reason I think we should strive to remove those
> few moments when the system is using interrupts in PC/AT mode and always
> run interrupts in the final mode we intend to run interrupts in.
> In fact the code under acpi_gbl_reduced_hardware does that for some
> special cases already today.
>
> At the end of the day that should end up with a simpler more reliable
> kernel. Especially for the dump capture kernel.
>
> AKA:
> I am proposing make the code say something like:
> if (pc_at_compatible_pic_mode) {
> do_legacy_pic_mode();
> } else {
> do_mp_mode();
> }
>
> If my memory serves virtual wire mode does not need to be enabled at all
> in mp mode, so if we are coming from mp mode (with no shutdown of the
> apics in the first kernel). Then there will be no information anywhere
> about which apics need to be programmed into apic mode.
"mp mode" you mentioned means "Symmetric I/O Mode", right ?
>
> As such to simplify the kdump process it is the wrong process to go back
> into virtual wire PC/AT compatible pic mode. Because once we remove the
> code from the crashing kernel we won't have enough information to make
> it work. Unfortunately in some cases it is a one way transition.
You ever told me that the only reason we have do any apic shutdown is
bugs in older kernels. and looks like you're inclined to remove the
codes used to shutdown APIC.
That's an import start point you suggest me to enable "Symmetric I/O
Mode" at the bootup stage of kernel, right?
but In my opinion, the kdump-capture kernel don't need to run on MP
system, but only on BSP. generally, we use kdump with nr_cpus=1.
If only one cpu is enabled and used, do we still need to set "Symmetric
I/O Mode" for APIC ? I think the answer is no.
In other words, It's intended that go back into virtual wire PC/AT
compatible pic mode by shutdown APIC.
If there's something wrong with my understanding, Please correct me.
Thanks,
wei
>
> Eric
>
>
Powered by blists - more mailing lists