Message-ID: <62686ee8-4d67-753d-3442-5456a2ba7076@cn.fujitsu.com>
Date: Wed, 14 Feb 2018 11:22:14 +0800
From: Dou Liyang <douly.fnst@...fujitsu.com>
To: "Eric W. Biederman" <ebiederm@...ssion.com>
CC: Baoquan He <bhe@...hat.com>, <linux-kernel@...r.kernel.org>,
<mingo@...nel.org>, <tglx@...utronix.de>, <x86@...nel.org>,
<joro@...tes.org>, <uobergfe@...hat.com>, <prarit@...hat.com>
Subject: Re: [PATCH v3 2/5] x86/apic: Fix restoring boot irq mode in reboot
and kexec/kdump
Hi Eric,
At 02/14/2018 01:40 AM, Eric W. Biederman wrote:
> Dou Liyang <douly.fnst@...fujitsu.com> writes:
>
>> Hi Baoquan,
>>
>> At 02/12/2018 11:08 AM, Eric W. Biederman wrote:
>>> Baoquan He <bhe@...hat.com> writes:
>>>
>>>> This is a regression fix.
>>>>
>>>> Before, to fix erratum AVR31, commit 522e66464467 ("x86/apic: Disable
>>>> I/O APIC before shutdown of the local APIC") moved the lapic_shutdown()
>>>> call after disable_IO_APIC().  This introduced a regression.  The
>>>> root cause is that disable_IO_APIC() not only clears the IO-APIC, it
>>>> also restores boot irq mode by setting up the LAPIC/APIC/IMCR, so
>>>> calling lapic_shutdown() after disable_IO_APIC() disables the LAPIC
>>>> and ruins the virtual wire mode setting which the code has been
>>>> trying to establish all along.
>>>>
>>>> The consequence is that in a KVM guest the kernel always prints the
>>>> warning below during kexec/kdump kernel boot.  That happens in
>>>> setup_local_APIC() since the 'do { xxx } while (queued && max_loops > 0)'
>>>> loop does not function
>>
>> There is another thing I am not sure about.
>>
>> AFAIK, according to the order of SMP machine shutdown, the other CPUs
>> are stopped first, then the last CPU disables its local APIC.
>>
>> --machine_shutdown
>> |----......
>> |----stop_other_cpus()
>> |----lapic_shutdown()
>>
>> So, the pending interrupts exist only on the BSP and can only be ACKed
>> by the BSP.  Is that right?  (I validated this by printing the value of
>> APIC_IRR/ISR on all CPUs and found that only the BSP had non-zero
>> values.)
>
> We don't know.  In the kexec on panic case we try to shut down the
> other cpus, but we have a timeout because we might fail.
>
> Further, you have to be careful with the concept of the boot cpu.  In
> the normal kexec case we shut down on the boot cpu and leave it
> running.  In the kexec on panic case we shut down on an arbitrary cpu.
>
>> If that is right, we do not need to check for pending interrupts on
>> each cpu.
>
> It is also cheap if there are no pending interrupts as there is nothing
> to do.
>
Yes, it's cheap.
But the local APICs in the APs were disabled at that point, and I am not
sure whether writing the end-of-interrupt (EOI) register would still
cause the local APIC to clear the ISR successfully.
If the ISR is not cleared, doing that check is useless.
I also can't produce any case in which the local APICs in the APs have
pending interrupts.  Do you have any suggestions?
>> BTW, the pending interrupt check code is mixed in with the local
>> APIC setup code, which looks messy.  How about extracting the code
>> related to the crash interrupt check, putting it into a new function,
>> and invoking it only when the CPU is the BSP?
>
> Moving it into its own function makes sense.  Let's not taint the
> concept with ``crash''. We don't know that the only way this will
> ever happen is from kexec on panic. We only know it was easy to
> trigger the condition from kexec on panic.
>
Yes, I see.
> There are a lot of cases I can think of in which interrupts might fire
> while interrupts are disabled because the kernel is booting.  A normal
> kexec is also possible.
>
> Throw in MSI interrupts and transitioning from one state to another in
> non-legacy apic mode and we might be more likely to get some irqs in
> a pending state.
>
> Your patch makes me nervous as it is not just code motion but a much
> more substantial change.
>
>
> As much as I agree that we need to fix the regression in the apic
> shutdown code that is causing problems, what we really need to do
> is to completely remove the apic shutdown code from the kexec on panic
> code path.  Over the long term that will provide us with a much more

Wow, indeed, and it is related to many hypervisors on x86.  For me, it
still needs more investigation and testing.
Thanks,
dou