linux-kernel - Re: [patch 00/37] cpu/hotplug, x86: Reworked parallel CPU bringup

Message-ID: <5f5c9395-f9e0-cb9c-4929-cc0134f9b895@citrix.com>
Date:   Mon, 17 Apr 2023 11:44:06 +0100
From:   Andrew Cooper <andrew.cooper3@...rix.com>
To:     Peter Zijlstra <peterz@...radead.org>,
        Thomas Gleixner <tglx@...utronix.de>
Cc:     LKML <linux-kernel@...r.kernel.org>, x86@...nel.org,
        David Woodhouse <dwmw2@...radead.org>,
        Brian Gerst <brgerst@...il.com>,
        Arjan van de Veen <arjan@...ux.intel.com>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Paul McKenney <paulmck@...nel.org>,
        Tom Lendacky <thomas.lendacky@....com>,
        Sean Christopherson <seanjc@...gle.com>,
        Oleksandr Natalenko <oleksandr@...alenko.name>,
        Paul Menzel <pmenzel@...gen.mpg.de>,
        "Guilherme G. Piccoli" <gpiccoli@...lia.com>,
        Piotr Gorski <lucjan.lucjanov@...il.com>,
        David Woodhouse <dwmw@...zon.co.uk>,
        Usama Arif <usama.arif@...edance.com>,
        Juergen Gross <jgross@...e.com>,
        Boris Ostrovsky <boris.ostrovsky@...cle.com>,
        xen-devel@...ts.xenproject.org,
        Russell King <linux@...linux.org.uk>,
        Arnd Bergmann <arnd@...db.de>,
        linux-arm-kernel@...ts.infradead.org,
        Catalin Marinas <catalin.marinas@....com>,
        Will Deacon <will@...nel.org>, Guo Ren <guoren@...nel.org>,
        linux-csky@...r.kernel.org,
        Thomas Bogendoerfer <tsbogend@...ha.franken.de>,
        linux-mips@...r.kernel.org,
        "James E.J. Bottomley" <James.Bottomley@...senpartnership.com>,
        Helge Deller <deller@....de>, linux-parisc@...r.kernel.org,
        Paul Walmsley <paul.walmsley@...ive.com>,
        Palmer Dabbelt <palmer@...belt.com>,
        linux-riscv@...ts.infradead.org,
        Mark Rutland <mark.rutland@....com>,
        Sabin Rapan <sabrapan@...zon.com>
Subject: Re: [patch 00/37] cpu/hotplug, x86: Reworked parallel CPU bringup

On 17/04/2023 11:30 am, Peter Zijlstra wrote:
> On Sat, Apr 15, 2023 at 01:44:13AM +0200, Thomas Gleixner wrote:
>
>> Background
>> ----------
>>
>> The reason why people are interested in parallel bringup is to shorten
>> the (kexec) reboot time of cloud servers to reduce the downtime of the
>> VM tenants. There are obviously other interesting use cases for this
>> like VM startup time, embedded devices...
> ...
>
>>   There are two issues there:
>>
>>     a) The death by MCE broadcast problem
>>
>>        Quite some (contemporary) x86 CPU generations are affected by
>>        this:
>>
>>          - MCE can be broadcasted to all CPUs and not only issued locally
>>            to the CPU which triggered it.
>>
>>          - Any CPU which has CR4.MCE == 0, even if it sits in a wait
>>            for INIT/SIPI state, will cause an immediate shutdown of the
>>            machine if a broadcasted MCE is delivered.
> When doing kexec, CR4.MCE should already have been set to 1 by the prior
> kernel, no?

No(ish).  Purgatory can't take #MC, or NMIs for that matter.

It's cleaner to explicitly disable CR4.MCE and let the system reset
(with all the MC banks properly preserved), than it is to take #MC while
the IDT isn't in sync with the handlers, and wander off into the weeds.
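
As a rough illustration of the bit manipulation involved, here is a minimal
sketch that operates on a plain integer rather than the real control
register. CR4.MCE is bit 6 of CR4; the helper names below are hypothetical
and are not taken from the kernel or from this patch series.

```c
#include <stdbool.h>
#include <stdint.h>

/* CR4.MCE (Machine-Check Enable) is bit 6 of CR4. */
#define CR4_MCE (UINT64_C(1) << 6)

/*
 * Hypothetical helper: given the current CR4 value, return the value to
 * load before jumping to purgatory. Only the MCE bit is cleared, so a
 * machine check in that window shuts the machine down instead of
 * vectoring through an IDT that no longer matches the handlers.
 */
static inline uint64_t cr4_for_purgatory(uint64_t cr4)
{
	return cr4 & ~CR4_MCE;
}

/* Hypothetical helper: report whether #MC delivery is enabled. */
static inline bool cr4_mce_enabled(uint64_t cr4)
{
	return (cr4 & CR4_MCE) != 0;
}
```

In real code the value would be read from and written back to the register
with privileged instructions; that part is deliberately left out here.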

~Andrew