[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20230417103050.GF83892@hirez.programming.kicks-ass.net>
Date: Mon, 17 Apr 2023 12:30:50 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: LKML <linux-kernel@...r.kernel.org>, x86@...nel.org,
David Woodhouse <dwmw@...radead.org>,
Andrew Cooper <andrew.cooper3@...rix.com>,
Brian Gerst <brgerst@...il.com>,
Arjan van de Veen <arjan@...ux.intel.com>,
Paolo Bonzini <pbonzini@...hat.com>,
Paul McKenney <paulmck@...nel.org>,
Tom Lendacky <thomas.lendacky@....com>,
Sean Christopherson <seanjc@...gle.com>,
Oleksandr Natalenko <oleksandr@...alenko.name>,
Paul Menzel <pmenzel@...gen.mpg.de>,
"Guilherme G. Piccoli" <gpiccoli@...lia.com>,
Piotr Gorski <lucjan.lucjanov@...il.com>,
David Woodhouse <dwmw@...zon.co.uk>,
Usama Arif <usama.arif@...edance.com>,
Juergen Gross <jgross@...e.com>,
Boris Ostrovsky <boris.ostrovsky@...cle.com>,
xen-devel@...ts.xenproject.org,
Russell King <linux@...linux.org.uk>,
Arnd Bergmann <arnd@...db.de>,
linux-arm-kernel@...ts.infradead.org,
Catalin Marinas <catalin.marinas@....com>,
Will Deacon <will@...nel.org>, Guo Ren <guoren@...nel.org>,
linux-csky@...r.kernel.org,
Thomas Bogendoerfer <tsbogend@...ha.franken.de>,
linux-mips@...r.kernel.org,
"James E.J. Bottomley" <James.Bottomley@...senpartnership.com>,
Helge Deller <deller@....de>, linux-parisc@...r.kernel.org,
Paul Walmsley <paul.walmsley@...ive.com>,
Palmer Dabbelt <palmer@...belt.com>,
linux-riscv@...ts.infradead.org,
Mark Rutland <mark.rutland@....com>,
Sabin Rapan <sabrapan@...zon.com>
Subject: Re: [patch 00/37] cpu/hotplug, x86: Reworked parallel CPU bringup
On Sat, Apr 15, 2023 at 01:44:13AM +0200, Thomas Gleixner wrote:
> Background
> ----------
>
> The reason why people are interested in parallel bringup is to shorten
> the (kexec) reboot time of cloud servers to reduce the downtime of the
> VM tenants. There are obviously other interesting use cases for this
> like VM startup time, embedded devices...
...
> There are two issue there:
>
> a) The death by MCE broadcast problem
>
> Quite some (contemporary) x86 CPU generations are affected by
> this:
>
> - MCE can be broadcasted to all CPUs and not only issued locally
> to the CPU which triggered it.
>
> - Any CPU which has CR4.MCE == 0, even if it sits in a wait
> for INIT/SIPI state, will cause an immediate shutdown of the
> machine if a broadcasted MCE is delivered.
When doing kexec, CR4.MCE should already have been set to 1 by the prior
kernel, no?
Powered by blists - more mailing lists