[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1b716b14c52a89c254108df1cee85306c786c52b.camel@intel.com>
Date: Tue, 2 Jul 2024 23:55:24 +0000
From: "Huang, Kai" <kai.huang@...el.com>
To: "bp@...en8.de" <bp@...en8.de>
CC: "dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
"rafael@...nel.org" <rafael@...nel.org>, "hpa@...or.com" <hpa@...or.com>,
"mingo@...hat.com" <mingo@...hat.com>, "tglx@...utronix.de"
<tglx@...utronix.de>, "kirill.shutemov@...ux.intel.com"
<kirill.shutemov@...ux.intel.com>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, "Li, Zhiquan1" <zhiquan1.li@...el.com>,
"x86@...nel.org" <x86@...nel.org>, "linux-acpi@...r.kernel.org"
<linux-acpi@...r.kernel.org>
Subject: Re: [PATCH v3] x86/acpi: fix panic while AP online later with kernel
parameter maxcpus=1
On Tue, 2024-07-02 at 14:45 +0200, Borislav Petkov wrote:
> On Tue, Jul 02, 2024 at 12:05:38PM +0000, Huang, Kai wrote:
> > On Tue, 2024-07-02 at 08:58 +0800, Zhiquan Li wrote:
> > > The issue was found on the platform that using "Multiprocessor Wakeup
> > > Structure"[1] to startup secondary CPU, which is usually used by
> > > encrypted guest. When restrict boot time CPU to 1 with the kernel
> > > parameter "maxcpus=1" and bring other CPUs online later, there will be
> > > a kernel panic.
> > >
> > > The variable acpi_mp_wake_mailbox, which holds the virtual address of
> > > the MP Wakeup Structure mailbox, will be set as read-only after init.
> > > If the first AP gets online later, after init, the attempt to update
> > > the variable results in panic.
> > >
> > > The memremap() call that initializes the variable cannot be moved into
> > > acpi_parse_mp_wake() because memremap() is not functional at that point
> > > in the boot process.
> > >
> > > [1] Details about the MP Wakeup structure can be found in ACPI v6.4, in
> > > the "Multiprocessor Wakeup Structure" section.
> > >
> > > Signed-off-by: Zhiquan Li <zhiquan1.li@...el.com>
> > > Reviewed-by: Kirill A. Shutemov <kirill.shutemov@...ux.intel.com>
> >
> > Seems this changelog only mentions the problem, but doesn't say how to fix:
> >
> > Remove the __ro_after_init annotation of acpi_mp_wake_mailbox to fix.
>
> Do not talk about *what* the patch is doing in the commit message - that
> should be obvious from the diff itself. Rather, concentrate on the *why*
> it needs to be done.
>
> Imagine one fine day you're doing git archeology, you find the place in
> the code about which you want to find out why it was changed the way it
> is now.
>
> You do git annotate <filename> ... find the line, see the commit id and
> you do:
>
> git show <commit id>
>
> You read the commit message and there's just gibberish and nothing's
> explaining *why* that change was done. And you start scratching your
> head, trying to figure out why. Because the damn commit message is worth
> sh*t.
Yeah fully agree. Thanks for saying this again.
>
> This happens to us maintainers at least once a week. Well, I don't want
> that to happen in my tree anymore.
>
> So none of this text above still doesn't explain to me *why* this is
> happening.
>
> Why do APs need to update acpi_mp_wake_mailbox?
They don't need to if acpi_mp_wake_mailbox can be setup before smp_init()
once for all.
But currently the setup of acpi_mp_wake_mailbox is done when the first AP is
brought up because memremap() doesn't work in acpi_parse_mp_wake(), as
mentioned in the changelog of this patch.
I also feel it's not ideal to setup acpi_mp_wake_mailbox when bringing up
the first AP, so I provided my diff. IIUC, if memremap() works for
acpi_mp_wake_mailbox when bringing up the first AP, then it should also work
in
the early_initcall().
>
> Which patch is this fixing?
It fiexes below commit AFAICT:
24dd05da8c79 ("x86/apic: Mark acpi_mp_wake_* variables as
__ro_after_init")
Which didn't consider 'maxvcpus=xx' case.
But I will leave to Kirill to confirm.
Powered by blists - more mailing lists