Date: Tue, 2 Jul 2024 23:55:24 +0000
From: "Huang, Kai" <kai.huang@...el.com>
To: "bp@...en8.de" <bp@...en8.de>
CC: "dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
	"rafael@...nel.org" <rafael@...nel.org>, "hpa@...or.com" <hpa@...or.com>,
	"mingo@...hat.com" <mingo@...hat.com>, "tglx@...utronix.de"
	<tglx@...utronix.de>, "kirill.shutemov@...ux.intel.com"
	<kirill.shutemov@...ux.intel.com>, "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>, "Li, Zhiquan1" <zhiquan1.li@...el.com>,
	"x86@...nel.org" <x86@...nel.org>, "linux-acpi@...r.kernel.org"
	<linux-acpi@...r.kernel.org>
Subject: Re: [PATCH v3] x86/acpi: fix panic while AP online later with kernel
 parameter maxcpus=1

On Tue, 2024-07-02 at 14:45 +0200, Borislav Petkov wrote:
> On Tue, Jul 02, 2024 at 12:05:38PM +0000, Huang, Kai wrote:
> > On Tue, 2024-07-02 at 08:58 +0800, Zhiquan Li wrote:
> > > The issue was found on the platform that using "Multiprocessor Wakeup
> > > Structure"[1] to startup secondary CPU, which is usually used by
> > > encrypted guest.  When restrict boot time CPU to 1 with the kernel
> > > parameter "maxcpus=1" and bring other CPUs online later, there will be
> > > a kernel panic.
> > > 
> > > The variable acpi_mp_wake_mailbox, which holds the virtual address of
> > > the MP Wakeup Structure mailbox, will be set as read-only after init.
> > > If the first AP gets online later, after init, the attempt to update
> > > the variable results in panic.
> > > 
> > > The memremap() call that initializes the variable cannot be moved into
> > > acpi_parse_mp_wake() because memremap() is not functional at that point
> > > in the boot process.
> > > 
> > > [1] Details about the MP Wakeup structure can be found in ACPI v6.4, in
> > >     the "Multiprocessor Wakeup Structure" section.
> > > 
> > > Signed-off-by: Zhiquan Li <zhiquan1.li@...el.com>
> > > Reviewed-by: Kirill A. Shutemov <kirill.shutemov@...ux.intel.com>
> > 
> > Seems this changelog only mentions the problem, but doesn't say how to fix:
> > 
> >   Remove the __ro_after_init annotation of acpi_mp_wake_mailbox to fix.
> 
> Do not talk about *what* the patch is doing in the commit message - that
> should be obvious from the diff itself. Rather, concentrate on the *why*
> it needs to be done.
> 
> Imagine one fine day you're doing git archeology, you find the place in
> the code about which you want to find out why it was changed the way it 
> is now.
> 
> You do git annotate <filename> ... find the line, see the commit id and
> you do:
> 
> git show <commit id>
> 
> You read the commit message and there's just gibberish and nothing's
> explaining *why* that change was done. And you start scratching your
> head, trying to figure out why. Because the damn commit message is worth
> sh*t.

Yeah, fully agree.  Thanks for saying this again.

> 
> This happens to us maintainers at least once a week. Well, I don't want
> that to happen in my tree anymore.
> 
> So none of this text above still doesn't explain to me *why* this is
> happening.
> 
> Why do APs need to update acpi_mp_wake_mailbox?

They don't need to, if acpi_mp_wake_mailbox can be set up once and for all
before smp_init().

But currently the setup of acpi_mp_wake_mailbox is done when the first AP is
brought up because memremap() doesn't work in acpi_parse_mp_wake(), as
mentioned in the changelog of this patch.

I also feel it's not ideal to set up acpi_mp_wake_mailbox when bringing up
the first AP, so I provided my diff.  IIUC, if memremap() works for
acpi_mp_wake_mailbox when bringing up the first AP, then it should also work
in the early_initcall().
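
To make the ordering problem concrete, here is a minimal userspace model of
it.  This is only a sketch, not kernel code: struct sim_state, fake_mailbox
and all sim_*() helpers are hypothetical names invented for illustration, the
"sealed" flag stands in for the point where __ro_after_init data becomes
read-only, and a failed store stands in for the fatal write fault.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct sim_state {
	bool sealed;	/* true once init is over (ro_after_init data locked) */
	void *mailbox;	/* models acpi_mp_wake_mailbox */
	int faults;	/* stores attempted after sealing */
};

static char fake_mailbox[64];	/* models the remapped mailbox memory */

/* Models a store to a variable marked __ro_after_init. */
static bool sim_store_mailbox(struct sim_state *s, void *addr)
{
	assert(s != NULL);
	if (s->sealed) {
		s->faults++;	/* in the kernel this would be a fatal fault */
		return false;
	}
	s->mailbox = addr;
	return true;
}

/* Models the end of init, after which the variable is read-only. */
static void sim_end_of_init(struct sim_state *s)
{
	assert(s != NULL);
	s->sealed = true;
}

/* Current scheme: the mailbox is mapped when the first AP is woken. */
static bool sim_wakeup_ap_lazy(struct sim_state *s)
{
	assert(s != NULL);
	if (!s->mailbox)
		return sim_store_mailbox(s, fake_mailbox);
	return true;	/* already mapped, no store needed */
}

/* Alternative: map once during init, before any AP is brought up. */
static bool sim_early_setup(struct sim_state *s)
{
	return sim_store_mailbox(s, fake_mailbox);
}
```

With maxcpus=1 no AP is woken during init, so in this model
sim_end_of_init() runs first and the first sim_wakeup_ap_lazy() call has to
store to a sealed variable.  If sim_early_setup() runs before
sim_end_of_init(), the later wakeup finds the mailbox already set and never
stores to it.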

> 
> Which patch is this fixing?

It fixes the commit below, AFAICT:

  24dd05da8c79 ("x86/apic: Mark acpi_mp_wake_* variables as __ro_after_init")

which didn't consider the 'maxcpus=xx' case.


But I will leave it to Kirill to confirm.
