lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <20230327191026.3454-1-eric.devolder@oracle.com>
Date:   Mon, 27 Mar 2023 15:10:25 -0400
From:   Eric DeVolder <eric.devolder@...cle.com>
To:     rafael@...nel.org, lenb@...nel.org, tglx@...utronix.de,
        mingo@...hat.com, bp@...en8.de, dave.hansen@...ux.intel.com,
        x86@...nel.org, hpa@...or.com, linux-acpi@...r.kernel.org,
        linux-kernel@...r.kernel.org, mario.limonciello@....com,
        kvijayab@....com
Cc:     miguel.luis@...cle.com, boris.ostrovsky@...cle.com,
        eric.devolder@...cle.com
Subject: [PATCH 0/1] x86/acpi: acpi_is_processor_usable() dropping possible cpus

I've been working on a patch series for cpu hotplug and kdump and
noticed recently that utilizing a QEMU guest to test cpu hotplug was
causing the following error messages:

    APIC: NR_CPUS/possible_cpus limit of 30 reached. Processor 30/0x.
    ACPI: Unable to map lapic to logical cpu number
    acpi LNXCPU:1e: Enumeration failure

where prior cpu hotplug would typically result in messages similar to:

    CPU30 has been hot-added
    smpboot: Booting Node 0 Processor 30 APIC 0x1e
    Will online and init hotplugged CPU: 30

The QEMU guest incantation was:

    qemu-system-x86_64 -smp 30,maxcpus=32 ...

A simple guest with 30 cpus and possiblly up to 32, meaning two hot
pluggable.

An investigation lead to finding the following relevant changes:

   commit aa06e20f1be6 ("x86/acpi: Don't add CPUs that are not online capable")
   commit e2869bd7af60 ("x86/acpi/boot: Do not register processors that cannot be onlined for x2APIC")

The first patch codes the fact that ACPI 6.3 now features an Online
Capable bit in MADT.revision >= 5. This Online Capable bit explicitly
indicates that the cpu can be hotplugged.

The second patch refactors the first a bit in order to apply the same
logic for both lapic and x2apic. However, there is a subtle change in
the logic to this patch that breaks for MADT.revision prior to 5.

As it turns out, QEMU reports revision 1 in the MADT table for x86.
(Technically QEMU is at revision 4, and I'll be addressing that with
the QEMU folks soon.)

In looking at these patches and understanding the logic, I can sum it
up by stating that prior to MADT revision 5, the presence of a
lapic/x2apic structure _implicitly_ was used to determine that a cpu
might be hotpluggable, and it would count towards possible cpus. With
MADT.revision >= 5, that implicit assumption is now replaced with an
explicit indication.

The manner in which the latest patch is coded, it essentially requires
the Online Capable bit introduced at MADT.revision >= 5 be set for
present-but-offline cpus, else it excludes the cpu from the possible
cpus. Thus for MADT.revision < 5, possible cpus are being dropped.

This patch aims to restore the behavior for MADT.revision prior to 5
which facilitates again cpu hotplug for QEMU guests.

Regards,
eric
---


Eric DeVolder (1):
  x86/acpi: acpi_is_processor_usable() dropping possible cpus

 arch/x86/kernel/acpi/boot.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

-- 
2.31.1

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ