Message-ID: <20250901170418.4314-1-kprateek.nayak@amd.com>
Date: Mon, 1 Sep 2025 17:04:14 +0000
From: K Prateek Nayak <kprateek.nayak@....com>
To: Thomas Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>,
	Borislav Petkov <bp@...en8.de>, Dave Hansen <dave.hansen@...ux.intel.com>,
	Sean Christopherson <seanjc@...gle.com>, Paolo Bonzini <pbonzini@...hat.com>,
	Jonathan Corbet <corbet@....net>, <x86@...nel.org>
CC: Naveen rao <naveen.rao@....com>, Sairaj Kodilkar <sarunkod@....com>, "H.
 Peter Anvin" <hpa@...or.com>, "Peter Zijlstra (Intel)"
	<peterz@...radead.org>, "Xin Li (Intel)" <xin@...or.com>, Pawan Gupta
	<pawan.kumar.gupta@...ux.intel.com>, <linux-kernel@...r.kernel.org>,
	<kvm@...r.kernel.org>, Mario Limonciello <mario.limonciello@....com>,
	"Gautham R. Shenoy" <gautham.shenoy@....com>, Babu Moger
	<babu.moger@....com>, Suravee Suthikulpanit <suravee.suthikulpanit@....com>,
	K Prateek Nayak <kprateek.nayak@....com>
Subject: [PATCH v5 0/4]  x86/cpu/topology: Fix the preferred order of initial APIC ID parsing on AMD/Hygon

When running an AMD guest on QEMU with > 255 cores, the following FW_BUG
was noticed with recent kernels when the topoext feature wasn't
explicitly enabled:

    [Firmware Bug]: CPU 512: APIC ID mismatch. CPUID: 0x0000 APIC: 0x0200

QEMU provides the extended topology leaf 0xb for these guests but, in an
effort to keep all the topology parsing bits together during the
enablement of the 0xb leaf for AMD, a pseudo dependency on
X86_FEATURE_TOPOEXT was created which prevents these guests from parsing
the topology from the 0xb leaf.
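
To illustrate why the two values in the FW_BUG message can diverge, here
is a minimal sketch. It relies on the fact that CPUID leaf 0x1 reports
the initial APIC ID in EBX[31:24], an 8-bit field. The helper names are
hypothetical and this is not the kernel's parsing code; it only shows
that an APIC ID of 0x200 cannot survive an 8-bit field.

```c
#include <assert.h>
#include <stdint.h>

/*
 * CPUID leaf 0x1 carries the initial APIC ID in EBX[31:24], so it can
 * only describe IDs in the range 0x00-0xff.
 */
static inline uint32_t leaf1_initial_apicid(uint32_t ebx)
{
	return (ebx >> 24) & 0xff;
}

/*
 * Hypothetical helper: the part of a 32-bit x2APIC ID that fits into an
 * 8-bit field. Anything above 0xff loses its upper bits.
 */
static inline uint32_t apicid_as_8bit(uint32_t x2apic_id)
{
	return x2apic_id & 0xff;
}
```

With an x2APIC ID of 0x0200, apicid_as_8bit() yields 0x00, which is
consistent with the "CPUID: 0x0000 APIC: 0x0200" pair in the message
above.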

The support for CPUID leaf 0xb is independent of the TOPOEXT feature and
is rather linked to the x2APIC enablement. The support for the extended
topology leaves is expected to be confirmed by ensuring:

1. "leaf <= {extended_}cpuid_level" and then
2. Parsing the level 0 of the respective leaf to confirm EBX[15:0]
   (LogProcAtThisLevel) is non-zero

as stated in the definition of "CPUID_Fn0000000B_EAX_x00 [Extended
Topology Enumeration] (Core::X86::Cpuid::ExtTopEnumEax0)" in Processor
Programming Reference (PPR) for AMD Family 19h Model 01h Rev B1 Vol1 [1]
Sec. 2.1.15.1 "CPUID Instruction Functions".
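
The two checks above can be sketched as follows. This is a minimal
illustration, assuming the caller has already executed CPUID and holds
the register values; "struct cpuid_regs" and the function names are
hypothetical and are not kernel types or helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the output of one CPUID invocation. */
struct cpuid_regs {
	uint32_t eax, ebx, ecx, edx;
};

/*
 * Check 1: the leaf must not exceed the maximum leaf reported by CPUID.
 * For standard leaves (e.g. 0xb) pass the maximum standard leaf, for
 * extended leaves (e.g. 0x80000026) pass the maximum extended leaf.
 */
static bool topo_leaf_in_range(uint32_t leaf, uint32_t max_leaf)
{
	return leaf <= max_leaf;
}

/* Check 2: EBX[15:0] (LogProcAtThisLevel) of level 0 must be non-zero. */
static bool topo_level0_valid(const struct cpuid_regs *level0)
{
	return (level0->ebx & 0xffff) != 0;
}

/* Both checks combined; note that no TOPOEXT feature bit is consulted. */
static bool topo_leaf_usable(uint32_t leaf, uint32_t max_leaf,
			     const struct cpuid_regs *level0)
{
	return topo_leaf_in_range(leaf, max_leaf) && topo_level0_valid(level0);
}
```

The second check only looks at the low 16 bits of EBX, so a result with
only upper EBX bits set is still treated as "leaf not implemented".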

On baremetal, this has not been a problem since TOPOEXT support (Fam
0x15 and above) predates the support for CPUID leaf 0xb (Fam 0x17 [Zen2]
and above). However, in a virtualized environment, the support for
x2APIC can be enabled independently of topoext, in which case QEMU
expects the guest to parse the topology and the APICID from CPUID leaf
0xb.

Boris asked why QEMU doesn't force-enable the TOPOEXT feature with
x2APIC [2] and Naveen discovered there were historical reasons to not
enable TOPOEXT by default when using "-cpu host" on AMD systems [3].

The same behavior continues unless an EPYC CPU model is explicitly
passed to QEMU. More details are enclosed in the commit logs.

Ideally, these changes should not affect baremetal AMD/Hygon platforms
as they have supported TOPOEXT long before the support for CPUID leaf
0xb and the extended CPUID leaf 0x80000026 (famous last words).

Patches 2 and 3 are yak shaving to explicitly define a raw MSR value
used in the topology parsing bits and simplify the flow around
"has_topoext" when the same can be discovered using
X86_FEATURE_XTOPOLOGY.

Patch 4 is the documentation patch that outlines the preferred parsing
order of CPUID leaves during topology enumeration on x86 platforms.
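
As a rough illustration of what a preferred parsing order can look like
in code, here is a minimal sketch of a selection helper. The types and
names are hypothetical. The relative order of 0x80000026 and 0xb shown
here is an assumption, taken only from the order in which this letter
lists the XTOPOLOGY leaves; the documentation in Patch 4 is the
authoritative description.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical enumeration of where the topology is taken from. */
enum topo_source {
	TOPO_SRC_LEAF_80000026,	/* extended CPUID leaf 0x80000026 */
	TOPO_SRC_LEAF_0B,	/* CPUID leaf 0xb */
	TOPO_SRC_LEGACY,	/* older leaves, e.g. 0x8000001e */
};

/* Hypothetical summary of which leaves passed their validity checks. */
struct topo_caps {
	bool leaf_80000026_valid;
	bool leaf_0b_valid;
};

/*
 * Prefer the XTOPOLOGY leaves whenever they are valid and only fall
 * back to the legacy leaves otherwise. The decision deliberately does
 * not depend on a TOPOEXT feature bit.
 */
static enum topo_source pick_topo_source(const struct topo_caps *caps)
{
	if (caps->leaf_80000026_valid)
		return TOPO_SRC_LEAF_80000026;
	if (caps->leaf_0b_valid)
		return TOPO_SRC_LEAF_0B;
	return TOPO_SRC_LEGACY;
}
```

A guest that exposes a valid 0xb leaf but no TOPOEXT would select
TOPO_SRC_LEAF_0B in this sketch, which is the situation the cover
letter describes for QEMU guests.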

A previous version of this series was tested on baremetal Zen1
(contains topoext but not the 0xb leaf), Zen3 (contains both topoext and
the 0xb leaf), and Zen4 (contains topoext, the 0xb leaf, and the
0x80000026 leaf) servers with no changes observed in the
"/sys/kernel/debug/x86/topo/" directory.

The series was also tested on 255 and 512 vCPU (each vCPU is an
individual core in the QEMU topology being passed) EPYC-Genoa guests
with and without x2apic and topoext enabled, and this series solves the
FW_BUG seen on guests with > 255 vCPUs. No changes were observed in
"/sys/kernel/debug/x86/topo/" for all the other cases, which did not hit
the warning. The 0xb leaf is provided unconditionally on these guests
(with or without topoext, even with x2apic disabled on guests with
<= 255 vCPUs).

In all the cases initial_apicid matched the apicid in
"/sys/kernel/debug/x86/topo/" after applying this series.

Relevant bits of QEMU cmdline used during testing are as follows:

    qemu-system-x86_64 \
    -enable-kvm -m 32G -smp cpus=512,cores=512 \
    -cpu EPYC-Genoa,x2apic=on,kvm-msi-ext-dest-id=on,+kvm-pv-unhalt,kvm-pv-tlb-flush,kvm-pv-ipi,kvm-pv-sched-yield,[-topoext]  \
    -machine q35,kernel_irqchip=split \
    -global kvm-pit.lost_tick_policy=discard
    ...

References:

[1] https://bugzilla.kernel.org/show_bug.cgi?id=206537
[2] https://lore.kernel.org/lkml/20250819113447.GJaKRhVx6lBPUc6NMz@fat_crate.local/
[3] https://lore.kernel.org/qemu-devel/20180809221852.15285-1-ehabkost@redhat.com/

Series is based on tip:master at commit 4f0d2af9e565 ("Merge branch into
tip/master: 'x86/tdx'")

---
Changelog v4..v5:

o Dropped the patch that was merged.

o Addressed review comments by Boris on Patch 1.

o Included the documentation patch formally.

v4: https://lore.kernel.org/lkml/20250825075732.10694-1-kprateek.nayak@amd.com/

Changelog v3..v4:

o Renamed the series title to better capture the purpose. Based on the
  readout of the APM and PPR, this problem was only exposed by QEMU
  and QEMU is not doing anything wrong considering the spec.

o Fixed references to X86_FEATURE_XTOPOLOGY (XTOPOLOGY) which was
  mistakenly referred to as XTOPOEXT. (Boris)

o Reordered the patches to have the fixes before cleanups. (Thomas)

o Refreshed the diff of Patch 1 with the one Thomas suggested in
  https://lore.kernel.org/lkml/87ms7o3kn6.ffs@tglx/. (Thomas)

o Quoted the relevant sections of the APM and the PPR to support the
  changes. (Mentioned on v3 by Naveen and Boris)

Note: The debate on "CoreId" from CPUID 0x8000001e EBX has not been
addressed yet. I'll check internally and follow up on the QEMU bits once
H/W folks confirm what their strategy is with the 8-bit field in future
processors.

The updates in this series ensure that the topology information from
the XTOPOLOGY leaves (0x80000026 / 0xb) is used when they are present.
Systems that support more than 256 CPUs need x2APIC enabled to address
all the CPUs present, which removes the dependency on CPUID leaf
0x8000001e for Core ID on such systems.

v3: https://lore.kernel.org/lkml/20250818060435.2452-1-kprateek.nayak@amd.com/

Changelog v2..v3:

o Patch 1 was added to the series.
o Use cpu_feature_enabled() in Patch 3.
o Rebased on top of tip:x86/cpu.

v2: https://lore.kernel.org/lkml/20250725110622.59743-1-kprateek.nayak@amd.com/

Changelog v1..v2:

o Collected tags from Naveen. (Thank you for testing!)
o Rebased the series on tip:x86/cpu.
o Swapped Patch 1 and Patch 2 from v1.
o Merged the body of two if blocks in Patch 1 to allow for cleanup in
  Patch 3.

v1: https://lore.kernel.org/lkml/20250612072921.15107-1-kprateek.nayak@amd.com/
---
K Prateek Nayak (4):
  x86/cpu/topology: Always try cpu_parse_topology_ext() on AMD/Hygon
  x86/cpu/topology: Check for X86_FEATURE_XTOPOLOGY instead of passing
    has_xtopology
  x86/msr-index: Define AMD64_CPUID_FN_EXT MSR
  Documentation/x86/topology: Detail CPUID leaves used for topology
    enumeration

 Documentation/arch/x86/topology.rst | 198 ++++++++++++++++++++++++++++
 arch/x86/include/asm/msr-index.h    |   5 +
 arch/x86/kernel/cpu/topology_amd.c  |  39 +++---
 3 files changed, 223 insertions(+), 19 deletions(-)


base-commit: 4f0d2af9e56558e125b321b176b25cd6ad5fdac7
-- 
2.34.1

