Message-ID: <BYAPR21MB168885D3FB297CA82AB70C8BD711A@BYAPR21MB1688.namprd21.prod.outlook.com>
Date:   Sat, 12 Aug 2023 13:51:50 +0000
From:   "Michael Kelley (LINUX)" <mikelley@...rosoft.com>
To:     Thomas Gleixner <tglx@...utronix.de>,
        LKML <linux-kernel@...r.kernel.org>
CC:     "x86@...nel.org" <x86@...nel.org>,
        Tom Lendacky <thomas.lendacky@....com>,
        Andrew Cooper <andrew.cooper3@...rix.com>,
        Arjan van de Ven <arjan@...ux.intel.com>,
        Huang Rui <ray.huang@....com>, Juergen Gross <jgross@...e.com>,
        Dimitri Sivanich <dimitri.sivanich@....com>,
        Sohil Mehta <sohil.mehta@...el.com>,
        K Prateek Nayak <kprateek.nayak@....com>,
        Kan Liang <kan.liang@...ux.intel.com>,
        Zhang Rui <rui.zhang@...el.com>,
        "Paul E. McKenney" <paulmck@...nel.org>,
        Feng Tang <feng.tang@...el.com>,
        Andy Shevchenko <andy@...radead.org>
Subject: RE: [patch 00/53] x86/topology: The final installment

From: Thomas Gleixner <tglx@...utronix.de> Sent: Monday, August 7, 2023 6:53 AM
> 
> Hi!
> 
> This is the (for now) last part of reworking topology enumeration and
> management. It's based on the APIC and CPUID rework series which can be
> found here:
> 
> https://lore.kernel.org/lkml/20230802101635.459108805@linutronix.de/
> 
> With these preparatory changes in place, it's now possible to address the
> real issues of the current topology code:
> 
>   - Wrong core count on hybrid systems
> 
>   - Heuristics based size information for packages and dies which
>     are failing to work correctly with certain command line parameters.
> 
>   - Full evaluation failure on a theoretical hybrid system which boots
>     from an E-core
> 
>   - The complete insanity of manipulating global data from firmware parsers
>     or the XEN/PV fake SMP enumeration. The latter is really a piece of art.
> 
> This series addresses this by
> 
>   - Mopping up some more historical technical debt
> 
>   - Consolidating all topology relevant functionality into one place
> 
>   - Providing separate interfaces for boot time and ACPI hotplug operations
> 
>   - A sane ordering of command line options and restrictions
> 
>   - A sensible way to handle the BSP problem in kdump kernels instead of
>     the unreliable command line option.
> 
>   - Confinement of topology relevant variables by replacing the XEN/PV SMP
>     enumeration fake with something halfway sensible.
> 
>   - Evaluation of sizes by analysing the topology via the CPUID provided
>     APIC ID segmentation and the actual APIC IDs which are registered at
>     boot time.
> 
>   - Removal of heuristics and broken size calculations
> 
> The idea behind this is the following:
> 
> The APIC IDs describe the system topology in multiple domain levels. The
> CPUID topology parser provides the information which part of the APIC ID is
> associated to the individual levels (Intel terminology):
> 
>    [ROOT][PACKAGE][DIE][TILE][MODULE][CORE][THREAD]
> 
> The root space contains the package (socket) IDs. Not enumerated levels
> consume 0 bits space, but conceptually they are always represented. If
> e.g. only CORE and THREAD levels are enumerated then the DIE, MODULE and
> TILE have the same physical ID as the PACKAGE.
> 
> If SMT is not supported, then the THREAD domain is still used. It then
> has the same physical ID as the CORE domain and is the only child of
> the core domain.
> 
> This allows a unified view of the system independent of the enumerated
> domain levels without requiring any conditionals in the code.
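
[Illustration: the segmentation described above can be sketched in a few
lines of Python. This is not the kernel code from the series. The function
name, the width table and the example APIC IDs are all invented for
illustration, and it assumes that a domain's physical ID is the APIC ID with
the bits of all lower levels cleared.]

```python
# Domain levels from innermost to outermost, following the quoted layout.
LEVELS = ["THREAD", "CORE", "MODULE", "TILE", "DIE", "PACKAGE"]

def domain_ids(apic_id, widths):
    """Return the physical ID of every domain containing this APIC ID.

    widths maps a level name to the number of APIC ID bits it occupies.
    Levels missing from the map are not enumerated and consume 0 bits.
    (Hypothetical helper, not an existing kernel or library interface.)
    """
    ids = {}
    shift = 0  # bits consumed by all levels below the current one
    for level in LEVELS:
        # Clear the bits that belong to the lower levels.
        ids[level] = (apic_id >> shift) << shift
        shift += widths.get(level, 0)
    return ids

# Only THREAD (1 bit) and CORE (3 bits) are enumerated.
smt = domain_ids(0b10101, {"THREAD": 1, "CORE": 3})

# No SMT: the THREAD level has 0 bits and shares its ID with the CORE.
no_smt = domain_ids(0b101, {"CORE": 3})
```

[In the first case MODULE, TILE and DIE end up with the same ID as the
PACKAGE, and in the second case THREAD and CORE share an ID. Both results
come out of the same loop without a special case for missing levels.]
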
> 
> AMD does only expose 4 domain levels with obviously different terminology,
> but that can be easily mapped into the Intel variant with a trivial lookup
> table added to the CPUID parser.
> 
> The resulting topology information of an ADL hybrid system with 8 P-Cores
> and 8 E-Cores looks like this:
> 
>  CPU topo: Max. logical packages:   1
>  CPU topo: Max. logical dies:       1
>  CPU topo: Max. dies per package:   1
>  CPU topo: Max. threads per core:   2
>  CPU topo: Num. cores per package:    16
>  CPU topo: Num. threads per package:  24
>  CPU topo: Allowing 24 present CPUs plus 0 hotplug CPUs
>  CPU topo: Thread    :    24
>  CPU topo: Core      :    16
>  CPU topo: Module    :     1
>  CPU topo: Tile      :     1
>  CPU topo: Die       :     1
>  CPU topo: Package   :     1
> 
> This is happening on the boot CPU before any of the APs is started and
> provides correct size information right from the start.
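
[Illustration: a rough sketch of how such counts could be derived purely
from the registered APIC IDs and the per-level bit widths. The APIC ID
layout below is invented for the example and is not the real enumeration of
an ADL system. The function name and the widths are hypothetical as well.]

```python
LEVELS = ["Thread", "Core", "Module", "Tile", "Die", "Package"]

def count_domains(apic_ids, widths):
    """Count the distinct domains at each level for a set of APIC IDs.

    Hypothetical helper: two APIC IDs are in the same domain at a given
    level when they agree in all bits above the lower levels.
    """
    counts = {}
    shift = 0  # bits consumed by all levels below the current one
    for level in LEVELS:
        counts[level] = len({apic_id >> shift for apic_id in apic_ids})
        shift += widths.get(level, 0)
    return counts

# Invented layout: 8 P-cores with two threads each, 8 E-cores with one.
p_cores = [core * 2 + thread for core in range(8) for thread in range(2)]
e_cores = [core * 2 for core in range(8, 16)]

counts = count_domains(p_cores + e_cores, {"Thread": 1, "Core": 6})
for level in LEVELS:
    print(f"CPU topo: {level:<10}: {counts[level]:5d}")
```

[With this made-up layout the per-level counts match the Thread to Package
lines of the quoted output: 24 threads, 16 cores, and one each of the
remaining levels.]
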
> 
> Even the XEN/PV trainwreck makes use of this now. On Dom0 it utilizes the
> MADT and on DomU it provides fake APIC IDs, which combined with the
> provided CPUID information make it at least look halfway realistic instead
> of claiming to have one CPU per package as the current upstream code does.
> 
> This is solely addressing the core topology issues, but there is a plan for
> further consolidation of other topology related information into one single
> source of information instead of having a gazillion of localized special
> parsers and representations all over the place. There are quite some other
> things which can be simplified on top of this, like updating the various
> cpumasks during CPU bringup, but that's all left for later.
> 
> So another 53 patches later, the resulting diffstat is:
> 
>    64 files changed, 830 insertions(+), 955 deletions(-)
> 
> and the combo diffstat of all three series combined:
> 
>   115 files changed, 2414 insertions(+), 3035 deletions(-)
> 
> The current series applies on top of
> 
>    git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git topo-cpuid-v3
> 
> and is available from git here:
> 
>    git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git topo-full-v1
> 
> Thanks,
> 
> 	tglx

Tested the full series on Hyper-V VMs on Intel and AMD Zen processors.
Tested with hyper-threading enabled and disabled, and with a variety of
NUMA and L3 cache configurations.  All looks good, modulo the known
issue with Hyper-V providing incorrect APIC IDs in some NUMA configs,
but this patch series did not make that problem any worse.

Tested-by: Michael Kelley <mikelley@...rosoft.com>
