linux-kernel - Re: [Patch v2 2/2] x86/tsc: use logical_packages as a better estimation of socket numbers

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [thread-next>] [day] [month] [year] [list]

Message-ID: <ZJhUiO+bdBoLU5WF@feng-clx>
Date:   Sun, 25 Jun 2023 22:51:52 +0800
From:   Feng Tang <feng.tang@...el.com>
To:     Thomas Gleixner <tglx@...utronix.de>
CC:     Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        "H . Peter Anvin" <hpa@...or.com>,
        David Woodhouse <dwmw@...zon.co.uk>,
        "Paul E . McKenney" <paulmck@...nel.org>, <x86@...nel.org>,
        <linux-kernel@...r.kernel.org>, <rui.zhang@...el.com>,
        <tim.c.chen@...el.com>
Subject: Re: [Patch v2 2/2] x86/tsc: use logical_packages as a better
 estimation of socket numbers

On Fri, Jun 23, 2023 at 11:04:34PM +0800, Feng Tang wrote:
> Hi Thomas,
> 
> On Fri, Jun 23, 2023 at 01:07:24AM +0200, Thomas Gleixner wrote:
> > On Thu, Jun 22 2023 at 16:27, Thomas Gleixner wrote:
> > > On Fri, Jun 16 2023 at 15:18, Feng Tang wrote:
> > > So something like the below should just work.
> > 
> > Well it works in principle, but does not take any of the command line
> > parameters which limit nr_possible CPUs or the actual kernel
> > configuration into account. But the principle itself works correctly.
> > 
> > Below is an updated version, which takes them into account.
> > 
> > The data here is from a two socket system with 32 CPUs per socket.
> > 
> > No command line parameters (NR_CPUS=64):
> > 
> >  smpboot: Allowing 64 CPUs, 32 hotplug CPUs
> >  clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0x1e3306b9ada, max_idle_ns: 440795224413 ns
> >  smp: Brought up 1 node, 32 CPUs
> >  smpboot: Max logical packages ACPI enumeration: 2
> > 
> > "possible_cpus=32" (NR_CPUS=64) or
> > No command line parameter (NR_CPUS=32):
> > 
> >  smpboot: Allowing 32 CPUs, 0 hotplug CPUs
> >  clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0x1e3306b9ada, max_idle_ns: 440795224413 ns
> >  smp: Brought up 1 node, 32 CPUs
> >  smpboot: Max logical packages ACPI enumeration: 1
> > 
> > maxcpus=32
> >  smpboot: Allowing 64 CPUs, 0 hotplug CPUs
> >  clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0x1e3306b9ada, max_idle_ns: 440795224413 ns
> >  smp: Brought up 1 node, 32 CPUs
> >  smpboot: Max logical packages ACPI enumeration: 2
> > 
> > But that's really all we should do. If the ACPI table enumerates CPUs as
> > hotpluggable which can never arrive, then so be it.
> > 
> > We have enough parameters to override the BIOS nonsense. Trying to do
> > more magic MAD table parsing with heuristics is just wrong.
> > 
> > We already have way too many heuristics and workarounds for broken
> > firmware, but for the problem at hand, we really don't need more.
> > 
> > The only systems I observed so far which have a non-sensical amount of
> > "hotpluggable" CPUs are high-end server machines. It's a resonable
> > expectation that machines with high-end price tags come with correct
> > firmware. Trying to work around that (except with the existing command
> > line options) is just proliferating this mess. This has to stop.
> > 
> > Thanks,
> > 
> >         tglx
> 
> Thanks for helping on this.
> 
> I run some tests with your patch againt latest kernel, and found with
> some "maxcpus=" setup, the kernel will soft hung, that it will print
> some hung/stall message from time to time.
> 
> My test machine is Cascacade Lake AP, 2 packages (4 NUMA nodes), 96C
> and 192T. The cmdline is "maxcpus=24", and 24 is the number of core
> per NUMA node. Don't know if you can reproduce it with "maxcpus=16"
> on your test box.
> 
> The box is in remote lab and I don't have serial console, but a remote
> console, and I took 2 pictures of the error message (attched).
> 
> Also I will check more on how to debug on this remote machine.
 
[ Above mail was auto-rejected by many mail servers  due to the big size
 of the pictures ]

>From debug, the reason of the hung/stall is detect_extended_topology_early()
is called by cpu hotplug after boot, where there is "maxcpus=XXX" setting,
(#echo 1 > /sys/devices/system/cpu/cpuX/online).

It could be fixed with below patch:
----------------------------------------------------------------
diff --git a/arch/x86/kernel/cpu/topology.c b/arch/x86/kernel/cpu/topology.c
index 828c1f7edac1..1ff73c8c4972 100644
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -29,7 +29,7 @@ unsigned int __max_die_per_package __read_mostly = 1;
 EXPORT_SYMBOL(__max_die_per_package);
 
 #ifdef CONFIG_SMP
-unsigned int apic_to_pkg_shift __ro_after_init;
+unsigned int apic_to_pkg_shift;
 
 /*
  * Check if given CPUID extended topology "leaf" is implemented

----------------------------------------------------------------

I also tested 'numa=off' and 'numa=fake=8' cmdline parameter on one
2 package Cascad Lake SP and one 2 package (4 NUMA nodes) Cascade
Lake AP, and the code works fine by giving the _correct_ estimation:
  
  "smpboot: Max logical packages ACPI enumeration: 2"

Thanks,
Feng

> Thanks,
> Feng