[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <80425222-4441-417e-a53e-1aa692b931d7@redhat.com>
Date: Tue, 26 Mar 2024 23:19:38 -0400
From: Waiman Long <longman@...hat.com>
To: Feng Tang <feng.tang@...el.com>, Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...el.com>, "H . Peter Anvin" <hpa@...or.com>,
Peter Zijlstra <peterz@...radead.org>, x86@...nel.org, paulmck@...nel.org,
rui.zhang@...el.com, linux-kernel@...r.kernel.org
Cc: Dave Hansen <dave.hansen@...ux.intel.com>
Subject: Re: [PATCH v3] x86/tsc: Use topology_max_packages() to get package
number
On 3/26/24 22:51, Feng Tang wrote:
> Commit b50db7095fe0 ("x86/tsc: Disable clocksource watchdog for TSC
> on qualified platorms") was introduced to solve problem that
> sometimes TSC clocksource is wrongly judged as unstable by watchdog
> like 'jiffies', HPET, etc.
>
> In it, the hardware package number is a key factor for judging whether
> to disable the watchdog for TSC, and 'nr_online_nodes' was chosen due
> to it is available in early boot phase before registering 'tsc-early'
> clocksource, where all non-boot CPUs are not brought up yet.
>
> Dave and Rui pointed out there are many cases in which 'nr_online_nodes'
> is cheated and not accurate, like:
>
> * numa emulation (numa=fake=8 etc.)
> * numa=off
> * platforms with CPU-less HBM nodes, CPU-less Optane memory nodes.
> * SNC (sub-numa cluster) mode enabled
> * 'maxcpus=' cmdline setup, where chopped CPUs could be onlined later
> * 'nr_cpus=', 'possible_cpus=' cmdline setup, where chopped CPUs can
> not be onlined after boot
>
> Thomas' recent patchset of refactoring x86 topology code improves
> topology_max_packages(), by making it more accurate and available in
> early boot phase, which works well in most of the above cases.
>
> The only exceptions are 'nr_cpus=' and 'possible_cpus=' setup. And
> the reason is, during topology setup, the boot CPU iterates through
> all enumerated APIC ids and either accepts or rejects the APIC id.
> For accepted ids, it figures out which bits of the id map to the
> package number. It tracks which package numbers have been seen in a
> bitmap. topology_max_packages() just returns the number of bits set
> in that bitmap.
>
> 'nr_cpus=' and 'possible_cpus=' can cause more APIC ids to be rejected
> and can artificially lower the number of bits in the package bitmap
> and thus topology_max_packages(). This means that, for example, a
> system with 8 physical packages might reject all the CPUs on 6 of those
> packages and be left with only 2 packages and 2 bits set in the package
> bitmap. It needs the TSC watchdog, but would disable it anyway. This
> isn't ideal, but this only happens for debug-oriented options. This is
> fixable by tracking the package numbers for rejected CPUs. But it's
> not worth the trouble for debugging.
>
> So use topology_max_packages() to replace nr_online_nodes().
>
> Reported-by: Dave Hansen <dave.hansen@...ux.intel.com>
> Closes: https://lore.kernel.org/lkml/a4860054-0f16-6513-f121-501048431086@intel.com/
> Signed-off-by: Feng Tang <feng.tang@...el.com>
> ---
>
> Hi all,
>
> For warning about possible compromise due to 'nr_cpus=' and 'possible_cpus=',
> one alternative is to check whether these has been setup in cmdline inside
> tsc.c and warn there.
>
> Changelog:
>
> Since v2:
> * Use 'pr_info' to replace 'pr_warn' which could panic system
> if 'panic_on_warn=1' kcmdline parameter is on (Waiman)
>
> Since v1:
> * Use Dave's detailed elaboration about 'nr_cpus=', 'possible_cpus='
> possibly compromising '__max_logical_packages' in commit log
> * Fix typos and inaccuracy (Rui and Longman)
>
>
> arch/x86/kernel/cpu/topology.c | 5 ++++-
> arch/x86/kernel/tsc.c | 7 ++-----
> 2 files changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/topology.c b/arch/x86/kernel/cpu/topology.c
> index 3259b1d4fefe..9db66597388e 100644
> --- a/arch/x86/kernel/cpu/topology.c
> +++ b/arch/x86/kernel/cpu/topology.c
> @@ -460,8 +460,11 @@ void __init topology_init_possible_cpus(void)
> pr_info("Num. threads per package: %3u\n", __num_threads_per_package);
>
> pr_info("Allowing %u present CPUs plus %u hotplug CPUs\n", assigned, disabled);
> - if (topo_info.nr_rejected_cpus)
> + if (topo_info.nr_rejected_cpus) {
> pr_info("Rejected CPUs %u\n", topo_info.nr_rejected_cpus);
> + if (__max_logical_packages <= 4)
> + pr_info("TSC might be buggered due to the rejected CPUs\n");
> + }
>
> init_cpu_present(cpumask_of(0));
> init_cpu_possible(cpumask_of(0));
> diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
> index 5a69a49acc96..d00f88f16eb1 100644
> --- a/arch/x86/kernel/tsc.c
> +++ b/arch/x86/kernel/tsc.c
> @@ -1252,15 +1252,12 @@ static void __init check_system_tsc_reliable(void)
> * - TSC which does not stop in C-States
> * - the TSC_ADJUST register which allows to detect even minimal
> * modifications
> - * - not more than two sockets. As the number of sockets cannot be
> - * evaluated at the early boot stage where this has to be
> - * invoked, check the number of online memory nodes as a
> - * fallback solution which is an reasonable estimate.
> + * - not more than four packages
> */
> if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC) &&
> boot_cpu_has(X86_FEATURE_NONSTOP_TSC) &&
> boot_cpu_has(X86_FEATURE_TSC_ADJUST) &&
> - nr_online_nodes <= 4)
> + topology_max_packages() <= 4)
> tsc_disable_clocksource_watchdog();
> }
>
Reviewed-by: Waiman Long <longman@...hat.com>
Powered by blists - more mailing lists