Message-ID: <f27e4b3f858890c657df9a7d6f34dc2d60b89757.camel@intel.com>
Date:   Fri, 21 Oct 2022 23:00:44 +0800
From:   Zhang Rui <rui.zhang@...el.com>
To:     Feng Tang <feng.tang@...el.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        Dave Hansen <dave.hansen@...el.com>,
        "H . Peter Anvin" <hpa@...or.com>,
        Peter Zijlstra <peterz@...radead.org>, x86@...nel.org,
        linux-kernel@...r.kernel.org
Cc:     tim.c.chen@...el.com, Xiongfeng Wang <wangxiongfeng2@...wei.com>,
        liaoyu15@...wei.com
Subject: Re: [PATCH v1 1/2] x86/tsc: use logical_package as a better
 estimation of socket numbers

On Fri, 2022-10-21 at 14:21 +0800, Feng Tang wrote:
> Commit b50db7095fe0 ("x86/tsc: Disable clocksource watchdog for TSC
> on qualified platorms") was introduced to solve the problem that
> the TSC clocksource is sometimes wrongly judged as unstable by
> watchdogs like 'jiffies', HPET, etc.
> 
> In it, the hardware socket number is a key factor for judging
> whether to disable the watchdog for TSC, and 'nr_online_nodes' was
> chosen as an estimation because it is needed in the early boot
> phase, before the 'tsc-early' clocksource is registered, when the
> non-boot CPUs have not been brought up yet.
> 
> In a recent patch review, Dave Hansen pointed out that there are
> many cases where 'nr_online_nodes' could have issues, like:
> * numa emulation (numa=fake=4 etc.)
> * numa=off
> * platforms with CPU+DRAM nodes, CPU-less HBM nodes, CPU-less
>   persistent memory nodes.
> * SNC (sub-numa cluster) mode is enabled
> 
> Peter Zijlstra suggested to use logical package ids, but it is
> only usable after smp_init() and all CPUs are initialized.
> 
> One solution is to skip the watchdog for the 'tsc-early'
> clocksource and move the check to after smp_init() but before the
> 'tsc' clocksource is registered, where 'logical_packages' can be
> used as a much more accurate socket number.
> 
> Signed-off-by: Feng Tang <feng.tang@...el.com>
> ---
> Hi reviewers,
> 
> I separated the code into 2 patches, as I think they cover 2
> problems and this makes bisecting easier. Feel free to combine them
> into one, as 2/2 is a trivial change.
> 
> Thanks,
> Feng
> 
> Changelog:
>  
>  Since RFC:
>  * use 'logical_packages' instead of topology_max_packages(), whose
>    implementation is not accurate, like for heterogeneous systems
>    which have a combination of Core/Atom CPUs like Alderlake
>    (Dave Hansen)

I checked the history of '__max_logical_packages', and realized that:

1. for topology_max_packages()/'__max_logical_packages', the divisor
   'ncpus' uses cpu_data(0).booted_cores, which is based on the
   *online* CPUs. So when using kernel cmdlines like maxcpus=/nr_cpus=,
   '__max_logical_packages' can get over-estimated.

2. for 'logical_packages', it equals the number of different physical
   Package IDs for all *online* CPUs. So with kernel cmdlines like
   nr_cpus=/maxcpus=, it can get under-estimated.
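
To make the two cases concrete, here is a rough userspace model, not
kernel code. The 2-package x 4-core machine, the struct, and all the
model_* names are made up for illustration. It also assumes no SMT, that
CPUs are enumerated package by package, and that the total CPU count
stays at 8 while only the first 'maxcpus' CPUs come online.

```c
#include <stdbool.h>
#include <stddef.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Hypothetical machine: 2 packages x 4 cores, no SMT (assumed numbers). */
#define MODEL_PACKAGES      2
#define MODEL_CORES_PER_PKG 4
#define MODEL_TOTAL_CPUS    (MODEL_PACKAGES * MODEL_CORES_PER_PKG)

/* Illustrative stand-in for per-CPU topology data, not a kernel struct. */
struct model_cpu {
	unsigned int pkg_id;	/* physical package (socket) ID */
	bool online;		/* brought up by the kernel? */
};

/*
 * Assumes CPUs are enumerated package by package and that only the
 * first 'maxcpus' of them are brought online (maxcpus=/nr_cpus=).
 */
static void model_boot(struct model_cpu cpus[MODEL_TOTAL_CPUS], size_t maxcpus)
{
	for (size_t i = 0; i < MODEL_TOTAL_CPUS; i++) {
		cpus[i].pkg_id = (unsigned int)(i / MODEL_CORES_PER_PKG);
		cpus[i].online = i < maxcpus;
	}
}

/* Models 'logical_packages': distinct package IDs among online CPUs. */
static unsigned int model_logical_packages(const struct model_cpu *cpus)
{
	bool seen[MODEL_PACKAGES] = { false };
	unsigned int count = 0;

	for (size_t i = 0; i < MODEL_TOTAL_CPUS; i++) {
		if (cpus[i].online && !seen[cpus[i].pkg_id]) {
			seen[cpus[i].pkg_id] = true;
			count++;
		}
	}
	return count;
}

/*
 * Models '__max_logical_packages': total CPUs divided by the number of
 * online cores in the boot CPU's package (stand-in for booted_cores).
 */
static unsigned int model_max_logical_packages(const struct model_cpu *cpus)
{
	unsigned int ncpus = 0;

	for (size_t i = 0; i < MODEL_TOTAL_CPUS; i++)
		if (cpus[i].online && cpus[i].pkg_id == cpus[0].pkg_id)
			ncpus++;

	return ncpus ? DIV_ROUND_UP(MODEL_TOTAL_CPUS, ncpus) : 0;
}
```

In this model, model_boot(cpus, 2) leaves only two cores of package 0
online, so the 'logical_packages' estimate is 1 and the
'__max_logical_packages' estimate is 4, while the machine really has 2
packages. With all 8 CPUs online, both estimates give 2.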

BTW, I also checked CPUID.B/1F, which can tell a fixed number of CPUs
within a package. But we don't have a fixed number of total CPUs from
hardware.
On my Dell laptop, BIOS allows me to disable/enable one or several
cores. When this happens, the 'total_cpus' changes, but CPUID.B/1F does
not change. So I don't think CPUID.B/1F can be used to optimize the
'__max_logical_packages' calculation.

I'm not sure if we have a perfect solution here.

thanks,
rui





