Date:   Mon,  4 Dec 2017 11:45:21 -0500
From:   Prarit Bhargava <prarit@...hat.com>
To:     linux-kernel@...r.kernel.org
Cc:     Prarit Bhargava <prarit@...hat.com>, Prarit@...r.kernel.org,
        Jakub Kicinski <kubakici@...pl>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Clark Williams <williams@...hat.com>
Subject: Re: [bisected] x86 boot still broken on -rc2

On 12/04/2017 08:13 AM, Prarit Bhargava wrote:
> 
> 
> x86: Booting SMP configuration:
> .... node  #0, CPUs:        #1  #2  #3  #4
> .... node  #1, CPUs:    #5  #6  #7  #8  #9
> .... node  #0, CPUs:   #10 #11 #12 #13 #14
> .... node  #1, CPUs:   #15 #16 #17 #18 #19
> smp: Brought up 2 nodes, 20 CPUs
> smpboot: Max logical packages: 1
> 
> which means that the calculation of logical packages is wrong because
> 
>       ncpus = cpu_data(0).booted_cores * smp_num_siblings;
>       ncpus = 10 * 2;
>       ncpus = 20;
> 
> smp_num_siblings is defined as "The number of threads in a core" which
> should be 1 if HT/SMT is disabled.
> 
> It looks like my patch has exposed a bug in the
> smp_num_siblings calculation.   I'm still debugging ...

The bug is that, on systems which have a CPUID level greater than 0xb,
smp_num_siblings has been incorrectly calculated as the *maximum* number of
threads in a core rather than the actual number of threads in a core (see
arch/x86/kernel/cpu/topology.c:59).
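
To make the arithmetic concrete, here is a minimal user-space sketch (not
kernel code) of the estimate. The helper name max_logical_packages() and its
parameters are hypothetical and exist only for illustration; DIV_ROUND_UP is
written the way the kernel macro is. It assumes the numbers from the boot log
quoted above: 20 CPU ids and 10 booted cores per package.

```c
/* Round-up integer division, same shape as the kernel's DIV_ROUND_UP. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Hypothetical helper mirroring the estimate in native_smp_cpus_done():
 *   ncpus    = booted_cores * threads_per_core
 *   packages = DIV_ROUND_UP(nr_cpu_ids, ncpus)
 */
static unsigned int max_logical_packages(unsigned int nr_cpu_ids,
					 unsigned int booted_cores,
					 unsigned int threads_per_core)
{
	unsigned int ncpus = booted_cores * threads_per_core;

	return DIV_ROUND_UP(nr_cpu_ids, ncpus);
}
```

With SMT disabled on this box, passing 2 for threads_per_core (what
smp_num_siblings reports) gives max_logical_packages(20, 10, 2) == 1, which
matches the bad "Max logical packages: 1" line.  Passing 1 (the actual
thread count per core) gives max_logical_packages(20, 10, 1) == 2, which is
the expected result for a two-package system.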

A proper fix for that will take some time to investigate.  In the meantime,
the patch below fixes the problem in the short term.  I've tested the patch
with SMT enabled, SMT disabled, maxcpus=1 and nr_cpus=1.

tglx, please revert b4c0a7326f5d ("x86/smpboot: Fix __max_logical_packages
estimate") if you think that is a better option.  The problem with
smp_num_siblings has been around for almost a decade.

P.

---8<---

Subject: [PATCH] arch/x86: Do not use smp_num_siblings in
 __max_logical_packages calculation

Documentation/x86/topology.txt defines smp_num_siblings as "The number of
threads in a core".  Since commit bbb65d2d365e ("x86: use cpuid vector 0xb
when available for detecting cpu topology") smp_num_siblings is the
maximum number of threads in a core.  If Simultaneous MultiThreading
(SMT) is disabled on a system, smp_num_siblings is 2 and not 1 as
expected.

Use topology_max_smt_threads() in the __max_logical_packages calculation.

Signed-off-by: Prarit Bhargava <prarit@...hat.com>
Cc: Jakub Kicinski <kubakici@...pl>
Cc: "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Cc: Thomas Gleixner <tglx@...utronix.de>
Cc: Clark Williams <williams@...hat.com>
---
 arch/x86/kernel/smpboot.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 3d01df7d7cf6..eaee15fb7d8b 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1304,7 +1304,7 @@ void __init native_smp_cpus_done(unsigned int max_cpus)
 	 * Today neither Intel nor AMD support heterogenous systems so
 	 * extrapolate the boot cpu's data to all packages.
 	 */
-	ncpus = cpu_data(0).booted_cores * smp_num_siblings;
+	ncpus = cpu_data(0).booted_cores * topology_max_smt_threads();
 	__max_logical_packages = DIV_ROUND_UP(nr_cpu_ids, ncpus);
 	pr_info("Max logical packages: %u\n", __max_logical_packages);
 
-- 
1.8.3.1
