Date:   Fri, 17 Dec 2021 00:13:16 +0000
From:   David Woodhouse <dwmw2@...radead.org>
To:     Tom Lendacky <thomas.lendacky@....com>,
        Thomas Gleixner <tglx@...utronix.de>
Cc:     Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        "x86@...nel.org" <x86@...nel.org>,
        "H . Peter Anvin" <hpa@...or.com>,
        Paolo Bonzini <pbonzini@...hat.com>,
        "Paul E . McKenney" <paulmck@...nel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
        "rcu@...r.kernel.org" <rcu@...r.kernel.org>,
        "mimoja@...oja.de" <mimoja@...oja.de>,
        "hewenliang4@...wei.com" <hewenliang4@...wei.com>,
        "hushiyuan@...wei.com" <hushiyuan@...wei.com>,
        "luolongjun@...wei.com" <luolongjun@...wei.com>,
        "hejingxian@...wei.com" <hejingxian@...wei.com>
Subject: Re: [PATCH v3 0/9] Parallel CPU bringup for x86_64

On Thu, 2021-12-16 at 16:52 -0600, Tom Lendacky wrote:
> On baremetal, I haven't seen an issue. This only seems to have a problem 
> with Qemu/KVM.
> 
> With 191f08997577 I could boot without issues with and without the 
> no_parallel_bringup. Only after I applied e78fa57dd642 did the failure happen.
> 
> With e78fa57dd642 I could boot 64 vCPUs pretty consistently, but when I 
> jumped to 128 vCPUs it failed again. When I moved the series to 
> df9726cb7178, then 64 vCPUs also failed pretty consistently.
> 
> Strange thing is it is random. Sometimes (rarely) it works on the first 
> boot and then sometimes it doesn't, at which point it will reset and 
> reboot 3 or 4 times and then make it past the failure and fully boot.

Hm, some of that is just artifacts of timing, I'm sure. But now I'm
staring at the way that early_setup_idt() can run in parallel on all
CPUs, rewriting bringup_idt_descr and loading it.

To start with, let's try unlocking the trampoline_lock much later,
after cpu_init_exception_handling() has loaded the real IDT. 
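
For anyone following along, here is a rough userspace sketch of the
ordering I'm after. It is not kernel code and makes several assumptions:
pthreads stand in for the secondary CPUs, copying a struct stands in for
loading the IDT, and the names shared_descr, bringup_lock, loaded[] and
run_bringup() are made up for illustration. The only point is that the
lock is dropped after the shared descriptor has been consumed, not before.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

#define NR_CPUS 8

/* Hypothetical stand-in for a single descriptor shared by all CPUs. */
struct desc {
	unsigned long address;
	unsigned short size;
};
static struct desc shared_descr;

/* Hypothetical stand-in for the lock taken while a CPU is coming up. */
static atomic_flag bringup_lock = ATOMIC_FLAG_INIT;

/* What each "CPU" ended up loading (stand-in for the real load). */
static struct desc loaded[NR_CPUS];

static void *secondary(void *arg)
{
	unsigned long cpu = (unsigned long)arg;

	/* Spin until we own the lock; yield so a waiter doesn't hog a core. */
	while (atomic_flag_test_and_set_explicit(&bringup_lock,
						 memory_order_acquire))
		sched_yield();

	/* Rewrite the shared descriptor, then consume it. */
	shared_descr.address = 0x1000UL * (cpu + 1);
	shared_descr.size = (unsigned short)cpu;
	loaded[cpu] = shared_descr;

	/*
	 * Late unlock: only now may the next CPU touch shared_descr.
	 * Moving this clear above the two writes reopens the race.
	 */
	atomic_flag_clear_explicit(&bringup_lock, memory_order_release);
	return NULL;
}

/* Bring up all "CPUs" in parallel; return how many loaded a wrong value. */
static int run_bringup(void)
{
	pthread_t threads[NR_CPUS];
	unsigned long cpu;
	int bad = 0;

	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		int rc = pthread_create(&threads[cpu], NULL, secondary,
					(void *)cpu);
		assert(rc == 0);
	}
	for (cpu = 0; cpu < NR_CPUS; cpu++)
		pthread_join(threads[cpu], NULL);

	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		if (loaded[cpu].address != 0x1000UL * (cpu + 1) ||
		    loaded[cpu].size != (unsigned short)cpu)
			bad++;
	}
	return bad;
}
```

Call run_bringup() from a main() of your own and build with -pthread.
With the unlock where it is, every thread loads exactly what it wrote,
so the function returns 0.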

I think we can probably make secondaries load the real IDT early and
never use bringup_idt_descr at all, can't we? But let's see if this
makes it go away, to start with...

diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 0cd6373bc3f2..2307f7575ab4 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -59,7 +59,7 @@
 #include <asm/cpu_device_id.h>
 #include <asm/uv/uv.h>
 #include <asm/sigframe.h>
-
+#include <asm/realmode.h>
 #include "cpu.h"
 
 u32 elf_hwcap2 __read_mostly;
@@ -2060,6 +2060,7 @@ void cpu_init_secondary(void)
 	 * on this CPU in cpu_init_exception_handling().
 	 */
 	cpu_init_exception_handling();
+	clear_bit(0, (unsigned long *)trampoline_lock);
 	cpu_init();
 }
 #endif
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 3e4c3c416bce..db01b56574cd 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -273,14 +273,6 @@ SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL)
 	 */
 	movq initial_stack(%rip), %rsp
 
-	/* Drop the realmode protection. For the boot CPU the pointer is NULL! */
-	movq	trampoline_lock(%rip), %rax
-	testq	%rax, %rax
-	jz	.Lsetup_idt
-	lock
-	btrl	$0, (%rax)
-
-.Lsetup_idt:
 	/* Setup and Load IDT */
 	pushq	%rsi
 	call	early_setup_idt
