lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <1437038237-16741-1-git-send-email-joro@8bytes.org>
Date:	Thu, 16 Jul 2015 11:17:17 +0200
From:	Joerg Roedel <joro@...tes.org>
To:	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...hat.com>,
	"H. Peter Anvin" <hpa@...or.com>
Cc:	x86@...nel.org, Borislav Petkov <bp@...en8.de>,
	linux-kernel@...r.kernel.org, Joerg Roedel <jroedel@...e.de>
Subject: [PATCH] x86/smpboot: Check for cpu_active on cpu initialization

From: Joerg Roedel <jroedel@...e.de>

Currently the code to bring up secondary CPUs only checks
for cpu_online before it proceeds with launching the per-cpu
threads for the freshly booted remote CPU.

But the code to move these threads to the new CPU checks for
cpu_active to do so. If this check fails the threads end up
on the wrong CPU, causing warnings and bugs like:

	WARNING: CPU: 0 PID: 1 at ../kernel/workqueue.c:4417 workqueue_cpu_up_callback

and/or:

	kernel BUG at ../kernel/smpboot.c:135!

The reason is that the cpu_active bit for the new CPU
becomes visible significantly later than the cpu_online bit.
The reasons could be that the kernel runs in a KVM guest,
where the vCPU thread gets preempted when the cpu_online bit
is set, but with cpu_active still clear.

But this could also happen on bare-metal systems with lots
of CPUs. We have observed this issue on an 88 core x86
system on bare-metal.

To fix this issue, wait before the remote CPU is online
*and* active before launching the per-cpu threads.

Signed-off-by: Joerg Roedel <jroedel@...e.de>
---
 arch/x86/kernel/smpboot.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index d3010aa..30b7b8b 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1006,7 +1006,7 @@ int native_cpu_up(unsigned int cpu, struct task_struct *tidle)
 	check_tsc_sync_source(cpu);
 	local_irq_restore(flags);
 
-	while (!cpu_online(cpu)) {
+	while (!cpu_online(cpu) || !cpu_active(cpu)) {
 		cpu_relax();
 		touch_nmi_watchdog();
 	}
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ