lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20161121160653.s4i3nua46rtpvj5l@pd.tnic>
Date:   Mon, 21 Nov 2016 17:06:54 +0100
From:   Borislav Petkov <bp@...en8.de>
To:     Peter Zijlstra <peterz@...radead.org>,
        Thomas Gleixner <tglx@...utronix.de>
Cc:     Steven Rostedt <rostedt@...dmis.org>, Jiri Olsa <jolsa@...hat.com>,
        "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
        linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...nel.org>,
        Josh Triplett <josh@...htriplett.org>,
        Andi Kleen <andi@...stfloor.org>,
        Jan Stancek <jstancek@...hat.com>
Subject: Re: [BUG] msr-trace.h:42 suspicious rcu_dereference_check() usage!

+ tglx.

On Mon, Nov 21, 2016 at 04:41:04PM +0100, Peter Zijlstra wrote:
> Right, I just wondered about the !c1e present case. In that case we'll
> forever read the msr in the hope of finding that bit set, and we never
> will.

Well, erratum 400 flag is set on everything F10h from model 0x2 onwards,
which is basically everything that's still out there. And we set it for
most of the K8s too. So I don't think there's an affected machine out
there where we don't enable that erratum for so I'd assume it is turned
on everywhere.

IOW, what's the worst thing that can happen if we did this below?

We basically get rid of the detection and switch the timer to broadcast
mode immediately on the halting CPU.

amd_e400_idle() is behind an "if (cpu_has_bug(c, X86_BUG_AMD_APIC_C1E))"
check so it will run on the affected CPUs only...

Thoughts?

---
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 0888a879120f..7ab9d0254b4d 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -354,41 +354,30 @@ void amd_e400_remove_cpu(int cpu)
  */
 static void amd_e400_idle(void)
 {
-	if (!amd_e400_c1e_detected) {
-		u32 lo, hi;
+	int cpu = smp_processor_id();
 
-		rdmsr(MSR_K8_INT_PENDING_MSG, lo, hi);
+	if (!static_cpu_has(X86_FEATURE_NONSTOP_TSC))
+		mark_tsc_unstable("TSC halt in AMD C1E");
 
-		if (lo & K8_INTP_C1E_ACTIVE_MASK) {
-			amd_e400_c1e_detected = true;
-			if (!boot_cpu_has(X86_FEATURE_NONSTOP_TSC))
-				mark_tsc_unstable("TSC halt in AMD C1E");
-			pr_info("System has AMD C1E enabled\n");
-		}
-	}
-
-	if (amd_e400_c1e_detected) {
-		int cpu = smp_processor_id();
+	pr_info("System has AMD C1E enabled\n");
 
-		if (!cpumask_test_cpu(cpu, amd_e400_c1e_mask)) {
-			cpumask_set_cpu(cpu, amd_e400_c1e_mask);
-			/* Force broadcast so ACPI can not interfere. */
-			tick_broadcast_force();
-			pr_info("Switch to broadcast mode on CPU%d\n", cpu);
-		}
-		tick_broadcast_enter();
+	if (!cpumask_test_cpu(cpu, amd_e400_c1e_mask)) {
+		cpumask_set_cpu(cpu, amd_e400_c1e_mask);
+		/* Force broadcast so ACPI can not interfere. */
+		tick_broadcast_force();
+		pr_info("Switch to broadcast mode on CPU%d\n", cpu);
+	}
+	tick_broadcast_enter();
 
-		default_idle();
+	default_idle();
 
-		/*
-		 * The switch back from broadcast mode needs to be
-		 * called with interrupts disabled.
-		 */
-		local_irq_disable();
-		tick_broadcast_exit();
-		local_irq_enable();
-	} else
-		default_idle();
+	/*
+	 * The switch back from broadcast mode needs to be
+	 * called with interrupts disabled.
+	 */
+	local_irq_disable();
+	tick_broadcast_exit();
+	local_irq_enable();
 }
 
 /*

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ