lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20081215223310.GX25779@one.firstfloor.org>
Date:	Mon, 15 Dec 2008 23:33:10 +0100
From:	Andi Kleen <andi@...stfloor.org>
To:	Andreas Herrmann <andreas.herrmann3@....com>
Cc:	Andi Kleen <andi@...stfloor.org>, Ingo Molnar <mingo@...e.hu>,
	Thomas Gleixner <tglx@...utronix.de>,
	"H. Peter Anvin" <hpa@...or.com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/2] x86: re-enable MCE on secondary CPUS after suspend/resume

> void __cpuinit mcheck_init(struct cpuinfo_x86 *c)
> {
>         static cpumask_t mce_cpus = CPU_MASK_NONE;
> 
>         mce_cpu_quirks(c);
> 
>         if (mce_dont_init ||
>             cpu_test_and_set(smp_processor_id(), mce_cpus) ||
>             !mce_available(c))
>   =>             return;
> 
>         mce_init(NULL);
>         mce_cpu_features(c);
> }
> 
> But we need to call mce_init to clear all MCE state.
> IMHO the best location to call mce_init for APs is the cpu notifier.

Ah got it. Thanks that makes sense. 

But I think the better fix is to just drop the mce_cpus check and
then handly it naturally in the standard bootup path. I'm not
sure what it was good for anyways. I copied it into the 64bit code
from 32bit, but I suppose even there it isn't really needed and on 
32bit it is already gone even.

How about this patch. Does it fix the problem for you too?

-Andi

--


Don't prevent multiple initialization of MCEs.

Back from early prehistory mcheck_init() has a reentry check. Presumably
that was needed in very old kernels to prevent it entering twice.

But as Andreas points out this prevents CPU hotplug (and therefore resume)
to correctly reinitialize MCEs when a AP boots again after being
offlined. 

Just drop the check.

Based on a report from Andreas Herrmann

Signed-off-by: Andi Kleen <ak@...ux.intel.com>

---
 arch/x86/kernel/cpu/mcheck/mce_64.c |    3 ---
 1 file changed, 3 deletions(-)

Index: linux/arch/x86/kernel/cpu/mcheck/mce_64.c
===================================================================
--- linux.orig/arch/x86/kernel/cpu/mcheck/mce_64.c	2008-12-15 23:13:02.000000000 +0100
+++ linux/arch/x86/kernel/cpu/mcheck/mce_64.c	2008-12-15 23:13:44.000000000 +0100
@@ -510,12 +510,9 @@
  */
 void __cpuinit mcheck_init(struct cpuinfo_x86 *c)
 {
-	static cpumask_t mce_cpus = CPU_MASK_NONE;
-
 	mce_cpu_quirks(c);
 
 	if (mce_dont_init ||
-	    cpu_test_and_set(smp_processor_id(), mce_cpus) ||
 	    !mce_available(c))
 		return;
 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ