[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Fri, 12 Dec 2008 20:06:21 +0100
From: Andi Kleen <andi@...stfloor.org>
To: Andreas Herrmann <andreas.herrmann3@....com>
Cc: Ingo Molnar <mingo@...e.hu>, Thomas Gleixner <tglx@...utronix.de>,
"H. Peter Anvin" <hpa@...or.com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/2] x86: re-enable MCE on secondary CPUS after suspend/resume
Andreas Herrmann <andreas.herrmann3@....com> writes:
> Impact: fix suspend/resume bug with MCE
>
> After suspend/resume MCx_CTL registers of secondary CPUs are cleared.
> (At least that's what I've observed on several systems.)
> Linux currently only re-initializes MCE on the boot CPU - see mce_resume().
> Thus after suspend/resume we end up with a system where MCE is active
> on the boot CPU but switched off on all other CPUs.
>
> By calling mce_init() whenever a CPU comes online this problem is
> solved.
Can you double check that please?
Suspend/resume are supposted to hotunplug all CPUs except the BP and
then re-online them on resume (with "disable_nonboot_cpus()) . The
re-online initializes MCEs in the standard CPU bootup path.
A good way is to stick a WARN_ON(num_online_cpus() > 1) into
mce_suspend(). I had that here for some time and didn't see
it trigger.
I got a couple of suspend bug fixes in my mce improvement tree, see:
http://git.kernel.org/?p=linux/kernel/git/mingo/linux-2.6-x86.git;a=history;f=arch/x86/kernel/cpu/mcheck/mce_64.c;h=9512a7eab4e7b03a584f5bb647bd242bd4c003dc;hb=x86/mce
During review it was decided to all defer it to .29 though.
-Andi
--
ak@...ux.intel.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists