Message-ID: <200808131209.57534.mark.langsdorf@amd.com>
Date: Wed, 13 Aug 2008 12:09:57 -0500
From: Mark Langsdorf <mark.langsdorf@....com>
To: Ingo Molnar <mingo@...e.hu>
CC: linux-kernel@...r.kernel.org,
Linus Torvalds <torvalds@...ux-foundation.org>,
"H. Peter Anvin" <hpa@...or.com>,
Thomas Gleixner <tglx@...utronix.de>
Subject: Re: invalidate caches before going into suspend
On Wednesday 13 August 2008, Ingo Molnar wrote:
>
> * Mark Langsdorf <mark.langsdorf@....com> wrote:
>
> > When a CPU core is shut down, all of its caches need to be flushed to
> > prevent stale data from causing errors if the core is resumed. Current
> > Linux suspend code performs an assignment after the flush, which can
> > add dirty data back to the cache. On some AMD platforms, additional
> > speculative reads have caused crashes on resume because of this dirty
> > data.
> >
> > Relocate the cache flush to be the very last thing done before
> > halting.
>
> nice catch! Applied to x86/urgent.
>
> I'm really curious: how did you find this bug? Did you see a CPU come up
> as !CPU_DEAD?
AMD's diagnostic code for new CPUs was hanging when coming out of suspend,
so I presume it was hitting a bug check for !CPU_DEAD. I got the
debug lab reports secondhand. They traced the root cause to dirty data
being preserved in the cache and suggested relocating the wbinvd().
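
For illustration only, the ordering problem can be sketched with a toy
user-space model. This is not kernel code: struct toy_cache, toy_wbinvd()
and toy_store() are hypothetical stand-ins invented for this sketch, and a
single "dirty" flag is a drastic simplification of real cache behaviour. It
only shows that a store issued after the flush leaves dirty data behind,
while a store issued before the flush does not.

```c
#include <stdbool.h>

/* Toy model (assumption): the whole cache is one flag, clean or dirty. */
struct toy_cache {
	bool dirty;
};

/* Hypothetical stand-in for wbinvd(): write back and invalidate. */
static void toy_wbinvd(struct toy_cache *c)
{
	c->dirty = false;
}

/* Hypothetical stand-in for a store such as cpu_state = CPU_DEAD. */
static void toy_store(struct toy_cache *c)
{
	c->dirty = true;
}

/* Old ordering: flush first, then store. The store re-dirties the cache. */
static bool dirty_after_old_order(void)
{
	struct toy_cache c = { .dirty = true };

	toy_wbinvd(&c);
	toy_store(&c);
	return c.dirty;
}

/* New ordering: store first, then flush as the last step before halting. */
static bool dirty_after_new_order(void)
{
	struct toy_cache c = { .dirty = true };

	toy_store(&c);
	toy_wbinvd(&c);
	return c.dirty;
}
```

In this model dirty_after_old_order() returns true and
dirty_after_new_order() returns false, which is the difference the patch
below is after.
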
> please send a patch for the 32-bit side too, it has the same bug.
>
> also, we might be safer if the wbinvd(), the CLI and the halt was in a
> single assembly sequence:
> to make sure the compiler doesn't ever insert something into this
> codepath? [ And note the double cli which would be further
> robustification - in theory we could get a spurious interrupt straight
> after the wbinvd. ] Hm?
I don't think it's necessary. I can submit a delta patch later if you
think it's really necessary.
Signed-off-by: Mark Langsdorf <mark.langsdorf@....com>
diff -r 1e74a821dd00 arch/x86/kernel/process_32.c
--- a/arch/x86/kernel/process_32.c Tue Aug 12 12:04:12 2008 -0500
+++ b/arch/x86/kernel/process_32.c Wed Aug 13 06:40:00 2008 -0500
@@ -95,11 +95,11 @@ static inline void play_dead(void)
 {
 	/* This must be done before dead CPU ack */
 	cpu_exit_clear();
-	wbinvd();
 	mb();
 	/* Ack it */
 	__get_cpu_var(cpu_state) = CPU_DEAD;
+	wbinvd();
 	/*
 	 * With physical CPU hotplug, we should halt the cpu
 	 */