Message-ID: <200808131209.57534.mark.langsdorf@amd.com>
Date:	Wed, 13 Aug 2008 12:09:57 -0500
From:	Mark Langsdorf <mark.langsdorf@....com>
To:	Ingo Molnar <mingo@...e.hu>
CC:	linux-kernel@...r.kernel.org,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	"H. Peter Anvin" <hpa@...or.com>,
	Thomas Gleixner <tglx@...utronix.de>
Subject: Re: invalidate caches before going into suspend
On Wednesday 13 August 2008, Ingo Molnar wrote:
> 
> * Mark Langsdorf <mark.langsdorf@....com> wrote:
> 
> > When a CPU core is shut down, all of its caches need to be flushed to 
> > prevent stale data from causing errors if the core is resumed. Current 
> > Linux suspend code performs an assignment after the flush, which can 
> > add dirty data back to the cache.  On some AMD platforms, additional 
> > speculative reads have caused crashes on resume because of this dirty 
> > data.
> > 
> > Relocate the cache flush to be the very last thing done before 
> > halting.
> 
> nice catch! Applied to x86/urgent.
> 
> I'm really curious: how did you find this bug? Did you see a CPU come up 
> as !CPU_DEAD?

AMD's diagnostic code for new CPUs was hanging when coming out of suspend,
so I presume it was hitting a bug check for not !CPU_DEAD.  I got the
debug lab reports second hand.  They traced the root cause to dirty data
being preserved in the cache and suggested relocating the wbinvd().

> please send a patch for the 32-bit side too, it has the same bug.
> 
> also, we might be safer if the wbinvd(), the CLI and the halt was in a 
> single assembly sequence:
> to make sure the compiler doesn't ever insert something into this 
> codepath? [ And note the double cli which would be further 
> robustification - in theory we could get a spurious interrupt straight 
> after the wbinvd. ] Hm?

I don't think that's necessary, but I can submit a delta patch later if
you really want one.

Signed-off-by: Mark Langsdorf <mark.langsdorf@....com>
diff -r 1e74a821dd00 arch/x86/kernel/process_32.c
--- a/arch/x86/kernel/process_32.c	Tue Aug 12 12:04:12 2008 -0500
+++ b/arch/x86/kernel/process_32.c	Wed Aug 13 06:40:00 2008 -0500
@@ -95,11 +95,11 @@ static inline void play_dead(void)
 {
 	/* This must be done before dead CPU ack */
 	cpu_exit_clear();
-	wbinvd();
 	mb();
 	/* Ack it */
 	__get_cpu_var(cpu_state) = CPU_DEAD;
 
+	wbinvd();
 	/*
 	 * With physical CPU hotplug, we should halt the cpu
 	 */
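
For anyone who wants to see the ordering problem outside the kernel, here is a toy user-space model of the reordering in the hunk above. It is only a sketch: every name in it (toy_cpu, toy_store, toy_wbinvd, play_dead_flush_first, play_dead_flush_last) is hypothetical, the "cache" is a single line, and speculative reads are not modeled at all. It only shows that a store issued after the flush leaves a dirty line behind, while flushing last does not.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's cpu_state values. */
enum { TOY_CPU_ONLINE = 0, TOY_CPU_DEAD = 1 };

/* One cache line holding the cpu_state word (toy model only). */
struct toy_cache_line {
    int  value;
    bool valid;
    bool dirty;
};

struct toy_cpu {
    int memory_state;              /* what RAM holds for cpu_state */
    struct toy_cache_line line;    /* the core's cached copy */
};

/* A store hits the cache and marks the line dirty; RAM is not updated. */
void toy_store(struct toy_cpu *c, int v)
{
    c->line.value = v;
    c->line.valid = true;
    c->line.dirty = true;
}

/* Model of wbinvd: write back dirty data, then invalidate the line. */
void toy_wbinvd(struct toy_cpu *c)
{
    if (c->line.valid && c->line.dirty)
        c->memory_state = c->line.value;
    c->line.valid = false;
    c->line.dirty = false;
}

/* Old ordering: flush, then ack. Returns true if dirty data remains. */
bool play_dead_flush_first(struct toy_cpu *c)
{
    toy_wbinvd(c);
    toy_store(c, TOY_CPU_DEAD);    /* re-dirties the cache after the flush */
    return c->line.dirty;
}

/* Patched ordering: ack, then flush as the last step before halting. */
bool play_dead_flush_last(struct toy_cpu *c)
{
    toy_store(c, TOY_CPU_DEAD);
    toy_wbinvd(c);                 /* nothing is written after this point */
    return c->line.dirty;
}
```

Calling play_dead_flush_first() on a zero-initialized struct toy_cpu leaves the line dirty and memory_state still at TOY_CPU_ONLINE; play_dead_flush_last() leaves the line clean with TOY_CPU_DEAD written back to memory_state.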