Message-ID: <20110415140445.GA4883@alberich.amd.com>
Date: Fri, 15 Apr 2011 16:04:45 +0200
From: Andreas Herrmann <herrmann.der.user@...glemail.com>
To: Joerg Roedel <joro@...tes.org>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
"H. Peter Anvin" <hpa@...or.com>, Yinghai Lu <yinghai@...nel.org>,
Ingo Molnar <mingo@...e.hu>,
Alex Deucher <alexdeucher@...il.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
"dri-devel@...ts.freedesktop.org" <dri-devel@...ts.freedesktop.org>,
Thomas Gleixner <tglx@...utronix.de>,
Tejun Heo <tj@...nel.org>, alexandre.f.demers@...il.com
Subject: Re: Linux 2.6.39-rc3

On Fri, Apr 15, 2011 at 03:11:52PM +0200, Joerg Roedel wrote:
> On Wed, Apr 13, 2011 at 07:33:40PM -0700, Linus Torvalds wrote:
> > we definitely want to also understand the reason for things not
> > working, even if we do revert..
>
> Okay, here it is.
>
> After experimenting with different configurations for the north-bridge
> it turned out that a GART related MCE fires at the time the machine
> reboots. BIOSes configure the machine to sync-flood in that case which
> causes a reboot.
>
> After decoding the MCE it turned out to be a GART TBL Wlk Error. Such
> errors can happen if devices (speculatively) access GART ranges that are
> mapped invalid. The AMD BKDG for Fam10h CPUs recommends disabling these
> errors entirely. But unfortunately some BIOSes (including the one on my
> laptop) forget to do this.
>
> Below is a patch which disables these errors if the BIOS didn't do it.
> It fixes the problem on my side.
>
> Alexandre, can you try this patch on your machine too, please?
>
> Regards,
>
> Joerg
>
> From aaacff8db50b6ed4345e337ecbe53e505699c7e5 Mon Sep 17 00:00:00 2001
> From: Joerg Roedel <joerg.roedel@....com>
> Date: Fri, 15 Apr 2011 14:47:40 +0200
> Subject: [PATCH] x86/amd: Disable GartTlbWlkErr when BIOS forgets it
>
> This patch disables GartTlbWlk errors on AMD Fam10h CPUs if
> the BIOS forgets to do it (or is just too old). Leaving
> these errors enabled can cause a sync-flood on the CPU,
> which results in a reboot.
>
> This patch is the fix for
>
> https://bugzilla.kernel.org/show_bug.cgi?id=33012
>
> on my machine.
>
> Signed-off-by: Joerg Roedel <joerg.roedel@....com>

Joerg,

What about tagging this patch for stable/longterm releases?

Potentially there are other cases where certain combinations of
hardware (GPUs), drivers, or whatever else might trigger a
GartTlbWlkErr. If the BIOS doesn't follow the BKDG recommendation to
mask these errors, the system will hang or reboot. Thus I think having
this quirk in .32 and .38 (at least) is useful.

Andreas
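
The diff itself is not quoted in this message, so here is a rough
userspace sketch of the check-and-mask logic the quirk describes. It is
not the patch. The MSR address and the bit position are assumptions
recalled from the kernel's MSR definitions and should be verified
against the BKDG. The helper names bios_forgot_mask() and
mask_gart_tbl_wlk_err() are hypothetical, and the code only works on a
plain 64-bit value instead of touching a real MSR.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * ASSUMPTIONS (not stated in this thread, verify against the BKDG):
 *  - the MC4 (northbridge) mask register is MC0_MASK (0xc0010044) + 4
 *  - bit 10 of that register masks the GART table walk error
 */
#define ASSUMED_MC4_MASK_MSR      0xc0010048u
#define ASSUMED_GART_TBL_WLK_BIT  (UINT64_C(1) << 10)

/* True if the BIOS left GART table walk errors unmasked. */
bool bios_forgot_mask(uint64_t mc4_mask)
{
	return (mc4_mask & ASSUMED_GART_TBL_WLK_BIT) == 0;
}

/*
 * Return the value to write back: the original mask with the
 * GART table walk error bit set. All other bits are preserved,
 * so a mask that is already correct is returned unchanged.
 */
uint64_t mask_gart_tbl_wlk_err(uint64_t mc4_mask)
{
	return mc4_mask | ASSUMED_GART_TBL_WLK_BIT;
}
```

In a kernel quirk the input would come from a fault-tolerant MSR read
on Fam10h CPUs only, and the result would be written back to the same
register. A read-modify-write keeps whatever other mask bits the BIOS
has already set.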