Message-ID: <20110415140445.GA4883@alberich.amd.com>
Date:	Fri, 15 Apr 2011 16:04:45 +0200
From:	Andreas Herrmann <herrmann.der.user@...glemail.com>
To:	Joerg Roedel <joro@...tes.org>
Cc:	Linus Torvalds <torvalds@...ux-foundation.org>,
	"H. Peter Anvin" <hpa@...or.com>, Yinghai Lu <yinghai@...nel.org>,
	Ingo Molnar <mingo@...e.hu>,
	Alex Deucher <alexdeucher@...il.com>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	"dri-devel@...ts.freedesktop.org" <dri-devel@...ts.freedesktop.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Tejun Heo <tj@...nel.org>, alexandre.f.demers@...il.com
Subject: Re: Linux 2.6.39-rc3

On Fri, Apr 15, 2011 at 03:11:52PM +0200, Joerg Roedel wrote:
> On Wed, Apr 13, 2011 at 07:33:40PM -0700, Linus Torvalds wrote:
> >  we definitely want to also understand the reason for things not
> > working, even if we do revert..
> 
> Okay, here it is.
> 
> After experimenting with different configurations for the north-bridge
> it turned out that a GART related MCE fires at the time the machine
> reboots. BIOSes configure the machine to sync-flood in that case which
> causes a reboot.
> 
> After decoding the MCE it turned out to be a GART TBL Wlk Error. Such
> errors can happen if devices (speculatively) access GART ranges mapped
> invalid. The AMD BKDG for Fam10h CPUs recommends disabling these errors
> altogether. But unfortunately some BIOSes (including the one on my laptop)
> forget to do this.
> 
> Below is a patch which disables these errors if the BIOS didn't do it.
> It fixes the problem on my side.
> 
> Alexandre, can you try this patch on your machine too, please?
> 
> Regards,
> 
> 	Joerg
> 
> From aaacff8db50b6ed4345e337ecbe53e505699c7e5 Mon Sep 17 00:00:00 2001
> From: Joerg Roedel <joerg.roedel@....com>
> Date: Fri, 15 Apr 2011 14:47:40 +0200
> Subject: [PATCH] x86/amd: Disable GartTlbWlkErr when BIOS forgets it
> 
> This patch disables GartTlbWlk errors on AMD Fam10h CPUs if
> the BIOS forgets to do it (or is just too old). Leaving
> these errors enabled can cause a sync-flood on the CPU,
> causing a reboot.
> 
> This patch is the fix for
> 
> 	https://bugzilla.kernel.org/show_bug.cgi?id=33012
> 
> on my machine.
> 
> Signed-off-by: Joerg Roedel <joerg.roedel@....com>


Joerg,

What about tagging this patch for stable/longterm releases?

Potentially there are other cases where certain combinations of
hardware (GPUs), drivers, or whatever else might trigger a GartTlbWlkErr.
If the BIOS doesn't follow the BKDG recommendation to mask these errors,
the system will hang/reboot. Thus I think having this quirk in .32 and
.38 (at least) is useful.
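
For readers without the patch body at hand, the quirk amounts to a
read-modify-write that sets one mask bit if the BIOS left it clear. The
sketch below is only a userspace model of that bit logic, not the kernel
code. The bit position (bit 10 of the northbridge MCE mask register) and
the helper names are assumptions made for illustration; check the Fam10h
BKDG and the actual patch for the real values.

```python
# Userspace model of the "mask GartTlbWlkErr if the BIOS forgot" quirk.
# Assumption: the GART table walk error is controlled by bit 10 of the
# northbridge (MC4) mask register. Helper names are hypothetical.
GART_TBL_WALK_ERR_BIT = 10


def bios_forgot_mask(mc4_mask: int) -> bool:
    """True if the GART table walk error bit is still unmasked."""
    return ((mc4_mask >> GART_TBL_WALK_ERR_BIT) & 1) == 0


def mask_gart_tbl_walk_err(mc4_mask: int) -> int:
    """Return the mask value with the GART table walk error bit set.

    All other bits are preserved, so applying this to a value the BIOS
    already configured correctly is a no-op.
    """
    return mc4_mask | (1 << GART_TBL_WALK_ERR_BIT)
```

In the kernel the input would come from reading the MSR and the result
would be written back; here both are plain integers so the logic can be
checked in isolation.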


Andreas