Message-ID: <YYGXu9KxCq2+wlQL@agluck-desk2.amr.corp.intel.com>
Date: Tue, 2 Nov 2021 12:55:39 -0700
From: "Luck, Tony" <tony.luck@...el.com>
To: Dave Jones <davej@...emonkey.org.uk>,
Linux Kernel <linux-kernel@...r.kernel.org>
Cc: Borislav Petkov <bp@...en8.de>
Subject: Re: mce: Add errata workaround for Skylake SKX37

On Fri, Oct 29, 2021 at 04:57:59PM -0400, Dave Jones wrote:
> Errata SKX37 is word-for-word identical to the other errata listed in
> this workaround. I happened to notice this after investigating a CMCI
> storm on a Skylake host. While I can't confirm this was the root cause,
> spurious corrected errors do sound like a likely suspect.
>
> Signed-off-by: Dave Jones <davej@...emonkey.org.uk>

Needs:

Fixes: 2976908e4198 ("x86/mce: Do not log spurious corrected mce errors")
Cc: <stable@...r.kernel.org>

otherwise:

Reviewed-by: Tony Luck <tony.luck@...el.com>

>
> diff --git arch/x86/kernel/cpu/mce/intel.c arch/x86/kernel/cpu/mce/intel.c
> index acfd5d9f93c6..bb9a46a804bf 100644
> --- arch/x86/kernel/cpu/mce/intel.c
> +++ arch/x86/kernel/cpu/mce/intel.c
> @@ -547,12 +547,13 @@ bool intel_filter_mce(struct mce *m)
>  {
>  	struct cpuinfo_x86 *c = &boot_cpu_data;
>
> -	/* MCE errata HSD131, HSM142, HSW131, BDM48, and HSM142 */
> +	/* MCE errata HSD131, HSM142, HSW131, BDM48, HSM142 and SKX37 */
>  	if ((c->x86 == 6) &&
>  	    ((c->x86_model == INTEL_FAM6_HASWELL) ||
>  	     (c->x86_model == INTEL_FAM6_HASWELL_L) ||
>  	     (c->x86_model == INTEL_FAM6_BROADWELL) ||
> -	     (c->x86_model == INTEL_FAM6_HASWELL_G)) &&
> +	     (c->x86_model == INTEL_FAM6_HASWELL_G) ||
> +	     (c->x86_model == INTEL_FAM6_SKYLAKE_X)) &&
>  	    (m->bank == 0) &&
>  	    ((m->status & 0xa0000000ffffffff) == 0x80000000000f0005))
>  		return true;
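
For anyone reading the status test in the hunk above, here is a minimal user-space sketch of just the bank/status part of the check. It assumes the architectural MCi_STATUS layout (VAL in bit 63, UC in bit 61, model-specific error code in bits 31:16, MCA error code in bits 15:0). The names struct mce_sketch, SPURIOUS_MASK, SPURIOUS_VALUE and status_matches_erratum are made up for illustration and are not kernel identifiers, and the family/model comparison from the patch is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's struct mce: only the two
 * fields that the filter in the patch actually reads. */
struct mce_sketch {
	uint64_t status;	/* raw MCi_STATUS value */
	unsigned int bank;	/* MCA bank that reported the event */
};

/* Mask keeps VAL (bit 63), UC (bit 61) and the low 32 bits
 * (model-specific code in 31:16, MCA error code in 15:0).
 * Every other status bit is ignored by the comparison. */
#define SPURIOUS_MASK	0xa0000000ffffffffULL

/* Expected pattern: VAL=1, UC=0, model-specific code 0x000f,
 * MCA error code 0x0005. */
#define SPURIOUS_VALUE	0x80000000000f0005ULL

/* Returns true when the record carries the bank-0 signature that the
 * patch filters. The CPU family/model test is NOT reproduced here. */
static bool status_matches_erratum(const struct mce_sketch *m)
{
	return m->bank == 0 &&
	       (m->status & SPURIOUS_MASK) == SPURIOUS_VALUE;
}
```

Because the mask drops bits such as EN (bit 60), a status of 0x90000000000f0005 in bank 0 still matches, while setting UC (bit 61) or changing either error code does not.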