Message-ID: <YhegvWKq913TEd0M@zn.tnic>
Date: Thu, 24 Feb 2022 16:14:05 +0100
From: Borislav Petkov <bp@...en8.de>
To: "Luck, Tony" <tony.luck@...el.com>
Cc: Smita Koralahalli <Smita.KoralahalliChannabasappa@....com>,
x86@...nel.org, linux-edac@...r.kernel.org,
linux-kernel@...r.kernel.org, "H . Peter Anvin" <hpa@...or.com>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Yazen Ghannam <yazen.ghannam@....com>
Subject: Re: [PATCH 1/2] x86/mce: Remove old CMCI storm mitigation code

On Thu, Feb 17, 2022 at 09:35:52AM -0800, Luck, Tony wrote:
> When a "storm" of CMCI is detected this code mitigates by
> disabling CMCI interrupt signalling from all of the banks
> owned by the CPU that saw the storm.
>
> There are problems with this approach:
>
> 1) It is very coarse grained. In all liklihood only one of the

Unknown word [liklihood] in commit message.
Suggestions: ['likelihood', 'livelihood']

> banks was geenrating the interrupts, but CMCI is disabled for all.

Unknown word [geenrating] in commit message.
Suggestions: ['generating', 'penetrating', 'germinating', 'entreating', 'ingratiating']

Do I need to give you the whole spiel about using a spellchecker?

:)

> This means Linux may delay seeing and processing errors logged
> from other banks.
>
> 2) Although CMCI stands for Corrected Machine Check Interrupt, it
> is also used to signal when an uncorrected error is logged. This
> is a problem because these errors should be handled in a timely
> manner.
>
> Delete all this code in preparation for a finer grained solution.
>
> Signed-off-by: Tony Luck <tony.luck@...el.com>
> ---
>  arch/x86/kernel/cpu/mce/core.c     |  20 +---
>  arch/x86/kernel/cpu/mce/intel.c    | 145 -----------------------------
>  arch/x86/kernel/cpu/mce/internal.h |   6 --
>  3 files changed, 1 insertion(+), 170 deletions(-)
Yah, can't complain about diffstats like that.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette