Message-ID: <20251121190542.2447913-2-avadhut.naik@amd.com>
Date: Fri, 21 Nov 2025 19:04:04 +0000
From: Avadhut Naik <avadhut.naik@....com>
To: <x86@...nel.org>
CC: <bp@...en8.de>, <gregkh@...uxfoundation.org>, <yazen.ghannam@....com>,
<tony.luck@...el.com>, <qiuxu.zhuo@...el.com>,
<Smita.KoralahalliChannabasappa@....com>, <linux-kernel@...r.kernel.org>,
<avadhut.naik@....com>
Subject: [PATCH 1/2] x86/mce: Do not clear bank's poll bit in mce_poll_banks on AMD SMCA systems

Currently, when a CMCI storm, detected on a Machine Check bank, subsides,
the bank's corresponding bit in the mce_poll_banks per-CPU variable is
cleared unconditionally through cmci_storm_end().

On AMD SMCA systems, this essentially disables polling on that particular
bank on that CPU. Consequently, any subsequent correctable errors or
storms will not be logged.

Since AMD SMCA systems allow banks to be managed by both polling and
interrupts, the polling banks bitmap for a CPU, i.e., mce_poll_banks,
should not be modified when a storm subsides.

Fixes: 7eae17c4add5 ("x86/mce: Add per-bank CMCI storm mitigation")
Cc: stable@...r.kernel.org
Signed-off-by: Avadhut Naik <avadhut.naik@....com>
---
arch/x86/kernel/cpu/mce/threshold.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
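
Illustration only, not part of the patch: a minimal userspace sketch of the
behavior described in the changelog, under stated assumptions. The model_*
names, the 64-bank limit, and the single-word bitmap are hypothetical
stand-ins for mce_poll_banks, mce_flags.amd_threshold and the storm
descriptor; the sketch also assumes that a storm start sets the bank's poll
bit. It only shows that, with the AMD flag set, a bank that was polled before
a storm is still polled after the storm ends.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: simplified stand-ins, not the kernel's definitions. */
#define MODEL_NR_BANKS 64

struct model_cpu {
	unsigned long long poll_banks;	/* stand-in for per-CPU mce_poll_banks */
	bool in_storm[MODEL_NR_BANKS];	/* stand-in for banks[bank].in_storm_mode */
};

/* Stand-in for mce_flags.amd_threshold. */
static bool model_amd_threshold;

/* Report whether the given bank is currently in the polled set. */
static bool model_bank_polled(const struct model_cpu *cpu, unsigned int bank)
{
	assert(bank < MODEL_NR_BANKS);
	return (cpu->poll_banks >> bank) & 1ULL;
}

/* Storm start (assumed behavior): mark the bank as polled and in storm mode. */
static void model_storm_begin(struct model_cpu *cpu, unsigned int bank)
{
	assert(bank < MODEL_NR_BANKS);
	cpu->poll_banks |= 1ULL << bank;
	cpu->in_storm[bank] = true;
}

/* Storm end: only clear the poll bit when the AMD flag is not set. */
static void model_storm_end(struct model_cpu *cpu, unsigned int bank)
{
	assert(bank < MODEL_NR_BANKS);
	if (!model_amd_threshold)
		cpu->poll_banks &= ~(1ULL << bank);
	cpu->in_storm[bank] = false;
}
```

With model_amd_threshold set, a begin/end cycle leaves the bank's poll bit
untouched; with it clear, the bit is dropped at storm end, which matches the
pre-patch behavior on all systems.
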
diff --git a/arch/x86/kernel/cpu/mce/threshold.c b/arch/x86/kernel/cpu/mce/threshold.c
index eebaa633df80..f19dd5bc2969 100644
--- a/arch/x86/kernel/cpu/mce/threshold.c
+++ b/arch/x86/kernel/cpu/mce/threshold.c
@@ -98,7 +98,8 @@ void cmci_storm_end(unsigned int bank)
 {
 	struct mca_storm_desc *storm = this_cpu_ptr(&storm_desc);
 
-	__clear_bit(bank, this_cpu_ptr(mce_poll_banks));
+	if (!mce_flags.amd_threshold)
+		__clear_bit(bank, this_cpu_ptr(mce_poll_banks));
 	storm->banks[bank].history = 0;
 	storm->banks[bank].in_storm_mode = false;
 
--
2.43.0