Message-ID: <20251006151731.1885098-1-yazen.ghannam@amd.com>
Date: Mon, 6 Oct 2025 15:17:31 +0000
From: Yazen Ghannam <yazen.ghannam@....com>
To: <bp@...en8.de>, <tony.luck@...el.com>, <linux-edac@...r.kernel.org>
CC: <linux-kernel@...r.kernel.org>, <avadhut.naik@....com>,
<john.allen@....com>, Yazen Ghannam <yazen.ghannam@....com>
Subject: [PATCH] RAS/AMD/FMPM: Add option to ignore CEs
Generally, FMPM handles every memory error it receives, on the
expectation that "upstream" entities, like hardware thresholding or
other Linux notifier blocks, will filter out errors first.

However, some users prefer that correctable errors are not filtered out
upstream and only want FMPM to take no action on them.

Add a module parameter, ignore_ce, to ignore correctable errors. When
set, FMPM will neither retire memory nor save FRU records for
correctable errors.

Signed-off-by: Yazen Ghannam <yazen.ghannam@....com>
---
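A minimal userspace sketch of the gate this patch adds, for illustration
only. It assumes the error has already been classified as a memory error
and that "correctable" means neither the UC nor the DEFERRED status bit
is set. The names struct model_mce, model_is_correctable() and
model_fmpm_acts() are hypothetical and do not exist in the kernel; the
real check is mce_is_correctable(), which may apply further vendor
conditions.

```c
#include <stdbool.h>
#include <stdint.h>

/* MCA_STATUS bit positions as defined by the x86 architecture. */
#define MCI_STATUS_UC        (1ULL << 61)  /* uncorrected error */
#define MCI_STATUS_DEFERRED  (1ULL << 44)  /* deferred error */

/* Hypothetical stand-in for struct mce; only the field used here. */
struct model_mce {
	uint64_t status;
};

/* Simplified model: an error is correctable if it is neither
 * deferred nor uncorrected. */
static bool model_is_correctable(const struct model_mce *m)
{
	if (m->status & MCI_STATUS_DEFERRED)
		return false;
	if (m->status & MCI_STATUS_UC)
		return false;
	return true;
}

/* Returns true when FMPM would go on to retire memory and save a
 * FRU record, false when the error is skipped (NOTIFY_DONE). */
static bool model_fmpm_acts(bool ignore_ce, const struct model_mce *m)
{
	if (ignore_ce && model_is_correctable(m))
		return false;
	return true;
}
```

With ignore_ce clear, the model acts on every memory error, as before
the patch. With ignore_ce set, only uncorrected and deferred errors
lead to action.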
drivers/ras/amd/fmpm.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/drivers/ras/amd/fmpm.c b/drivers/ras/amd/fmpm.c
index 8877c6ff64c4..08b16a133f20 100644
--- a/drivers/ras/amd/fmpm.c
+++ b/drivers/ras/amd/fmpm.c
@@ -129,6 +129,14 @@ static struct dentry *fmpm_dfs_entries;
 	GUID_INIT(0x5e4706c1, 0x5356, 0x48c6, 0x93, 0x0b, 0x52, 0xf2, \
 		  0x12, 0x0a, 0x44, 0x58)

+/**
+ * DOC: ignore_ce (bool)
+ * Switch to handle or ignore correctable errors.
+ */
+static bool ignore_ce;
+module_param(ignore_ce, bool, 0644);
+MODULE_PARM_DESC(ignore_ce, "Ignore correctable errors");
+
 /**
  * DOC: max_nr_entries (byte)
  * Maximum number of descriptor entries possible for each FRU.
@@ -413,6 +421,9 @@ static int fru_handle_mem_poison(struct notifier_block *nb, unsigned long val, v
 	if (!mce_is_memory_error(m))
 		return NOTIFY_DONE;

+	if (ignore_ce && mce_is_correctable(m))
+		return NOTIFY_DONE;
+
 	retire_dram_row(m->addr, m->ipid, m->extcpu);

 	/*
base-commit: fd94619c43360eb44d28bd3ef326a4f85c600a07
--
2.51.0