Message-ID: <20251006151731.1885098-1-yazen.ghannam@amd.com>
Date: Mon, 6 Oct 2025 15:17:31 +0000
From: Yazen Ghannam <yazen.ghannam@....com>
To: <bp@...en8.de>, <tony.luck@...el.com>, <linux-edac@...r.kernel.org>
CC: <linux-kernel@...r.kernel.org>, <avadhut.naik@....com>,
	<john.allen@....com>, Yazen Ghannam <yazen.ghannam@....com>
Subject: [PATCH] RAS/AMD/FMPM: Add option to ignore CEs

Generally, FMPM acts on every memory error it receives, on the
expectation that "upstream" entities, such as hardware thresholding or
other Linux notifier blocks, have already filtered errors out.

However, some users prefer that correctable errors not be filtered out
upstream, and only that FMPM take no action on them.

Add a module parameter to ignore correctable errors.

When set, FMPM neither retires memory nor saves FRU records for
correctable errors.

Signed-off-by: Yazen Ghannam <yazen.ghannam@....com>
---
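A note for reviewers, not part of the commit: the filtering order this
patch introduces can be modelled in user space roughly as below. This is
only a sketch under stated assumptions. The names sketch_err,
sketch_action and sketch_filter are hypothetical stand-ins and do not
exist in the kernel; the two flags merely model the results of
mce_is_memory_error() and mce_is_correctable(), and SKETCH_SKIP models
returning NOTIFY_DONE before any retirement or record update.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcome type: SKETCH_SKIP models an early NOTIFY_DONE. */
enum sketch_action { SKETCH_SKIP, SKETCH_HANDLE };

/* Hypothetical, reduced error record; the real handler inspects struct mce. */
struct sketch_err {
	bool is_memory_error;	/* models mce_is_memory_error() */
	bool is_correctable;	/* models mce_is_correctable() */
};

/* Decide whether the handler would go on to retire memory and save records. */
enum sketch_action sketch_filter(const struct sketch_err *e, bool ignore_ce)
{
	assert(e != NULL);

	/* Non-memory errors are never handled, regardless of the switch. */
	if (!e->is_memory_error)
		return SKETCH_SKIP;

	/* New check: correctable errors are skipped only when ignore_ce is set. */
	if (ignore_ce && e->is_correctable)
		return SKETCH_SKIP;

	return SKETCH_HANDLE;
}
```

With ignore_ce left at its default of false, the behaviour is unchanged:
every memory error, correctable or not, is still handled. Since the
parameter is registered with mode 0644, it should also be changeable at
runtime by root through the module's parameters directory in sysfs.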
 drivers/ras/amd/fmpm.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/drivers/ras/amd/fmpm.c b/drivers/ras/amd/fmpm.c
index 8877c6ff64c4..08b16a133f20 100644
--- a/drivers/ras/amd/fmpm.c
+++ b/drivers/ras/amd/fmpm.c
@@ -129,6 +129,14 @@ static struct dentry *fmpm_dfs_entries;
 	GUID_INIT(0x5e4706c1, 0x5356, 0x48c6, 0x93, 0x0b, 0x52, 0xf2,	\
 		  0x12, 0x0a, 0x44, 0x58)
 
+/**
+ * DOC: ignore_ce (bool)
+ * Switch to handle or ignore correctable errors.
+ */
+static bool ignore_ce;
+module_param(ignore_ce, bool, 0644);
+MODULE_PARM_DESC(ignore_ce, "Ignore correctable errors");
+
 /**
  * DOC: max_nr_entries (byte)
  * Maximum number of descriptor entries possible for each FRU.
@@ -413,6 +421,9 @@ static int fru_handle_mem_poison(struct notifier_block *nb, unsigned long val, v
 	if (!mce_is_memory_error(m))
 		return NOTIFY_DONE;
 
+	if (ignore_ce && mce_is_correctable(m))
+		return NOTIFY_DONE;
+
 	retire_dram_row(m->addr, m->ipid, m->extcpu);
 
 	/*

base-commit: fd94619c43360eb44d28bd3ef326a4f85c600a07
-- 
2.51.0

