Message-ID: <e3209aea-19cd-4577-b557-842176cbb7b4@amd.com>
Date: Mon, 6 Oct 2025 11:35:29 -0500
From: "Naik, Avadhut" <avadnaik@....com>
To: Yazen Ghannam <yazen.ghannam@....com>, bp@...en8.de, tony.luck@...el.com,
 linux-edac@...r.kernel.org
Cc: linux-kernel@...r.kernel.org, avadhut.naik@....com, john.allen@....com
Subject: Re: [PATCH] RAS/AMD/FMPM: Add option to ignore CEs



On 10/6/2025 10:17, Yazen Ghannam wrote:
> Generally, FMPM will handle all memory errors as it is expected that
> "upstream" entities, like hardware thresholding or other Linux notifier
> blocks, will filter out errors.
> 
> However, some users prefer that correctable errors not be filtered out
> upstream, and only that FMPM take no action on them.
> 
> Add a module parameter to ignore correctable errors.
> 
> When set, FMPM will not retire memory nor will it save FRU records for
> correctable errors.
> 
> Signed-off-by: Yazen Ghannam <yazen.ghannam@....com>
> ---
>  drivers/ras/amd/fmpm.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/drivers/ras/amd/fmpm.c b/drivers/ras/amd/fmpm.c
> index 8877c6ff64c4..08b16a133f20 100644
> --- a/drivers/ras/amd/fmpm.c
> +++ b/drivers/ras/amd/fmpm.c
> @@ -129,6 +129,14 @@ static struct dentry *fmpm_dfs_entries;
>  	GUID_INIT(0x5e4706c1, 0x5356, 0x48c6, 0x93, 0x0b, 0x52, 0xf2,	\
>  		  0x12, 0x0a, 0x44, 0x58)
>  
> +/**
> + * DOC: ignore_ce (bool)
> + * Switch to handle or ignore correctable errors.
> + */
> +static bool ignore_ce;
> +module_param(ignore_ce, bool, 0644);
> +MODULE_PARM_DESC(ignore_ce, "Ignore correctable errors");
> +
>  /**
>   * DOC: max_nr_entries (byte)
>   * Maximum number of descriptor entries possible for each FRU.
> @@ -413,6 +421,9 @@ static int fru_handle_mem_poison(struct notifier_block *nb, unsigned long val, v
>  	if (!mce_is_memory_error(m))
>  		return NOTIFY_DONE;
>  
> +	if (ignore_ce && mce_is_correctable(m))
> +		return NOTIFY_DONE;
> +
>  	retire_dram_row(m->addr, m->ipid, m->extcpu);
>  
>  	/*
> 
> base-commit: fd94619c43360eb44d28bd3ef326a4f85c600a07
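
For anyone following the thread, the decision that fru_handle_mem_poison()
makes after this patch can be modeled in user space roughly as below. This is
only a sketch under stated assumptions: struct mce_sketch, enum fmpm_action,
and fmpm_decide() are hypothetical names that do not exist in the kernel, the
two bool fields stand in for the results of mce_is_memory_error() and
mce_is_correctable(), and ignore_ce is passed as an argument instead of being
read from the module parameter.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the kernel's struct mce. It carries only
 * what this sketch needs; the real structure has many more fields.
 */
struct mce_sketch {
	bool memory_error;	/* models the result of mce_is_memory_error() */
	bool correctable;	/* models the result of mce_is_correctable() */
	uint64_t addr;		/* address reported with the error */
};

/* Hypothetical outcome of the notifier: skip the error or act on it. */
enum fmpm_action {
	FMPM_SKIP,	/* corresponds to returning NOTIFY_DONE early */
	FMPM_RETIRE,	/* corresponds to falling through to retire_dram_row() */
};

/*
 * Mirrors the order of the checks in the patched handler:
 * non-memory errors are skipped first, then correctable errors
 * are skipped only when ignore_ce is set.
 */
static enum fmpm_action fmpm_decide(const struct mce_sketch *m, bool ignore_ce)
{
	if (!m->memory_error)
		return FMPM_SKIP;

	if (ignore_ce && m->correctable)
		return FMPM_SKIP;

	return FMPM_RETIRE;
}
```

With ignore_ce left at its default of false, every memory error still reaches
the retire path, which matches the behavior before the patch.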

LGTM!

Reviewed-by: Avadhut Naik <avadhut.naik@....com>

-- 
Thanks,
Avadhut Naik

