Message-ID: <20240301155034.GAZeH5StN80cO15qq_@fat_crate.local>
Date: Fri, 1 Mar 2024 16:50:34 +0100
From: Borislav Petkov <bp@...en8.de>
To: Yazen Ghannam <yazen.ghannam@....com>
Cc: tony.luck@...el.com, linux-edac@...r.kernel.org,
linux-kernel@...r.kernel.org, avadhut.naik@....com,
john.allen@....com, muralidhara.mk@....com, sathyapriya.k@....com,
naveenkrishna.chatradhi@....com
Subject: Re: [PATCH v2 2/3] RAS/AMD/FMPM: Save SPA values
On Fri, Mar 01, 2024 at 08:37:47AM -0600, Yazen Ghannam wrote:
> The system physical address (SPA) of an error is not a stable value. It
> will change depending on the location of the memory: parts can be
> swapped. And it will change depending on memory topology: NUMA nodes
> and/or interleaving can be adjusted.
>
> Therefore, the SPA value is not part of the "FRU Memory Poison" record
> format. And it will not be saved to persistent storage.
>
> However, the SPA values can be helpful during debug and for system
> admins during run time.
>
> Save the SPA values in a separate structure. This is updated when
> records are restored and when new errors are saved.
>
> Signed-off-by: Yazen Ghannam <yazen.ghannam@....com>
> ---
> Link:
> https://lore.kernel.org/r/20240226152941.2615007-3-yazen.ghannam@amd.com
>
> v1->v2:
> * Changed variable names to remove "sys_" prefix. (Boris)
> * Used "spa_" prefix to highlight that these are for SPA values. (Yazen)
> * Added warning to "index out-of-bound" condition. (Boris)
> * Reworked save_spa() flow to get a valid array position before saving
> SPA value (Yazen).
>
> drivers/ras/amd/fmpm.c | 68 ++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 68 insertions(+)
Fixups on top:
---
diff --git a/drivers/ras/amd/fmpm.c b/drivers/ras/amd/fmpm.c
index a7bb36eb60cb..8c3188488673 100644
--- a/drivers/ras/amd/fmpm.c
+++ b/drivers/ras/amd/fmpm.c
@@ -125,7 +125,7 @@ static u64 *spa_entries;
 		  0x12, 0x0a, 0x44, 0x58)
 
 /**
- * DOC: fru_poison_entries (byte)
+ * DOC: max_nr_entries (byte)
  * Maximum number of descriptor entries possible for each FRU.
  *
  * Values between '1' and '255' are valid.
@@ -285,10 +285,12 @@ static void save_spa(struct fru_rec *rec, unsigned int entry,
 	unsigned long spa;
 
 	if (entry >= max_nr_entries) {
-		pr_warn_once("entry out-of-bounds\n");
+		pr_warn_once("FRU descriptor entry %d out-of-bounds (max: %d)\n",
+			     entry, max_nr_entries);
 		return;
 	}
 
+	/* spa_nr_entries is always multiple of max_nr_entries */
 	for (i = 0; i < spa_nr_entries; i += max_nr_entries) {
 		fru_idx = i / max_nr_entries;
 		if (fru_records[fru_idx] == rec)
@@ -296,7 +298,7 @@ static void save_spa(struct fru_rec *rec, unsigned int entry,
 	}
 
 	if (i >= spa_nr_entries) {
-		pr_warn_once("record not found");
+		pr_warn_once("FRU record %d not found\n", i);
 		return;
 	}

--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette