Message-ID: <20240214033516.1344948-1-yazen.ghannam@amd.com>
Date: Tue, 13 Feb 2024 21:35:14 -0600
From: Yazen Ghannam <yazen.ghannam@....com>
To: <bp@...en8.de>, <tony.luck@...el.com>, <linux-edac@...r.kernel.org>
CC: <linux-kernel@...r.kernel.org>, <avadhut.naik@....com>,
<john.allen@....com>, <muralidhara.mk@....com>,
<naveenkrishna.chatradhi@....com>, <sathyapriya.k@....com>, Yazen Ghannam
<yazen.ghannam@....com>
Subject: [PATCH 0/2] FRU Memory Poison Manager

Hi all,

This set adds a new module to manage error records on persistent
storage.

Patch 1 moves a function from AMD64 EDAC to the AMD Address Translation
Library. This is needed for patch 2.

Patch 2 adds the new module. This is a near total rewrite based on patch
2 from the following set:
https://lore.kernel.org/r/20231129075034.2159223-1-muralimk@amd.com

I included questions in code comments where I think more attention is
needed.

I'd like to add Murali and Naveen as Co-developers, since this is based
on their work. Also, I kept Naveen as a maintainer in case he's still
interested.

Regarding the old set:
* Patch 1 exports a new function from the ERST driver. This is not
  necessary.
* Patch 3 adds a new sysfs interface. This needs more work.
* Patch 4 adds documentation. This needs updating.

I did some basic testing on a 2P server system without ERST support.
Mostly, I checked the memory layout of the structures. I also did some
memory error injections to check the record updating flow.

I did some fixups after testing, so I apologize if I missed anything.

Thanks,
Yazen

Yazen Ghannam (2):
  RAS/AMD/ATL, EDAC/amd64: Move MI300 Row Retirement to ATL
  RAS: Introduce the FRU Memory Poison Manager

 MAINTAINERS                 |   7 +
 drivers/edac/Kconfig        |   1 -
 drivers/edac/amd64_edac.c   |  48 ---
 drivers/ras/Kconfig         |  13 +
 drivers/ras/Makefile        |   1 +
 drivers/ras/amd/atl/Kconfig |   1 +
 drivers/ras/amd/atl/umc.c   |  51 +++
 drivers/ras/amd/fmpm.c      | 776 ++++++++++++++++++++++++++++++++++++
 include/linux/ras.h         |   2 +
 9 files changed, 851 insertions(+), 49 deletions(-)
 create mode 100644 drivers/ras/amd/fmpm.c

base-commit: c2064388aa8765abd7c2c5785e7bfe266a2f6cd3
--
2.34.1