Message-ID: <20240214033516.1344948-1-yazen.ghannam@amd.com>
Date: Tue, 13 Feb 2024 21:35:14 -0600
From: Yazen Ghannam <yazen.ghannam@....com>
To: <bp@...en8.de>, <tony.luck@...el.com>, <linux-edac@...r.kernel.org>
CC: <linux-kernel@...r.kernel.org>, <avadhut.naik@....com>,
	<john.allen@....com>, <muralidhara.mk@....com>,
	<naveenkrishna.chatradhi@....com>, <sathyapriya.k@....com>, Yazen Ghannam
	<yazen.ghannam@....com>
Subject: [PATCH 0/2] FRU Memory Poison Manager

Hi all,

This set adds a new module to manage error records on persistent
storage.

Patch 1 moves a function from AMD64 EDAC to the AMD Address Translation
Library. This is needed for patch 2.

Patch 2 adds the new module. This is a near total rewrite based on patch
2 from the following set:
https://lore.kernel.org/r/20231129075034.2159223-1-muralimk@amd.com

I included questions in code comments where I think more attention is
needed.

I'd like to add Murali and Naveen as Co-developers, since this is based
on their work. Also, I kept Naveen as a maintainer in case he's still
interested.

Regarding the old set:
 * Patch 1 exports a new function from the ERST driver. This is not
   necessary.

 * Patch 3 adds a new sysfs interface. This needs more work.

 * Patch 4 adds documentation. This needs updating.

I did some basic testing on a 2P server system without ERST support.
Mostly I checked the memory layout of the structures, and I injected
some memory errors to exercise the record updating flow. I did some
fixups after testing, so I apologize if I missed anything.

Thanks,
Yazen

Yazen Ghannam (2):
  RAS/AMD/ATL, EDAC/amd64: Move MI300 Row Retirement to ATL
  RAS: Introduce the FRU Memory Poison Manager

 MAINTAINERS                 |   7 +
 drivers/edac/Kconfig        |   1 -
 drivers/edac/amd64_edac.c   |  48 ---
 drivers/ras/Kconfig         |  13 +
 drivers/ras/Makefile        |   1 +
 drivers/ras/amd/atl/Kconfig |   1 +
 drivers/ras/amd/atl/umc.c   |  51 +++
 drivers/ras/amd/fmpm.c      | 776 ++++++++++++++++++++++++++++++++++++
 include/linux/ras.h         |   2 +
 9 files changed, 851 insertions(+), 49 deletions(-)
 create mode 100644 drivers/ras/amd/fmpm.c


base-commit: c2064388aa8765abd7c2c5785e7bfe266a2f6cd3
-- 
2.34.1

