Message-ID: <20250320180450.539-1-shiju.jose@huawei.com>
Date: Thu, 20 Mar 2025 18:04:37 +0000
From: <shiju.jose@...wei.com>
To: <linux-cxl@...r.kernel.org>, <dan.j.williams@...el.com>,
<dave@...olabs.net>, <jonathan.cameron@...wei.com>, <dave.jiang@...el.com>,
<alison.schofield@...el.com>, <vishal.l.verma@...el.com>,
<ira.weiny@...el.com>, <david@...hat.com>, <Vilas.Sridharan@....com>
CC: <linux-edac@...r.kernel.org>, <linux-acpi@...r.kernel.org>,
<linux-mm@...ck.org>, <linux-kernel@...r.kernel.org>, <bp@...en8.de>,
<tony.luck@...el.com>, <rafael@...nel.org>, <lenb@...nel.org>,
<mchehab@...nel.org>, <leo.duran@....com>, <Yazen.Ghannam@....com>,
<rientjes@...gle.com>, <jiaqiyan@...gle.com>, <Jon.Grimm@....com>,
<dave.hansen@...ux.intel.com>, <naoya.horiguchi@....com>,
<james.morse@....com>, <jthoughton@...gle.com>, <somasundaram.a@....com>,
<erdemaktas@...gle.com>, <pgonda@...gle.com>, <duenwen@...gle.com>,
<gthelen@...gle.com>, <wschwartz@...erecomputing.com>,
<dferguson@...erecomputing.com>, <wbs@...amperecomputing.com>,
<nifan.cxl@...il.com>, <yazen.ghannam@....com>, <tanxiaofei@...wei.com>,
<prime.zeng@...ilicon.com>, <roberto.sassu@...wei.com>,
<kangkang.shen@...urewei.com>, <wanghuiqiang@...wei.com>,
<linuxarm@...wei.com>, <shiju.jose@...wei.com>
Subject: [PATCH v2 0/8] cxl: support CXL memory RAS features
From: Shiju Jose <shiju.jose@...wei.com>
This series adds support for the CXL memory RAS features: patrol scrub,
ECS (Error Check Scrub), soft PPR (Post Package Repair) and memory sparing.
These CXL patches were previously part of the EDAC series [1].
The code is based on the cxl.git next branch [2] merged with the ras.git
edac-cxl branch [3].
[1]: https://lore.kernel.org/linux-cxl/20250212143654.1893-1-shiju.jose@huawei.com/
[2]: https://web.git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git/log/?h=next
[3]: https://web.git.kernel.org/pub/scm/linux/kernel/git/ras/ras.git/log/?h=edac-cxl
Userspace code for the CXL memory repair features is available at [4], and
a sample boot script for CXL memory repair at [5].
[4]: https://lore.kernel.org/lkml/20250207143028.1865-1-shiju.jose@huawei.com/
[5]: https://lore.kernel.org/lkml/20250207143028.1865-5-shiju.jose@huawei.com/
Changes
=======
v1 -> v2:
1. Feedback from Dan Williams on v1:
https://lore.kernel.org/linux-mm/20250307091137.00006a0a@huawei.com/T/
- Fixed locking issues in region scrubbing and added local cxl_acquire()
and cxl_unlock.
- Replaced the CXL examples that used cat and echo in the EDAC .rst docs
with short descriptions and references to the ABI docs. Also corrected
existing descriptions as suggested by Dan.
- Added a policy description for the scrub control feature.
However, this may require input from CXL experts.
- Replaced CONFIG_CXL_RAS_FEATURES with CONFIG_CXL_EDAC_MEM_FEATURES.
- Made a few changes to the "depends on" part of CONFIG_CXL_EDAC_MEM_FEATURES.
- Renamed drivers/cxl/core/memfeatures.c to drivers/cxl/core/edac.c.
- Replaced snprintf() with kasprintf() in a few places.
2. Feedback from Alison on v1:
- cxl_get_feature_entry() (patch 1) now returns NULL on failure, and the
checks in cxl_get_feature_entry() are reintroduced.
- Changed the logic of the for loop in the region-based scrubbing code.
- Replaced cxl_are_decoders_committed() with cxl_is_memdev_memory_online()
and added it as a local function in drivers/cxl/core/edac.c.
- Changed a few multiline comments to single-line comments.
- Removed unnecessary comments from the code.
- Reduced the line length of a few macros in the ECS and memory repair code.
- In new files, changed "GPL-2.0-or-later" -> "GPL-2.0-only".
- Ran clang-format on the new files and applied the updates.
3. Feedback from Jonathan on v1:
- Changed a few multiline comments to single-line comments.
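As an illustration of the NULL-on-failure convention mentioned for
cxl_get_feature_entry() in item 2, here is a minimal userspace sketch.
The struct layouts and the names feat_entry, feat_table,
get_feature_entry() and feature_set_size() are hypothetical stand-ins
chosen for this example; they are not the kernel definitions and not
the code in patch 1.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel types; not the real CXL structures. */
struct feat_entry {
	unsigned char uuid[16];
	unsigned int set_size;
};

struct feat_table {
	size_t num_features;
	const struct feat_entry *entries;
};

/*
 * Return the entry whose UUID matches, or NULL when the table is missing,
 * empty, or does not list the requested feature.
 */
static const struct feat_entry *
get_feature_entry(const struct feat_table *tbl, const unsigned char uuid[16])
{
	if (!tbl || !tbl->entries || !tbl->num_features || !uuid)
		return NULL;

	for (size_t i = 0; i < tbl->num_features; i++) {
		if (!memcmp(tbl->entries[i].uuid, uuid, 16))
			return &tbl->entries[i];
	}

	return NULL;
}

/* Caller side: a plain NULL test replaces error-pointer decoding. */
static int feature_set_size(const struct feat_table *tbl,
			    const unsigned char uuid[16])
{
	const struct feat_entry *e = get_feature_entry(tbl, uuid);

	return e ? (int)e->set_size : -1;
}
```

With this shape every caller handles "feature not supported" and "feature
table unavailable" through the same NULL check.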
Shiju Jose (8):
cxl: Add helper function to retrieve a feature entry
EDAC: Update documentation for the CXL memory patrol scrub control
feature
cxl/edac: Add CXL memory device patrol scrub control feature
cxl/edac: Add CXL memory device ECS control feature
cxl/mbox: Add support for PERFORM_MAINTENANCE mailbox command
cxl: Support for finding memory operation attributes from the current
boot
cxl/memfeature: Add CXL memory device soft PPR control feature
cxl/memfeature: Add CXL memory device memory sparing control feature
Documentation/edac/memory_repair.rst | 31 +
Documentation/edac/scrub.rst | 47 +
drivers/cxl/Kconfig | 27 +
drivers/cxl/core/Makefile | 1 +
drivers/cxl/core/core.h | 2 +
drivers/cxl/core/edac.c | 1730 ++++++++++++++++++++++++++
drivers/cxl/core/features.c | 23 +
drivers/cxl/core/mbox.c | 45 +-
drivers/cxl/core/memdev.c | 9 +
drivers/cxl/core/ras.c | 145 +++
drivers/cxl/core/region.c | 5 +
drivers/cxl/cxlmem.h | 73 ++
drivers/cxl/mem.c | 4 +
drivers/cxl/pci.c | 3 +
drivers/edac/mem_repair.c | 9 +
include/linux/edac.h | 7 +
16 files changed, 2159 insertions(+), 2 deletions(-)
create mode 100644 drivers/cxl/core/edac.c
--
2.43.0