[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <20250430012915.295136-1-lizhijian@fujitsu.com>
Date: Wed, 30 Apr 2025 09:29:15 +0800
From: Li Zhijian <lizhijian@...itsu.com>
To: linux-cxl@...r.kernel.org
Cc: Jonathan Cameron <jonathan.cameron@...wei.com>,
Dave Jiang <dave.jiang@...el.com>,
Alison Schofield <alison.schofield@...el.com>,
Vishal Verma <vishal.l.verma@...el.com>,
Ira Weiny <ira.weiny@...el.com>,
Dan Williams <dan.j.williams@...el.com>,
linux-kernel@...r.kernel.org,
Li Zhijian <lizhijian@...itsu.com>
Subject: [RFC PATCH] cxl: Allow reprogramming misconfigured hdm decoders
During kernel booting, CXL drivers will attempt to construct the CXL region
according to the pre-programed(firmware provisioning) HDM decoders.
This construction process will fail for some reasons, in this case, the
userspace cli like ndctl/cxl cannot destroy nor create regions upon the
existing decoders.
Introuce a new flag CXL_DECODER_F_NEED_RESET tell the driver to reset
the decoder during `cxl destroy-region regionN`, so that region can be
create again after that.
Signed-off-by: Li Zhijian <lizhijian@...itsu.com>
---
Cover letter is here.
Hi all,
Previously, we encountered a CXL device with differing IG between the device
and its USP, which resulted in the driver’s failure to automatically
complete the creation of the region, and user cannot create/destroy region
after this failure.
Although this IG issue has been ignored in the new kernel [1], I realized
that if there was an error in firmware provisioning the HDM decoders, the
user might not be able to destroy the unfinished region and recreate it.
This is because certain components related to this region are in an
inconsistent state, preventing the CXL tool from operating on it.
This implies that the OS administrator cannot reconfigure the CXL device,
which is largely contrary to user expectations. I am keen to hear your
thoughts on this matter.
A modified QEMU [2] is able to limitly program the HDM decoders to inject some
wrong decoder configurations.
[1] https://lore.kernel.org/lkml/20250402232552.999634-1-gourry@gourry.net/
[2] https://github.com/zhijianli88/qemu/tree/program-decoder
---
---
drivers/cxl/core/hdm.c | 4 +++-
drivers/cxl/core/region.c | 6 ++++--
drivers/cxl/cxl.h | 3 ++-
tools/testing/cxl/test/cxl.c | 1 +
4 files changed, 10 insertions(+), 4 deletions(-)
diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
index 70cae4ebf8a4..afbbda780d4d 100644
--- a/drivers/cxl/core/hdm.c
+++ b/drivers/cxl/core/hdm.c
@@ -559,7 +559,8 @@ int cxl_dpa_free(struct cxl_endpoint_decoder *cxled)
dev_name(&cxled->cxld.region->dev));
return -EBUSY;
}
- if (cxled->cxld.flags & CXL_DECODER_F_ENABLE) {
+ if (cxled->cxld.flags & CXL_DECODER_F_ENABLE &&
+ !(cxled->cxld.flags & CXL_DECODER_F_NEED_RESET)) {
dev_dbg(dev, "decoder enabled\n");
return -EBUSY;
}
@@ -918,6 +919,7 @@ static void cxl_decoder_reset(struct cxl_decoder *cxld)
up_read(&cxl_dpa_rwsem);
cxld->flags &= ~CXL_DECODER_F_ENABLE;
+ cxld->flags &= ~CXL_DECODER_F_NEED_RESET;
/* Userspace is now responsible for reconfiguring this decoder */
if (is_endpoint_decoder(&cxld->dev)) {
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index c3f4dc244df7..d025e892d07d 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -2096,7 +2096,8 @@ static int cxl_region_detach(struct cxl_endpoint_decoder *cxled)
p = &cxlr->params;
get_device(&cxlr->dev);
- if (p->state > CXL_CONFIG_ACTIVE) {
+ if (p->state > CXL_CONFIG_ACTIVE ||
+ cxled->cxld.flags & CXL_DECODER_F_NEED_RESET) {
cxl_region_decode_reset(cxlr, p->interleave_ways);
p->state = CXL_CONFIG_ACTIVE;
}
@@ -3434,7 +3435,8 @@ int cxl_add_to_region(struct cxl_port *root, struct cxl_endpoint_decoder *cxled)
if (device_attach(&cxlr->dev) < 0)
dev_err(&cxlr->dev, "failed to enable, range: %pr\n",
p->res);
- }
+ } else
+ cxled->cxld.flags |= CXL_DECODER_F_NEED_RESET;
put_device(region_dev);
out:
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index be8a7dc77719..60fae072bbcf 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -331,7 +331,8 @@ int cxl_dport_map_rcd_linkcap(struct pci_dev *pdev, struct cxl_dport *dport);
#define CXL_DECODER_F_TYPE3 BIT(3)
#define CXL_DECODER_F_LOCK BIT(4)
#define CXL_DECODER_F_ENABLE BIT(5)
-#define CXL_DECODER_F_MASK GENMASK(5, 0)
+#define CXL_DECODER_F_NEED_RESET BIT(6)
+#define CXL_DECODER_F_MASK GENMASK(6, 0)
enum cxl_decoder_type {
CXL_DECODER_DEVMEM = 2,
diff --git a/tools/testing/cxl/test/cxl.c b/tools/testing/cxl/test/cxl.c
index 1c3336095923..5dac454ca0e2 100644
--- a/tools/testing/cxl/test/cxl.c
+++ b/tools/testing/cxl/test/cxl.c
@@ -709,6 +709,7 @@ static void mock_decoder_reset(struct cxl_decoder *cxld)
"%s: out of order reset, expected decoder%d.%d\n",
dev_name(&cxld->dev), port->id, port->commit_end);
cxld->flags &= ~CXL_DECODER_F_ENABLE;
+ cxld->flags &= ~CXL_DECODER_F_NEED_RESET;
}
static void default_mock_decoder(struct cxl_decoder *cxld)
--
2.27.0
Powered by blists - more mailing lists