[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <19c2413143b.107b011e88026.6653143171898793206@zohomail.com>
Date: Tue, 03 Feb 2026 23:15:56 +0800
From: Li Ming <ming.li@...omail.com>
To: "danjwilliams" <dan.j.williams@...el.com>
Cc: "dave" <dave@...olabs.net>,
"jonathan.cameron" <jonathan.cameron@...wei.com>,
"dave.jiang" <dave.jiang@...el.com>,
"alison.schofield" <alison.schofield@...el.com>,
"vishal.l.verma" <vishal.l.verma@...el.com>,
"ira.weiny" <ira.weiny@...el.com>,
"linux-cxl" <linux-cxl@...r.kernel.org>,
"linux-kernel" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 1/2] cxl/core: Set cxlmd->endpoint to NULL by default
From: <dan.j.williams@...el.com>
To: "Li Ming"<ming.li@...omail.com>, <dave@...olabs.net>, <jonathan.cameron@...wei.com>, <dave.jiang@...el.com>, <alison.schofield@...el.com>, <vishal.l.verma@...el.com>, <ira.weiny@...el.com>, <dan.j.williams@...el.com>
Cc: <linux-cxl@...r.kernel.org>, <linux-kernel@...r.kernel.org>, "Li Ming"<ming.li@...omail.com>
Date: Tue, 03 Feb 2026 08:01:04 +0800
Subject: Re: [PATCH 1/2] cxl/core: Set cxlmd->endpoint to NULL by default
> Li Ming wrote:
> > CXL testing environment can trigger following trace
> >
> > Oops: general protection fault, probably for non-canonical address 0xdffffc0000000092: 0000 [#1] SMP KASAN NOPTI
> > KASAN: null-ptr-deref in range [0x0000000000000490-0x0000000000000497]
> > RIP: 0010:cxl_dpa_to_region+0x105/0x1f0 [cxl_core]
> > Call Trace:
> > <TASK>
> > cxl_event_trace_record+0xd1/0xa70 [cxl_core]
> > __cxl_event_trace_record+0x12f/0x1e0 [cxl_core]
> > cxl_mem_get_records_log+0x261/0x500 [cxl_core]
> > cxl_mem_get_event_records+0x7c/0xc0 [cxl_core]
> > cxl_mock_mem_probe+0xd38/0x1c60 [cxl_mock_mem]
>
> Hooray for unit tests, but I wonder why this failure is so reliable for
> you and absent for me? I retried with latest cxl/next, no luck.
>
> No matter, it still looks like something worth addressing, but not with
> setting ->endpoint to NULL.
>
> > platform_probe+0x9d/0x130
> > really_probe+0x1c8/0x960
> > __driver_probe_device+0x187/0x3e0
> > driver_probe_device+0x45/0x120
> > __device_attach_driver+0x15d/0x280
> >
> > commit 29317f8dc6ed ("cxl/mem: Introduce cxl_memdev_attach for CXL-dependent operation")
> > initializes cxlmd->endpoint to ERR_PTR(-ENXIO) in cxl_memdev_alloc().
> > However, cxl_dpa_to_region() treats a non-NULL cxlmd->endpoint as a
> > valid endpoint.
> >
> > Across the CXL core, endpoint availability is generally determined by
> > checking whether it is NULL. Align with this convention by initializing
> > cxlmd->endpoint to NULL by default.
> >
> > Fixes: 29317f8dc6ed ("cxl/mem: Introduce cxl_memdev_attach for CXL-dependent operation")
> > Signed-off-by: Li Ming <ming.li@...omail.com>
> > ---
> > drivers/cxl/core/memdev.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> > index af3d0cc65138..41a507b5daa4 100644
> > --- a/drivers/cxl/core/memdev.c
> > +++ b/drivers/cxl/core/memdev.c
> > @@ -675,7 +675,7 @@ static struct cxl_memdev *cxl_memdev_alloc(struct cxl_dev_state *cxlds,
> > cxlmd->id = rc;
> > cxlmd->depth = -1;
> > cxlmd->attach = attach;
> > - cxlmd->endpoint = ERR_PTR(-ENXIO);
> > + cxlmd->endpoint = NULL;
>
> So, this does not look like a fix to me, it looks like a bug report
> against all the code paths that want to assume that a mere NULL check of
> ->endpoint is sufficient.
>
> I think this unintentional ERR_PTR() trip hazard is now a *feature* in
> retrospect to catch those cases. If something depends on ->endpoint
> being valid, it depends on ->endpoint *staying* valid.
>
> A proposed fix for this case is below (passes cxl_test), and if other
> ERR_PTR() crashes show up I expect they need similar inspection. This
> probably wants additional cleanup to consolidate boiler-plate and make
> the critical change to cxl_dpa_to_region() easier to see:
>
Hi Dan,
Thanks for your proposal, I think your change can solve this problem too.
But the change is a lot, and need more time to review all driver code to confirm
if there are other places needed such checking. (I found that cxl_reset_done also needs some changes like you mentioned)
Maybe we can consider my change as a quick fix? Then I can prepare a new patchset for the consolidation.
Ming
> -- 8< --
> diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
> index 422531799af2..e652980098cf 100644
> --- a/drivers/cxl/core/core.h
> +++ b/drivers/cxl/core/core.h
> @@ -39,7 +39,7 @@ int cxl_decoder_detach(struct cxl_region *cxlr,
> int cxl_region_init(void);
> void cxl_region_exit(void);
> int cxl_get_poison_by_endpoint(struct cxl_port *port);
> -struct cxl_region *cxl_dpa_to_region(const struct cxl_memdev *cxlmd, u64 dpa);
> +struct cxl_region *cxl_dpa_to_region(struct cxl_memdev *cxlmd, u64 dpa);
> u64 cxl_dpa_to_hpa(struct cxl_region *cxlr, const struct cxl_memdev *cxlmd,
> u64 dpa);
>
> @@ -50,7 +50,7 @@ static inline u64 cxl_dpa_to_hpa(struct cxl_region *cxlr,
> return ULLONG_MAX;
> }
> static inline
> -struct cxl_region *cxl_dpa_to_region(const struct cxl_memdev *cxlmd, u64 dpa)
> +struct cxl_region *cxl_dpa_to_region(struct cxl_memdev *cxlmd, u64 dpa)
> {
> return NULL;
> }
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index ef202b34e5ea..610ed6744ddc 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -864,7 +864,7 @@ void set_exclusive_cxl_commands(struct cxl_memdev_state *mds,
> void clear_exclusive_cxl_commands(struct cxl_memdev_state *mds,
> unsigned long *cmds);
> void cxl_mem_get_event_records(struct cxl_memdev_state *mds, u32 status);
> -void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
> +void cxl_event_trace_record(struct cxl_memdev *cxlmd,
> enum cxl_event_log_type type,
> enum cxl_event_type event_type,
> const uuid_t *uuid, union cxl_event *evt);
> diff --git a/include/linux/device.h b/include/linux/device.h
> index 0be95294b6e6..4fafee80524b 100644
> --- a/include/linux/device.h
> +++ b/include/linux/device.h
> @@ -911,6 +911,7 @@ static inline void device_unlock(struct device *dev)
> }
>
> DEFINE_GUARD(device, struct device *, device_lock(_T), device_unlock(_T))
> +DEFINE_GUARD_COND(device, _intr, device_lock_interruptible(_T), _RET == 0)
>
> static inline void device_lock_assert(struct device *dev)
> {
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index fa6dd0c94656..7b923408b6c5 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -886,7 +886,7 @@ int cxl_enumerate_cmds(struct cxl_memdev_state *mds)
> }
> EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, "CXL");
>
> -void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
> +void cxl_event_trace_record(struct cxl_memdev *cxlmd,
> enum cxl_event_log_type type,
> enum cxl_event_type event_type,
> const uuid_t *uuid, union cxl_event *evt)
> @@ -913,6 +913,7 @@ void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
> * translations. Take topology mutation locks and lookup
> * { HPA, REGION } from { DPA, MEMDEV } in the event record.
> */
> + guard(device)(&cxlmd->dev);
> guard(rwsem_read)(&cxl_rwsem.region);
> guard(rwsem_read)(&cxl_rwsem.dpa);
>
> @@ -961,7 +962,7 @@ void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
> }
> EXPORT_SYMBOL_NS_GPL(cxl_event_trace_record, "CXL");
>
> -static void __cxl_event_trace_record(const struct cxl_memdev *cxlmd,
> +static void __cxl_event_trace_record(struct cxl_memdev *cxlmd,
> enum cxl_event_log_type type,
> struct cxl_event_record_raw *record)
> {
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index af3d0cc65138..d2bee5608e50 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -331,6 +331,10 @@ int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa)
> {
> int rc;
>
> + ACQUIRE(device_intr, devlock)(&cxlmd->dev);
> + if ((rc = ACQUIRE_ERR(device_intr, &devlock)))
> + return rc;
> +
> ACQUIRE(rwsem_read_intr, region_rwsem)(&cxl_rwsem.region);
> if ((rc = ACQUIRE_ERR(rwsem_read_intr, ®ion_rwsem)))
> return rc;
> @@ -355,6 +359,7 @@ int cxl_clear_poison_locked(struct cxl_memdev *cxlmd, u64 dpa)
> if (!IS_ENABLED(CONFIG_DEBUG_FS))
> return 0;
>
> + device_lock_assert(&cxlmd->dev);
> lockdep_assert_held(&cxl_rwsem.dpa);
> lockdep_assert_held(&cxl_rwsem.region);
>
> @@ -400,6 +405,10 @@ int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa)
> {
> int rc;
>
> + ACQUIRE(device_intr, devlock)(&cxlmd->dev);
> + if ((rc = ACQUIRE_ERR(device_intr, &devlock)))
> + return rc;
> +
> ACQUIRE(rwsem_read_intr, region_rwsem)(&cxl_rwsem.region);
> if ((rc = ACQUIRE_ERR(rwsem_read_intr, ®ion_rwsem)))
> return rc;
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 96888d87a8df..a7d391757e16 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -2936,7 +2936,7 @@ static int __cxl_dpa_to_region(struct device *dev, void *arg)
> return 1;
> }
>
> -struct cxl_region *cxl_dpa_to_region(const struct cxl_memdev *cxlmd, u64 dpa)
> +struct cxl_region *cxl_dpa_to_region(struct cxl_memdev *cxlmd, u64 dpa)
> {
> struct cxl_dpa_to_region_context ctx;
> struct cxl_port *port;
> @@ -2944,8 +2944,12 @@ struct cxl_region *cxl_dpa_to_region(const struct cxl_memdev *cxlmd, u64 dpa)
> ctx = (struct cxl_dpa_to_region_context) {
> .dpa = dpa,
> };
> +
> + device_lock_assert(&cxlmd->dev);
> + if (!cxlmd->dev.driver)
> + return NULL;
> port = cxlmd->endpoint;
> - if (port && is_cxl_endpoint(port) && cxl_num_decoders_committed(port))
> + if (cxl_num_decoders_committed(port))
> device_for_each_child(&port->dev, &ctx, __cxl_dpa_to_region);
>
> return ctx.cxlr;
> @@ -4004,10 +4008,9 @@ static int validate_region_offset(struct cxl_region *cxlr, u64 offset)
> return 0;
> }
>
> -static int cxl_region_debugfs_poison_inject(void *data, u64 offset)
> +static int region_poison_lookup(struct cxl_region *cxlr, u64 offset,
> + struct dpa_result *result)
> {
> - struct dpa_result result = { .dpa = ULLONG_MAX, .cxlmd = NULL };
> - struct cxl_region *cxlr = data;
> int rc;
>
> ACQUIRE(rwsem_read_intr, region_rwsem)(&cxl_rwsem.region);
> @@ -4022,8 +4025,8 @@ static int cxl_region_debugfs_poison_inject(void *data, u64 offset)
> return -EINVAL;
>
> offset -= cxlr->params.cache_size;
> - rc = region_offset_to_dpa_result(cxlr, offset, &result);
> - if (rc || !result.cxlmd || result.dpa == ULLONG_MAX) {
> + rc = region_offset_to_dpa_result(cxlr, offset, result);
> + if (rc || !result->cxlmd || result->dpa == ULLONG_MAX) {
> dev_dbg(&cxlr->dev,
> "Failed to resolve DPA for region offset %#llx rc %d\n",
> offset, rc);
> @@ -4031,7 +4034,43 @@ static int cxl_region_debugfs_poison_inject(void *data, u64 offset)
> return rc ? rc : -EINVAL;
> }
>
> - return cxl_inject_poison_locked(result.cxlmd, result.dpa);
> + return 0;
> +}
> +
> +static int cxl_region_debugfs_poison_inject(void *data, u64 offset)
> +{
> + struct dpa_result result1 = { .dpa = ULLONG_MAX };
> + struct dpa_result result2 = { .dpa = ULLONG_MAX };
> + struct cxl_region *cxlr = data;
> + struct cxl_memdev *cxlmd;
> + int rc;
> +
> + rc = region_poison_lookup(cxlr, offset, &result1);
> + if (rc)
> + return rc;
> +
> + cxlmd = result1.cxlmd;
> + ACQUIRE(device_intr, devlock)(&cxlmd->dev);
> + if ((rc = ACQUIRE_ERR(device_intr, &devlock)))
> + return rc;
> +
> + ACQUIRE(rwsem_read_intr, region_rwsem)(&cxl_rwsem.region);
> + if ((rc = ACQUIRE_ERR(rwsem_read_intr, ®ion_rwsem)))
> + return rc;
> +
> + ACQUIRE(rwsem_read_intr, dpa_rwsem)(&cxl_rwsem.dpa);
> + if ((rc = ACQUIRE_ERR(rwsem_read_intr, &dpa_rwsem)))
> + return rc;
> +
> + offset -= cxlr->params.cache_size;
> + rc = region_offset_to_dpa_result(cxlr, offset, &result2);
> + if (rc || memcmp(&result1, &result2, sizeof(result1) != 0)) {
> + dev_dbg(&cxlr->dev,
> + "Error injection raced region reconfiguration: %d\n", rc);
> + return rc ? rc : -ENXIO;
> + }
> +
> + return cxl_inject_poison_locked(result1.cxlmd, result1.dpa);
> }
>
> DEFINE_DEBUGFS_ATTRIBUTE(cxl_poison_inject_fops, NULL,
> @@ -4039,10 +4078,21 @@ DEFINE_DEBUGFS_ATTRIBUTE(cxl_poison_inject_fops, NULL,
>
> static int cxl_region_debugfs_poison_clear(void *data, u64 offset)
> {
> - struct dpa_result result = { .dpa = ULLONG_MAX, .cxlmd = NULL };
> + struct dpa_result result1 = { .dpa = ULLONG_MAX };
> + struct dpa_result result2 = { .dpa = ULLONG_MAX };
> struct cxl_region *cxlr = data;
> + struct cxl_memdev *cxlmd;
> int rc;
>
> + rc = region_poison_lookup(cxlr, offset, &result1);
> + if (rc)
> + return rc;
> +
> + cxlmd = result1.cxlmd;
> + ACQUIRE(device_intr, devlock)(&cxlmd->dev);
> + if ((rc = ACQUIRE_ERR(device_intr, &devlock)))
> + return rc;
> +
> ACQUIRE(rwsem_read_intr, region_rwsem)(&cxl_rwsem.region);
> if ((rc = ACQUIRE_ERR(rwsem_read_intr, ®ion_rwsem)))
> return rc;
> @@ -4051,20 +4101,15 @@ static int cxl_region_debugfs_poison_clear(void *data, u64 offset)
> if ((rc = ACQUIRE_ERR(rwsem_read_intr, &dpa_rwsem)))
> return rc;
>
> - if (validate_region_offset(cxlr, offset))
> - return -EINVAL;
> -
> offset -= cxlr->params.cache_size;
> - rc = region_offset_to_dpa_result(cxlr, offset, &result);
> - if (rc || !result.cxlmd || result.dpa == ULLONG_MAX) {
> + rc = region_offset_to_dpa_result(cxlr, offset, &result2);
> + if (rc || memcmp(&result1, &result2, sizeof(result1) != 0)) {
> dev_dbg(&cxlr->dev,
> - "Failed to resolve DPA for region offset %#llx rc %d\n",
> - offset, rc);
> -
> - return rc ? rc : -EINVAL;
> + "Error clearing raced region reconfiguration: %d\n", rc);
> + return rc ? rc : -ENXIO;
> }
>
> - return cxl_clear_poison_locked(result.cxlmd, result.dpa);
> + return cxl_clear_poison_locked(result1.cxlmd, result1.dpa);
> }
>
> DEFINE_DEBUGFS_ATTRIBUTE(cxl_poison_clear_fops, NULL,
> --
> 2.52.0
>
>
Powered by blists - more mailing lists